
Communicating with Data
via R Markdown



Reproducible Reports

Presented by Emi Tanaka

School of Mathematics and Statistics


dr.emi.tanaka@gmail.com @statsgen

4th October 2019 | COMBINE | Sydney, Australia

These slides are best viewed in Chrome and may occasionally need to be refreshed if elements do not load properly. See here for PDF.

1/47

In a nutshell 🥜



R Markdown integrates text + code in one source document with ability to knit to many output formats (via Pandoc).

2/47

Text in Markdown

# Header 1
## Header 2
- Unordered list 1
- Unordered list 2
1. Ordered list 1
1. Ordered list 2
_This is italic._ *This too.*
__This is bold.__ **This too.**
_**This is bold & italic.**_

Output

Header 1

Header 2

  • Unordered list 1
  • Unordered list 2
  1. Ordered list 1
  2. Ordered list 2

This is italic. This too. This is bold. This too. This is bold & italic.

Go to RStudio > Help > Markdown Quick Reference
3/47

Shortcut for inserting code chunk

In RStudio .Rmd press

  • Mac: Cmd + Option + i
  • PC: Ctrl + Alt + i

to insert a chunk of R code

```{r}
```
4/47

Chunk options: echo & eval

```{r, echo = FALSE}
plot(speed ~ dist, cars)
```

```{r, eval = FALSE}
plot(speed ~ dist, cars)
```
plot(speed ~ dist, cars)
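
The same two options can also be set once for the whole document instead of per chunk. A minimal sketch, assuming the knitr package is installed; a call like this would normally sit in a setup chunk at the top of the .Rmd:

```r
library(knitr)

# Document-wide defaults: hide the code (echo = FALSE) but still run it (eval = TRUE).
opts_chunk$set(echo = FALSE, eval = TRUE)

# Individual chunks can still override these defaults in their own header.
opts_chunk$get("echo")
```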
5/47

There are many more chunk options.

Can you name 5 other ones?

Hint: https://yihui.name/knitr/options/

(We'll explore some later.)

6/47

Valid chunk options

  • Chunk options must be written in one line, i.e. no line break.
  • All option values must be valid R expressions. Exception is the chunk name. E.g.
    • fig.path = figures/ is not valid but
      fig.path = "figures/" is valid
    • eval = true is not valid but
      eval = runif(1) > 0.5 is valid
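
One way to check a candidate option value is to see whether R can parse and evaluate it. The helper below is a hypothetical sketch (it is not part of knitr) and assumes no object called `true` exists in your session:

```r
# Hypothetical helper: TRUE if `code` parses and evaluates as an R expression.
is_valid_option_value <- function(code) {
  tryCatch({
    eval(parse(text = code), envir = new.env())
    TRUE
  }, error = function(e) FALSE)
}

is_valid_option_value('figures/')        # does not parse as R
is_valid_option_value('"figures/"')      # a character string
is_valid_option_value('true')            # no such object; R spells it TRUE
is_valid_option_value('runif(1) > 0.5')  # an expression returning TRUE or FALSE
```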
7/47

Chunk names (or labels)

The chunk below is called plot1.

```{r plot1}
ggplot(cars, aes(dist, speed)) + geom_point()
```

All chunks have a label regardless of whether it is explicitly supplied or not.

Do not include spaces, "_" or punctuation marks in your chunk name!

8/47

Inline R Commands

Today's date is `r Sys.Date()`.

Today's date is 2019-10-03.

The value of $\pi$ is `r pi`.

The value of π is 3.1415927.

  • Note: inline commands must be R code.
  • Inline commands are never echoed and are always evaluated.
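
Because an inline command is an ordinary R expression, values can be formatted before they are printed. A small sketch of expressions that could be placed inside an inline command (the variable names are only for illustration):

```r
# Expressions that could go inside an inline `r ...` command.
rounded_pi  <- round(pi, 2)                               # 3.14
pretty_date <- format(as.Date("2019-10-04"), "%Y/%m/%d")  # "2019/10/04"
```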
9/47

Go through

  • challenge-02.Rmd
  • challenge-03.Rmd
  • challenge-04.Rmd
  • challenge-05.Rmd
  • challenge-06.Rmd
25:00
10/47

R Markdown is not just for R

```{python, echo = FALSE}
a = [1, 2, 3]
a[0]
```
## 1
```{bash, echo = FALSE}
date +%B
```
## October
11/47

Can you name some other engines?

Hint: https://yihui.name/knitr/demo/engines/

12/47

YAML - YAML Ain't Markup Language

Basic format

---
key: value
---

Example

---
title: "Communicating with Data via R Markdown"
subtitle: "Reproducible Reports"
author: "Emi Tanaka"
date: "`r Sys.Date()`"
output: html_document
---

There must be a space after ":"!

13/47

Metadata

All YAML data are stored in rmarkdown::metadata as a list.

rmarkdown::metadata$title
## [1] "Communicating with Data via R Markdown"
rmarkdown::metadata$author
## [1] "Emi Tanaka"
14/47

Default (minimal) html output

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="author" content="Emi Tanaka" />
<meta name="date" content="2019-10-04" />
<title>Communicating with Data via R Markdown</title>
</head>
<body>
<h1 class="title toc-ignore">Communicating with Data via R Markdown</h1>
<h3 class="subtitle">Reproducible Reports</h3>
<h4 class="author">Emi Tanaka</h4>
<h4 class="date">2019-10-04</h4>
</body>
</html>

html meta data

The default html template adds special YAML key values to the file automatically

output

15/47

YAML structure

  • White spaces indicate structure in YAML - don't use tabs though!
  • Same as R, you can comment lines by starting with #.
  • YAML is case sensitive.
  • A key can hold multiple values.
key:
- value 1
- value 2
key: [value 1, value 2]
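
The two spellings above are meant to be equivalent. A quick check, assuming the yaml package is installed, is to parse both and compare the results:

```r
library(yaml)

# Block style (one value per line) and flow style (bracketed list).
block_style <- yaml.load("key:\n- value 1\n- value 2")
flow_style  <- yaml.load("key: [value 1, value 2]")

# Both should parse to the same R list.
identical(block_style, flow_style)
str(block_style)
```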
16/47

YAML with multiple key values

---
title: "Communicating with Data via R Markdown"
author:
- "Emi Tanaka"
- "Accomplice"
output: html_document
---
<body>
<h1 class="title toc-ignore">Communicating with Data via R Markdown</h1>
<h4 class="author">Emi Tanaka</h4>
<h4 class="author">Accomplice</h4>
</body>
output

17/47

key can contain keys

---
output:
  html_document:
    toc: true
    toc_float: true
---
What does this do?

(Note: white space is important)

18/47

Values spanning multiple lines

---
title: >
  this is a\
  single line\
abstract: |
  this value spans\
  many lines and\
  appears as it is\
output: pdf_document
---
`r rmarkdown::metadata$title`
`r rmarkdown::metadata$abstract`
output

19/47

Go through

challenge-07.Rmd

10:00
20/47

Parametrized Report

---
title: "Parameterized Report"
params:
  species: setosa
output: html_document
---
```{r, message = FALSE, fig.dim = c(3,2)}
library(tidyverse)
iris %>%
  filter(Species==params$species) %>%
  ggplot(aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color=Species))
```
output

21/47

Knit with Parameters

---
title: "Parameterized Report"
params:
  species:
    label: "Species"
    value: setosa
    input: select
    choices: [setosa, versicolor, virginica]
  color: red
  printmsg: FALSE
  max:
    label: "Maximum Sepal Width"
    value: 4
    input: slider
    min: 4
    max: 5
    step: 0.1
output: html_document
---

```{r, message = params$printmsg, fig.dim = c(3,2)}
library(tidyverse)
iris %>%
  filter(Species==params$species) %>%
  filter(Sepal.Width < params$max) %>%
  ggplot(aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color=Species),
             color = params$color)
```
22/47

Shiny Report Generator

---
title: "Parameterized Report"
params:
  species:
    label: "Species"
    value: setosa
    input: select
    choices: [setosa, versicolor, virginica]
  color: red
  max:
    label: "Maximum Sepal Width"
    value: 5
    input: slider
    min: 4
    max: 5
    step: 0.05
output: html_document
---

 

23/47

R Markdown via Command Line

demo-render.Rmd

---
title: "Parameterized Report"
params:
  species: setosa
output: html_document
---
```{r, message = FALSE, fig.dim = c(3,2)}
library(tidyverse)
iris %>%
  filter(Species==params$species) %>%
  ggplot(aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color=Species))
```

You can knit this file from R by using the render function:

library(rmarkdown)
render("demo-render.Rmd")

You can override the YAML values by supplying arguments to render:

library(rmarkdown)
render("demo-render.Rmd",
       output_format = "pdf_document",
       params = list(species = "virginica"))
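
Since render is a regular function, one report per parameter value can be produced in a loop. A sketch under stated assumptions: the output file names are a hypothetical naming scheme, and rendering only runs if demo-render.Rmd is in the working directory:

```r
library(rmarkdown)

species <- c("setosa", "versicolor", "virginica")

# Hypothetical naming scheme: one html file per species.
output_files <- paste0("report-", species, ".html")

# Only render when the demo file is present in the working directory.
if (file.exists("demo-render.Rmd")) {
  for (i in seq_along(species)) {
    render("demo-render.Rmd",
           output_file = output_files[i],
           params = list(species = species[i]))
  }
}
```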
24/47

Go through

challenge-08.Rmd and challenge-09.Rmd

10:00
25/47

Themes: html_document

You can change the look of the html document by specifying themes:

  • default
  • cerulean
  • journal
  • flatly
  • darkly
  • readable
  • spacelab
  • united
  • cosmo
  • lumen
  • paper
  • sandstone
  • simplex
  • yeti
  • null
output:
  html_document:
    theme: cerulean

These bootswatch themes attach the whole bootstrap library which makes your html file size larger.

26/47

prettydoc

prettydoc 📦 is a community contributed theme that is light-weight:

  • cayman
  • tactile
  • architect
  • leonids
  • hpstr
output:
  prettydoc::html_pretty:
    theme: cayman

See more about it below:

https://prettydoc.statr.me/

27/47

rmdformats

rmdformats 📦 contains four built-in html formats:

  • readthedown
  • html_clean
  • html_docco
  • material

You can use these formats by simply specifying the output in YAML as below:

output: rmdformats::readthedown

See more about it below:

https://github.com/juba/rmdformats

28/47

rticles - LaTeX Journal Article Templates

  • acm acm_article
  • acs acs_article
  • aea aea_article
  • agu agu_article
  • amq amq_article
  • ams ams_article
  • asa asa_article
  • biometrics biometrics_article
  • copernicus copernicus_article
  • elsevier elsevier_article
  • frontiers frontiers_article
  • ieee ieee_article
  • jss jss_article
  • mdpi mdpi_article
  • mnras mnras_article
  • peerj peerj_article
  • plos plos_article
  • pnas pnas_article
  • rjournal rjournal_article
  • rsos rsos_article
  • rss rss_article
  • sage sage_article
  • sim sim_article
  • springer springer_article
  • tf tf_article
Go to RStudio > File > New File > R Markdown ... > From Template
29/47

External Files in Templating

  • When using rticles, each journal usually requires external files (e.g. cls or image files).
  • These external components are stored within the package.
  • If you are drafting an Rmd template with external components then you need to extract these to your folder first.

GUI

  • RStudio > File > New File > R Markdown ... > From Template

Command line

rmarkdown::draft("file.Rmd",
                 template = "biometrics_article",
                 package = "rticles")
30/47

More customisation needed?


Default templates for many output formats are found at


https://github.com/jgm/pandoc-templates


We'll go through the latex template.

31/47

I found this nice latex template online.

You can see it at main.pdf.

It was compiled from main.tex.

Find main.tex and main.pdf in demo folder.
32/47

How do I use this template so that I can write contents from an Rmd file instead?

33/47

Templating

We will use

---
output:
  pdf_document:
    template: main.tex
---
But nothing written in the body shows up in the output!

You need to add $body$ in the latex template file where you want the body of the md file to appear.

34/47

Templating: few more tweaks

  • R Markdown needs a few more special tweaks before \begin{document} in latex template:
\IfFileExists{bookmark.sty}{\usepackage{bookmark}}{\usepackage{hyperref}}
$if(highlighting-macros)$
$highlighting-macros$
$endif$
  • These are minimum tweaks needed for a LaTeX template.
  • You can find common tweaks (including for beamer) at https://github.com/jgm/pandoc-templates
  • You can define your own tweaks but it is better practice to use the ones defined in pandoc template rather than trying to reinvent the wheel.
35/47

How pandoc template works: key

Rmd

---
title: "COMBINE 2019"
author: "Emi Tanaka"
output:
  pdf_document:
    template: "template.tex"
---

YAML metadata can be used in the template by surrounding the key with $.

template.tex

\documentclass{article}
\title{$title$}
\author{$author$}
\date{}
\begin{document}
\maketitle
\end{document}

COMBINE 2019
Emi Tanaka

36/47

How pandoc template works: if statements

Rmd

---
title: "COMBINE 2019"
author: "Emi Tanaka"
output:
  pdf_document:
    template: "template.tex"
---

Simple "if" statements: the enclosed text is inserted only when the key is set (not null).

template.tex

\documentclass[
$if(fontsize)$
$fontsize$,
$endif$
]{article}
\title{$title$}
\author{$author$}
\date{}
\begin{document}
\maketitle
\end{document}
37/47

How pandoc template works: accessing list

Rmd

---
title: "COMBINE 2019"
author:
  - name: "Rachel Wang"
    email: "rachel.wang@sydney.edu.au"
  - name: "Connor Smith"
    email: "connor.smith@sydney.edu.au"
output:
  pdf_document:
    template: "template.tex"
---

Here it will become

\author{Rachel Wang \and Connor Smith}

template.tex

\documentclass{article}
\title{$title$}
\author{
$for(author)$
$author.name$$sep$ \and
$endfor$
}
\date{}
\begin{document}
\maketitle
\end{document}
38/47

Go through

challenge-10.Rmd

05:00
39/47

Cross Reference

  • When you make a header via Rmd
    # Some Header
    an id is created automatically.
  • The id is created by replacing spaces with - and converting everything to lower case.
  • Now you can link to this header by [some text](#some-header).
  • Cross references work for both pdf and html outputs.
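
The id rule described above can be sketched as a small R function. This is a simplified illustration with a hypothetical helper name; pandoc's real algorithm also deals with punctuation and duplicate headers:

```r
# Simplified sketch: lower-case the header and replace spaces with "-".
header_id <- function(header) {
  gsub(" ", "-", tolower(trimws(header)), fixed = TRUE)
}

header_id("Some Header")   # "some-header"
header_id("Chicken Data")  # "chicken-data"
```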
40/47

Direct Reference for html

  • For html output, you can also give a link directly to the relevant section.
  • E.g. open demo-header.html in the demo folder in a web browser.
  • Append say #chicken-data to the url. It should look like

    demo-header.html#chicken-data

  • It should have taken you straight to the corresponding header.
41/47

User-defined id

  • You can define your own id by appending {#your-id}.
# Some header {#header1}
  • Now you can link to this header with the id header1.
  • Note there should be no space in the id name!
42/47

Bibliography

  • BibTeX citation style format is used to store references in .bib files.
  • Remember that you can get the BibTeX citation for most R packages with the citation function. (Scroll below to see the BibTeX citation).
citation("xaringan")
##
## To cite package 'xaringan' in publications use:
##
## Yihui Xie (2019). xaringan: Presentation Ninja. R package
## version 0.9. https://CRAN.R-project.org/package=xaringan
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {xaringan: Presentation Ninja},
## author = {Yihui Xie},
## year = {2019},
## note = {R package version 0.9},
## url = {https://CRAN.R-project.org/package=xaringan},
## }
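
To get just the BibTeX entry and save it to a .bib file, the citation can be passed to toBibtex. A minimal sketch: it uses the base stats package so that it runs without extra installs, and writes to a temporary file as a stand-in for your own bibliography file:

```r
# Any installed package name works here, e.g. "xaringan".
bib <- toBibtex(citation("stats"))

# Temporary file as a stand-in for your own .bib file.
bib_file <- tempfile(fileext = ".bib")
writeLines(as.character(bib), bib_file)
```

As in the output above, the generated entry has an empty key (`@Manual{,`), so a key needs to be added by hand before the entry can be cited.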
43/47

Citations

  • You can include BibTeX references by specifying the bib file in the YAML as:
bibliography: bibliography.bib

[@bibtex-key] renders as (Author et al. 2019)

or

@bibtex-key renders as Author et al. 2019

  • See demo-citation.Rmd in the demo folder.
44/47

R Markdown is such an indispensable tool for making documents, especially if you plan to include statistical output.

How do you use (or plan to use)
R Markdown?

45/47

People that made R Markdown possible

The development of R Markdown is largely thanks to

  • Yihui Xie
    Software Engineer at RStudio
    for knitr
  • John MacFarlane
    Professor of Philosophy at UC Berkeley
    for pandoc
  • and many contributors behind the development of these tools.
46/47

That's it!

These slides are made using the xaringan R package.

The workshop materials are licensed under .



Emi Tanaka

dr.emi.tanaka@gmail.com
@statsgen

Check that you can:

  • Understand how YAML changes .Rmd
  • Understand how to manipulate chunks in .Rmd
  • Understand how to change the template in .Rmd
  • Understand how to cross-reference and do citations
47/47
