Assumptions & Ground Rules

The purpose of this second assignment is to introduce you to working in R Markdown. This will be the primary file format in which you will save and present your work for this class.

As noted previously, for this and all future assignments, you MUST type all commands in by hand. Do not copy & paste except for troubleshooting purposes (i.e., if you cannot figure out what you mistyped).

  • Early on, you may have a lot of trouble getting your code to run due to minor typos. This is normal.

  • Remember, you are learning to read and write a new (coding) language. As with learning any new language, we learn from practice - and from correcting our mistakes.

Part 1 (Assignment 2.1)

Goal: Create a new R Markdown file.

(Note: Remember that, when following instructions, always replace “LastName” with your own last name and replace YEAR_MO_DY with the actual date. E.g., Day_CRM495_RAssign2_2022_01_21)

In the first assignment, you learned about writing and running R code in an R Script file. You also saw how running certain commands (e.g., ggplot) from an R Script file will generate results in the RStudio Console. For Assignment 2, you will learn to create a new R Markdown file, and you will complete the remainder of your assignment in that file.

  • Like an R Script file, an R Markdown file can be used to write and run R code. However, an R Markdown file can also do much more than that. For instance, you can write and edit text, write and run R code, and generate statistical results and plots directly in the R Markdown file. You can even create entire books and webpages using R Markdown. In fact, both this assignment and the first one were created using R Markdown.

  • R Markdown is an essential tool for producing reproducible research because, with it, we can thoroughly document and simultaneously provide detailed explanations for all of our coding decisions in a project - from opening and manipulating data, to recoding and combining variables, to summarizing and analyzing data, to creating and modifying figures.

  • We will start by simply opening and saving a new R Markdown file. For more detailed instructions, check out Danielle Navarro’s video series on using R Markdown.

  1. In RStudio, go to File > New File > R Markdown to open a new R Markdown document.
[Figure: Opening a new R Markdown file]

  1. The dialogue box asks for a Title, an Author, and a Default Output Format for your new R Markdown file.

    1. In the Title box, enter CRM 495: R Assignment 2.
    2. In the Author box, enter your First and Last Name (e.g., Jake Day).
    3. Under Default Output Format, leave HTML selected; it is usually the default selection.
  2. Click OK to create your new R Markdown file. It should look like this:

    [Figure: Your new R Markdown file]

  3. The new R Markdown file contains a simple pre-populated template to show users how to do basic tasks like add settings, create text headings and text, insert R code chunks, and create plots. I recommend you read through the template the first time you open a new R Markdown file, as it contains useful information. However, it can also be a little overwhelming for new users. So we are going to delete everything after the metadata and second set of three dashes (i.e., after the YAML header).

    [Figure: Keep the YAML header; delete the template]

  4. Familiarize yourself with R Markdown by adding some headers, text, and an R code chunk.

    1. Hit <Enter> to leave a blank line between the header and the first line of text.
    2. On the next line (line 8), type: ## Part 1 (Assignment 2.1)
      • In the markdown document, two hash marks specify a second-level text heading.
      • Note: This is different from the R Script file, which only contains R code. Recall, a hash mark transforms code into a comment that is not evaluated or run in an R Script file (and in an R code chunk in R Markdown, which you will learn about soon).
      • Also note that a space is required between the two hash marks and your text. If you do not include the space, R Markdown will not recognize it as a text heading (if you want to see for yourself, leave out the space between the hash marks and your text and try knitting the file).
    3. Hit <Enter> and, on the next line (line 9), type: ### Learning R Markdown
      • Three hash marks specify a third-level text heading.
    4. Hit <Enter> to leave another blank line (line 10).
    5. On the next line (line 11), type the following sentence: “This R Markdown document contains my work for Assignment 2. It is my work and only my work.”
      • To italicize, place a single asterisk (*) before and after the word or text segment.
      • To bold, place two asterisks (**) before and after the word or text segment.
      • To bold and italicize, place three asterisks (***) before and after the word or text segment.
      • There are a lot of sources online that explain various formatting options in R Markdown. For examples, check out here, here, here, here, and here. Also, check out Nicholas Tierney’s bookdown for descriptions of and solutions to common problems with R Markdown.
  5. Before typing anything else, save your new R Markdown file in your “LastName_CRM495_work” folder. Name the file: LastName_CRM495_RAssign2_RMD_YEAR_MO_DY

  6. Your RStudio session should now look similar to this:

    [Figure: Your first R Markdown file]

  7. Ready for one of the best parts of R Markdown? No more pasting screenshots into Word! Instead, you are going to use the “Knit” button at the top of your R Markdown file to automatically create an HTML document capturing your current work.

    1. Click the “Knit” button.
    2. An HTML document should pop up that looks a lot like this:
[Figure: Your first knitted html document using R Markdown]

Part 2 (Assignment 2.2)

Goal: Edit YAML Header for TOC & Theme

In addition to meeting the various goals required to reproduce original research findings, you should also work on creating documents that are functionally organized, easily navigable, and aesthetically pleasing. We will make some solid strides toward this goal throughout the class. However, what we will cover will only scratch the surface of what you can do with R and R Markdown. For additional inspiration, I strongly recommend checking out Alison Hill’s excellent blog entry: “How I Teach R Markdown”.

In this section, we will highlight two simple modifications that you can make to the YAML header to improve the functionality and appearance of your final knitted document. Remember, the YAML header is the section of text that appears at the very top of your R Markdown file between the two lines containing three dashes (“---”). When you create a new R Markdown file, the YAML header looks like this:

[Figure: Default YAML header in new R Markdown file]

Here, we will modify the YAML header directly to change the appearance or Bootswatch theme, add a table of contents, and knit to different file formats.

The easiest way to get started with editing YAML headers is to simply copy others that work. So, for example, check out the YAML header used for this project assignment in the screenshot below:

[Figure: YAML header used for this Project Assignment 2 HTML document]

There are several noteworthy things about this header in comparison to the default YAML header:

  1. Unlike the default, this header contains multiple layers of indentation. For instance, “html_document:” is indented one level, and everything from “number_sections:” to “code_folding” is indented two levels.
    • Indentation matters in YAML headers! If your document is not knitting the way you want or not including things that you expected it to include, you may have messed up the indentation.
    • See here for more detailed information on how to properly structure a YAML header.

  2. If you want to automatically create numbered sections, change “number_sections: no” to “number_sections: yes”.

  3. This assignment uses the “readable” theme. The default theme is - you guessed it - named “default.” There are several built-in themes that you can choose from (see the link for previews), and many more themes that you can install and use with some additional effort.

  4. The “toc: yes” line under “html_document” adds a table of contents to our knitted HTML document.
    • The “toc_depth: 4” line specifies that the table of contents should apply to four (sub)heading levels (e.g., # to ####).
    • The “toc_float: yes” line specifies that the table of contents should float (at the upper left) in the knitted HTML document. A floating table of contents is always visible when scrolling the document.

  5. Recall, if you set your R code chunks to echo = TRUE, then your code will be shared in the knitted document. The line “code_folding: hide” will hide your shared code. In other words, your code will still be shared in the knitted HTML document but readers can choose whether to click a toggle button to see your code.
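Since the screenshot is not reproduced here, the header below is a minimal sketch that is consistent with the options described in the list above. The title and author are the example values from Part 1. The date value and the exact order of the options are assumptions for illustration, so your own header may differ.

```yaml
---
title: "CRM 495: R Assignment 2"
author: "Jake Day"
date: "2022-01-21"            # assumed example date; use your own
output:
  html_document:              # indented one level under output:
    number_sections: no       # change to yes for numbered sections
    theme: readable           # a built-in theme; "default" is the default
    toc: yes                  # add a table of contents
    toc_depth: 4              # include headings from # to ####
    toc_float: yes            # keep the table of contents visible while scrolling
    code_folding: hide        # code is included but collapsed behind a toggle
---
```

Each level of nesting is indented by two spaces (not tabs). If a line such as “toc: yes” is not indented under “html_document:”, the option will likely be ignored or the file may fail to knit.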

Now you should have all the basics that you need to create, organize, and slightly customize your own knitted HTML file for your reproduction project!

  1. Change the theme from the default to something else.
    • Choose one of the built-in themes for your R Markdown file by changing the YAML header.
    • Feel free to edit other characteristics of the theme to make your knitted html file look pretty. Just be mindful of proper YAML formatting (e.g., indenting) as mentioned above.
    • Here is how I changed the YAML header by choosing the “journal” theme.
[Figure: Edit Theme in YAML]

Part 3 (Assignment 2.3)

Goal: Add a code chunk and create an object within R Markdown.

Now you have the basics of writing text in an R Markdown document. But remember, one of the benefits of R Markdown is that you can integrate your text with R code. You already know how to run code in an R Script. It is the same process in R Markdown, except you need to add an “R code chunk” into your file.

  1. Create a second-level header in your R Markdown (hereafter, “RMD”) file titled: “Load Libraries”

  2. Insert an R chunk

    1. Click “Code > Insert chunk” or click the “insert code chunk” button (green box with a “C” in it) and select the “R” option (see below).
    2. Note that on a Windows computer, you can also insert a code chunk by pressing the “Ctrl”, “Alt”, and “I” keys simultaneously (on a Mac, it’s “Command”, “Option”, and “I”).
      [Figure: Insert R code chunk]

  3. Inside the new R code chunk, load the same packages that you did in Assignment 1: library(tidyverse)

Recall, you only need to install packages one time. However, you must load them each time you start a new R session.
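The difference between installing and loading can be sketched as follows. For the assignment itself you only need to type library(tidyverse). The guard and the tidyverse_installed variable are hypothetical additions for illustration, so that the example runs without an error even if the package is missing.

```r
# One-time step (assumed already done in Assignment 1); left commented out
# so that re-running this chunk does not reinstall the package.
# install.packages("tidyverse")

# Every new R session: check that the package is available, then load it.
# tidyverse_installed is a hypothetical helper name, not part of the assignment.
tidyverse_installed <- requireNamespace("tidyverse", quietly = TRUE)

if (tidyverse_installed) {
  library(tidyverse)
} else {
  message("tidyverse is not installed; run install.packages(\"tidyverse\") once.")
}
```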

  1. After your first code chunk, create another second-level header in RMD titled: “Create Data Object”

  2. Insert another R code chunk

  3. In the new R code chunk, create a data object called arrests.data using another data set built into R called USArrests.

arrests.data <- USArrests
  • In the above code, the text arrests.data is the name we are giving to the data object we are creating.
  • The <- assignment operator tells R to place whatever follows the text arrow (in this case, the built-in dataset “USArrests”) into the arrests.data object.
    • Note: We can name the object whatever we want (e.g., mydata). However, it is good coding practice to be systematic in creating concise yet informative names (this is somewhat redundant in this example as the built-in data set already had an informative name).
    • Also, R code is case-sensitive, so be careful and consistent with using upper- and lower-case letters!
    • By this point, you may be wondering why I care so much about how you name your folders and files. The short answer is that thoughtful and systematic naming conventions can save you a lot of time and can be very helpful to your future self or to others attempting to reproduce your work. In contrast, ad hoc names can cause a great deal of unnecessary frustration. I recommend Danielle Navarro’s videos on Names machines like and Names humans like (we’ll cover this in more detail in a future R assignment).
  1. In the same code chunk, on a new line below your assignment command, type head(arrests.data).
    • This will print out the first six rows of the arrests.data object. If you wanted to print out more than the first six rows, you could simply add a comma and then type the number of rows you wanted to output (e.g., head(arrests.data, 10)). This can be useful when you perform some operation on the data (e.g., create a new variable) and want to get a quick sense of whether it worked how you intended (e.g., actually created that new variable). If you wanted to print the entire data set, you would simply type print(arrests.data).
  2. Your RStudio session should now look a lot like this:
[Figure: Create a data object in R]
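As a minimal sketch of what the “Create Data Object” chunk does, the lines below use only base R (USArrests ships with R), so they run without loading tidyverse. The names first_six and first_ten are hypothetical helper objects added for illustration; the assignment only asks for arrests.data and head(arrests.data).

```r
# Copy the built-in USArrests data set into a new object.
arrests.data <- USArrests

# head() returns the first six rows by default ...
first_six <- head(arrests.data)

# ... or the first n rows when a second argument is supplied.
first_ten <- head(arrests.data, 10)

# Print the first six rows, as head(arrests.data) does in the assignment.
print(first_six)

# Confirm how many rows each helper object holds.
nrow(first_six)
nrow(first_ten)
```

Because R is case-sensitive, typing Arrests.data or arrests.Data after running this code would produce an “object not found” error unless you had separately created an object with that exact name.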

Part 4 (Assignment 2.4)

Goal: Create a plot and write analytical notes within R Markdown.

Now that you have created an object, you can use and/or modify that object in a variety of ways. For now, we’ll simply create a plot like we did in the previous assignment. If you’ve followed along to this point, you’ll know that the arrests.data object has 50 observations (one for each state) and four variables (“Murder”, “Assault”, “UrbanPop”, and “Rape”). For the crime types, the values represent the number of arrests for that crime per 100,000 residents. For the “UrbanPop” variable, the values represent the percent of people in that state classified as living in an urban location.
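If you would like to verify the size and contents of the object yourself, a quick check along these lines should work. It is a sketch using base R functions only and is not a required part of the assignment.

```r
# Recreate the object so this example is self-contained.
arrests.data <- USArrests

# Number of rows (observations) and columns (variables).
dim(arrests.data)

# Names of the variables, in the order they appear in the data.
names(arrests.data)

# Storage type of each variable and a preview of its values.
str(arrests.data)

# Five-number summary plus the mean for percent urban population.
summary(arrests.data$UrbanPop)
```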

  1. Create a second-level header titled: “Create Plot”.

  2. Create a plot of the relationship between percent urban population and murder rate.
    • Create an R code chunk and type the following code:

ggplot(data = arrests.data, 
       mapping = aes(x = UrbanPop, y = Murder)
) + 
  geom_point() + 
  geom_smooth(method = lm)

  1. Under the plot, write a brief note regarding your substantive interpretation of the plot (e.g., what do the results suggest about the relationship between urbanicity and murder?).

  2. Create another second-level header titled: “Create Another Plot”.

  3. Create a plot of the relationship between urban population and Assault.
    • I’m going to let you figure this out on your own by following the logic of the Murder plot above.

  4. Under this plot, write a brief note regarding your substantive interpretation of the plot.

  5. Create another second-level header titled: “Conclusion”.
    • In the conclusion, write a brief statement about what you learned in this assignment and any problems or issues you had in completing it.

Part 5 (Assignment 2.5)

Goal: Submit your assignment

  1. Upon completing the tasks in the previous four sections, “knit” your final RMD file again and save the final knitted html document to your “Assignments” folder in your LastName_CRM495_work folder as: LastName_CRM495_RAssign2_YEAR_MO_DY.

  2. Inside the “LastName_CRM495_commit” folder in our shared folder, create another folder named: Assignment 2.

  3. To submit your assignment for grading, save copies of both (1) your “RAssign2” html file and (2) your “RAssign2_RMD” file into the LastName_CRM495_commit > Assignment 2 folder. Remember, be sure to save copies of both files - do not just drag the files over from your “work” folder, or you may lose those original copies from your “work” folder.

  4. Finally, submit your knitted html document on Canvas in the “R Assignment 2” submission portal. This will allow me to have a time-stamped version of your assignment for grading purposes.