Assumptions & Ground Rules

Since you have already completed Project Assignment #1, I assume you have found an article on a topic of interest to you and that: (1) the article's data are available online via ICPSR (or another repository); (2) you have already downloaded the data and, if the data are on ICPSR, have done so with reproducible R code using the icpsrdata package (if so, include this script in your code for this assignment); and (3) the article contains basic descriptive findings reported in a table and/or a figure that you can reproduce using the available data.

Moreover, since you have already completed R Assignments #1 through #4, I assume that you are familiar with: (1) RStudio; (2) creating organized, descriptive, and reproducible R Markdown (RMD) files; (3) (re)producing tables in R using the gt package; and (4) (re)producing basic figures using the ggplot2 package. If not, please review R Assignments #1 through #4.

Project Assignment #2

Goal: Start Drawing the Owl - Describe Reproduction; Get & Read Data; Summarize Raw Variables

Unlike your previous assignments, this assignment will not be organized into various numbered “parts” for you to follow and complete, and you should not organize it as a numbered list by following the numbered items below. Rather, this is your chance to demonstrate your own creativity and show what you have learned by crafting your own RMD file as you see fit. The final knitted file should contain the following (again, not numbered like below):

  1. A description of the article, data source, and specific findings that will be reproduced, along with a justification for the reproduction. As with the rest of the document, this should be professionally written - think of it like the introductory section of a published replication/reproduction article that must describe the original research and justify the replication/reproduction research.

  2. The table(s) and/or figure(s) found in the original study that you plan to reproduce should be included as an image(s). You have not yet learned how to do this in R Markdown, so I will give a brief and very basic tutorial below.

  3. Following your description/justification of the reproduction, you should move into the reproduction itself. The first step should be to download the data and explain the process in sufficient detail so that others can easily and accurately reproduce your work.

    • If you are downloading data from ICPSR, do it using reproducible R code with the icpsrdata package. With this approach, you should also add R code that automatically creates a unique data folder (e.g., “Project_Data”) for this project as a subfolder in a working directory created specifically for this project (e.g., “Project_work/Project_Data”) before downloading, then use the icpsrdata package to download to your new folder. Recall that this process ensures that anyone who runs your RMD file on their own computer will automatically create the appropriate subfolder and then download the data to the correct folder as well. (If you do not remember how to do this, go back and review your earlier R Assignments.)
    • If the data are not on ICPSR, you should download them manually and save them in a unique data subfolder (e.g., “Project_Data”) within a working directory created specifically for this project (e.g., “Project_work/Project_Data”). Then, you should describe the exact procedures that others will need to follow to be able to reproduce your work. E.g., you will need to describe where and how to download the data and then where to save them (e.g., in a specifically titled subfolder within the working directory containing your RMD file). If you were actually going to publish this and share supplementary materials, you would likely share a “zip” file containing your project’s working directory with the RMD file in it as well as the datafile saved within the necessary pre-titled subfolder (and you will be doing exactly this for Project Assignment #3). Still, it is important to provide detailed and careful descriptions of these procedures to ensure that others can easily and accurately reproduce your work.
  • Again, the point here is not to treat this like an assignment; rather, you should treat it like you are drafting an article in R Markdown, complete with code and detailed descriptions throughout that you would also submit as supplementary materials to accompany the trimmed-down and polished published article.
  4. Read the data into R and save it as an object.

  5. Identify and describe all the variables (i.e., “key variables”) needed to reproduce the original article’s table(s) or figure(s).

  6. Summarize the raw versions of the key variables or items (e.g., attributes/response labels; summary/descriptive statistics) using packages and commands that you learned in earlier assignments.

    • There are various ways to explore your data and view response labels (e.g., recall using attr() to view a variable’s “labels” attribute?).
    • For descriptive statistics, we recommend you use the sjmisc package, which you learned about in R Assignment 4. Recall, using that package and tidyverse, you can easily generate frequency and descriptive statistics tables (e.g., mydata %>% frq(myvar1) or mydata %>% descr(myvar1, myvar2)).
    • Note: For this assignment, you are not required to recode the variables (e.g., using mutate), even if recoding is needed for the reproduction, or to create polished versions of descriptive tables (e.g., with the gt package) or figures (e.g., with the ggplot2 package), though you should feel free to do so if desired. You will be required to do those things in your completed “first draft” to submit for peer review in the next assignment (Project Assignment #3).
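
As a rough illustration of the download, read, and summarize steps listed above, here is a minimal sketch in R. It assumes your working directory is the project folder (e.g., “Project_work”), and it uses a small toy data frame in place of real study data, so the object name mydata, the variables myvar1 and myvar2, and the value labels are all hypothetical stand-ins. The ICPSR download call is left commented out because it requires an ICPSR account and a real study number; the number shown is only a placeholder, not a real study.

```r
# Sketch only: folder name follows the example in the text; data are made up.

# Create the project data subfolder (relative to the working directory)
# only if it does not already exist.
data_dir <- "Project_Data"
if (!dir.exists(data_dir)) {
  dir.create(data_dir)
}

# Download from ICPSR into the new folder (not run here).
# Replace the placeholder study number with the one for your article's data.
# library(icpsrdata)
# icpsr_download(file_id = 12345, download_dir = data_dir)

# Toy stand-in for the object you would create by reading the downloaded file
# (for example, with haven::read_dta() or load(), depending on the file format).
mydata <- data.frame(
  myvar1 = c(1, 2, 2, 3, NA, 1),
  myvar2 = c(23, 35, 41, 29, 52, 38)
)

# Attach value labels in the "labels" attribute, as labelled survey data often do.
attr(mydata$myvar1, "labels") <- c(Low = 1, Medium = 2, High = 3)

# View the response labels for a raw variable.
print(attr(mydata$myvar1, "labels"))

# Base R frequencies (keeping missing values visible) and descriptive statistics.
freq_tab <- table(mydata$myvar1, useNA = "ifany")
print(freq_tab)
print(summary(mydata$myvar2))

# The same kind of summaries with sjmisc, run only if the package is installed.
if (requireNamespace("sjmisc", quietly = TRUE)) {
  print(sjmisc::frq(mydata, myvar1))
  print(sjmisc::descr(mydata, myvar1, myvar2))
}
```

In your own document, you would replace the toy data frame with the object created by reading your downloaded file, and describe each step in text around the code chunks.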

The final document should be well-organized using leveled subheadings (e.g., ## Top Heading, ### Subheading 1, etc.) with procedures thoroughly described in text (i.e., not in headings). Additionally, I recommend adding a table of contents (or perhaps a floating table of contents), which you can do by modifying the YAML header at the top of your R Markdown file. Finally, you might wish to select your own theme to personalize the aesthetic look of your final knitted document, which you can also do by modifying the YAML header. You have not yet learned how to do this in R Markdown either, so I will also give a brief and very basic tutorial on this below.

Upon completing the assignment, “knit” your final RMD file and save the final knitted HTML document to your “Assignments” folder in your LastName_P680_work folder as: LastName_P680_ProjAssign2_YEAR_MO_DY.

  • Inside the “LastName_P680_commit” folder in our shared folder, create another folder named: Project Assignment 2.
  • To submit your assignment for grading, save copies of both (1) your “ProjAssign2” HTML file and (2) your “ProjAssign2” RMD file into the LastName_P680_commit > Project Assignment 2 folder.

Brief Tutorial #1: Save & Load Simple Image in RMD

Getting a basic image into R Markdown is relatively easy. The steps involve: (1) creating an “Images” subfolder within your working directory (i.e., “work” folder); (2) getting the image you want; (3) editing the image and saving it in your new “Images” subfolder within your working directory; (4) writing a code chunk to load the image in R Markdown. We will briefly go through each of these steps below.

  1. Before getting your image from the published article you selected, you will need a directory in which you can save the image and from which you can load the image in R Markdown. Once you have a subfolder within your working directory, you can easily load an image within this folder using an R code chunk in R Markdown; in fact, the process is quite similar to saving and loading a dataset from your “Datasets” subfolder, which you have done several times by now. However, rather than using your “Datasets” subfolder, we recommend instead creating a new directory (i.e., folder) entitled “Images” within your root “work” folder. Recall that if you opened RStudio directly from an RMD file in your working directory (i.e., in your “work” folder), then you can easily add an “Images” subfolder to your “work” folder with an R code chunk, using the R code below (Note: We used something called “code folding” here - more on that later; simply click the “Code” button to see it):
# Load the here package so that here() builds paths from the project root.
library(here)
# Check if "Images" folder exists (TRUE if it does) & create it if it does not exist.
ifelse(dir.exists(here("Images")), TRUE, dir.create(here("Images")))
## [1] TRUE
  2. We assume you know how to take and crop a screenshot, or to copy-and-paste, the image(s) or table(s) that you wish to reproduce from the original published article that you have chosen.

  3. As for editing the image, one easy way to do this is simply to paste the image into a new PowerPoint slide. From here, you can edit the image however you wish, then PowerPoint’s “save as picture” option should result in a saved image that retains your edits.

  • When editing, I (Jon) like to use subtle effects such as the “center shadow rectangle,” which gives some depth to your image without being overly distracting.