In this phase, you will complete a rough draft of your replication and reproducibility project.

Unlike your previous assignments, this assignment will not be organized into numbered “parts” for you to follow and complete, and you should not organize your document as a numbered list that mirrors the numbered items below. Rather, this is your chance to demonstrate your own creativity and show what you have learned by crafting your own RMD file as you see fit. The final document should be well-organized using leveled subheadings (e.g., ## Top Heading, ### Subheading 1, etc.) with procedures thoroughly described in text (i.e., not in headings). Additionally, I recommend adding a table of contents (or perhaps a floating table of contents), which you can do by modifying the YAML header at the top of your R Markdown file. Finally, you might wish to select your own theme to personalize the look of your final knitted document, which you can also do by modifying the YAML header. You have not yet learned how to do either of these in R Markdown, so I will give a brief and very basic tutorial below.
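
As a minimal sketch of the header options mentioned above, assuming HTML output: the comments show one possible YAML header, and the R call builds the same output format with the rmarkdown package. The "flatly" theme and the file name "my_project.Rmd" are illustrative choices only, not requirements.

```r
# One possible YAML header for the top of the RMD file
# (theme name chosen only for illustration):
#
# ---
# title: "My Replication Project"
# output:
#   html_document:
#     toc: true
#     toc_float: true
#     theme: flatly
# ---

library(rmarkdown)

# The same options expressed as an R output-format object.
fmt <- html_document(toc = TRUE, toc_float = TRUE, theme = "flatly")

# Knitting from the console with that format would look like this;
# "my_project.Rmd" is a hypothetical file name, so the call is not run here.
# render("my_project.Rmd", output_format = fmt)
```

Most people will only need the YAML version; the R call is shown mainly to make clear which options are being set.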

You have already drafted some of the following parts of this assignment in previous Project Assignments (e.g., Project Assignments #3 and #4). However, you should add to and expand on those parts where appropriate (e.g., description of article, data source, and findings, and justification of replication or reproduction).

  • Remember, the ultimate goal for this project is to produce a clear and coherent direct reproduction or conceptual replication of a published article. The final version should read like a professional article or blog that justifies and describes your reproduction or replication.

Required Contents

The final knitted file should contain the following (again, not necessarily numbered like below):

  1. A description of the article, data source, and specific findings that will be reproduced, along with a justification for the reproduction. As with the rest of the document, this should be professionally written - think of it like the introductory section of a published replication/reproduction article that must describe the original research and justify the replication/reproduction research.

  2. The table(s) and/or figure(s) found in the original study that you plan to reproduce should be included as image(s).

  3. Following your description/justification of the reproduction, you should move into the reproduction itself. The first step should be to download the data and explain the process in sufficient detail so that others can easily and accurately reproduce your work.

    • If you are downloading data from ICPSR, do it using reproducible R code with the icpsrdata package. With this approach, you should also add R code that automatically creates a unique data folder (e.g., “Project_Data”) for this project as a subfolder in a working directory created specifically for this project (e.g., “Project_work/Project_Data”) before downloading, then use the icpsrdata package to download to your new folder. Recall, this process ensures that anyone who runs your RMD file on their own computer will automatically create the appropriate subfolder and then download the data to the correct folder as well. (If you do not remember how to do this, go back and review your earlier R Assignments.)
    • If the data are not on ICPSR, you should download it manually and save it in a unique data subfolder (e.g., “Project_Data”) within a working directory created specifically for this project (e.g., “Project_work/Project_Data”). Then, you should describe the exact procedures that others will need to follow to be able to reproduce your work. E.g., you will need to describe where and how to download the data and then where to save it (e.g., in a specifically-titled subfolder within the working directory containing your RMD file). It is important to provide detailed and careful descriptions of these procedures to ensure that others can easily and accurately reproduce your work.
    • Again, the point here is not to treat this like an assignment; rather, you should treat it like you are drafting an article in R Markdown, complete with code and detailed descriptions throughout that you would also submit as supplementary materials to accompany the trimmed-down and polished published article.
  4. Read the data into R and save it as an object.

  5. Identify and describe all the variables (i.e., “key variables”) needed to reproduce the original article’s table(s) or figure(s).

  6. Summarize the raw versions of the key variables or items (e.g., attributes/response labels; summary/descriptive statistics) using packages and commands that you learned in earlier assignments.

    • For descriptive statistics, we recommend you use the sjmisc package, which you learned about in R Assignment 4. Recall, using that package and tidyverse, you can easily generate frequency and descriptive statistics tables (e.g., mydata %>% frq(myvar1) or mydata %>% descr(myvar1, myvar2)).
  7. Recode all the variables (e.g., using mutate) if that is needed for reproduction.

  8. Create polished and publication-ready versions of the table(s) and/or figure(s) you are reproducing or replicating (e.g., with the gt package for tables and the ggplot2 package for figures).

  9. Interpret or reinterpret the results of your reproduction or conceptual replication independently and in relation to the original article.

  10. A conclusion section where you describe any issues you encountered with reproducing or replicating the results of the published article. 
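
Steps 3 through 8 above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame is made up so the code runs without an ICPSR account, the variable names (myvar1, myvar2) are placeholders, and 12345 is not a real study number. Your own document should download and read the actual study data and describe each step in text.

```r
library(dplyr)
library(sjmisc)
library(gt)
library(ggplot2)

# Step 3: create the project data subfolder if it does not already exist.
data_dir <- file.path("Project_work", "Project_Data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# ICPSR download (not run here): requires ICPSR login credentials, and
# 12345 is a placeholder study number, not the one for your article.
# icpsrdata::icpsr_download(file_id = 12345, download_dir = data_dir)

# Step 4: stand-in data so this sketch runs; replace it with a call that
# reads your downloaded file from data_dir.
mydata <- tibble(
  myvar1 = c(1, 2, 2, 3, 1, 3, 2, NA),
  myvar2 = c(18, 25, 31, 47, 52, 29, 36, 40)
)

# Step 6: frequency and descriptive tables for the raw variables.
mydata %>% frq(myvar1)
mydata %>% descr(myvar2)

# Step 7: recode with mutate(), keeping the raw variables intact.
mydata_rc <- mydata %>%
  mutate(
    myvar1_f = factor(myvar1, levels = 1:3,
                      labels = c("Low", "Medium", "High")),
    age_grp  = if_else(myvar2 >= 30, "30 or older", "Under 30")
  )

# Step 8: summarize, then build a table (gt) and a figure (ggplot2).
tab_data <- mydata_rc %>%
  filter(!is.na(myvar1_f)) %>%
  count(myvar1_f, name = "n") %>%
  mutate(pct = round(100 * n / sum(n), 1))

tab1 <- tab_data %>%
  gt() %>%
  tab_header(title = "Table 1. Distribution of myvar1 (illustrative data)") %>%
  cols_label(myvar1_f = "Category", n = "N", pct = "Percent")

fig1 <- ggplot(tab_data, aes(x = myvar1_f, y = pct)) +
  geom_col() +
  labs(x = "Category", y = "Percent",
       title = "Figure 1. Distribution of myvar1 (illustrative data)") +
  theme_minimal()

tab1
fig1
```

Keeping the recoded variables under new names (here myvar1_f and age_grp) makes it easy to compare raw and recoded versions when checking your numbers against the original article.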

Submit Assignment

Upon completing the assignment:

  1. “knit” your final RMD file to html format and save it using an informative file name (e.g., “LastName_CRM495_RR-Project-Phase5_YEAR_MO_DY”) within a file structure you create for this assignment (e.g., “LastName_CRM495_RR-Phase4”).

    • Note: See “R Assignment #4” for details on creating a reproducible file structure.
  2. Submit your knitted html file on Canvas.

  3. Place a copy of your root folder in your LastName_495_commit folder on OneDrive.

    • Note: The root folder should contain your reproducible file structure for this assignment. This means it should include your image files and anything else necessary to reproduce your knitted html document with “one click.”

Evaluation Criteria

Some things I will look for when evaluating/providing feedback:

  • Description/Justification: Does the author describe and justify the reproduction/replication project aims clearly and effectively? Is the original study included in one of the project folders? Can I find the table or figure in the original study that the author is attempting to reproduce? Are the original study and that specific table/figure described clearly and accurately? Does the author link to and cite the published study in the RMarkdown file?

  • Description of procedures and results throughout the document: Do you write about what you are doing throughout the markdown file? In other words, are you describing what you are doing and what you are finding in the data (e.g., Are your descriptives the same as those in the original article, or are they different? Is there anything unclear in the original article that makes it difficult to decipher exactly how they coded a variable?)? See R Assignments #5, #6, and #7 Walkthrough/Instructions for some examples of this.

  • Project File Structure: Does the root folder for the assignment contain a reproducible file structure? Are there separate and clearly marked folders following best practices (e.g., Data; Articles; Images)? Can I open the RMarkdown file and run it with one click?

  • R Code Reproducibility: After installing any necessary packages, can I successfully run all R Code chunks, or does running the code generate errors? If errors are generated, is it immediately obvious what those errors are, and can I fix them with minimal effort to continue the review of R Code chunks?
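
One way to set up the kind of folder structure described above is sketched below. The root folder name and the image file name are hypothetical placeholders; the subfolder names follow the examples given in the criteria (Data; Articles; Images).

```r
# Hypothetical root folder name; substitute your own.
root <- "LastName_CRM495_RR-Project"

# Subfolder names taken from the examples in the evaluation criteria.
subfolders <- c("Data", "Articles", "Images")

# Create each subfolder (and the root, if needed) without warning when
# the folders already exist.
for (folder in file.path(root, subfolders)) {
  dir.create(folder, recursive = TRUE, showWarnings = FALSE)
}

# Build paths with file.path() rather than typing absolute paths, so the
# same code works on another person's computer.
img_path <- file.path(root, "Images", "original_table1.png")  # hypothetical file

# Include the original table image only if it has been saved there.
if (file.exists(img_path)) {
  knitr::include_graphics(img_path)
}
```

Because every path is built relative to the root folder, a reviewer who copies the whole folder can knit the document without editing any paths.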