The primary purpose of this first project assignment is to find an article on a topic of interest to you that has data available online via ICPSR (or another repository); eventually, you will be required to use these data in an attempt to reproduce a basic descriptive finding reported in a table or figure from the article. A secondary purpose of this assignment is to develop a sense of how (un)common it is to find research articles in the top journals of your field for which the authors have openly shared their data and code for reproducibility purposes. For more information about why and how to share data/code and other best practices for conducting reproducible research, check out here, here, here, and here.
At this point, I assume you are familiar with RStudio and with creating R Markdown (RMD) files. If not, please review R Assignments 1 & 2.
Additionally, I assume you know how to search for articles and then access them where available; there are numerous ways to do this. If it were me, I might start by searching for articles using the search function on each journal’s webpage. Upon finding articles of interest, I might then search for the specific articles using the title, author, and/or keywords in Google Scholar; you can also do an “advanced search” for specific topics published in specific journals. If you decide to use Google Scholar, I recommend adding your university to the “library links” setting for quick authentication and access via your library. Note that the linked instructions are written for IUPUI; you will want to select your own university instead so that Google Scholar generates the proper authentication links. Finally, I assume you know how to properly format an article citation (e.g., using APA or Chicago style); Google Scholar can also help you with getting the full citation.
To add a hyperlink in your R Markdown file, put the link text in square brackets [] followed immediately by parentheses () containing the link. For example, [Brauer, Day, & Hammond 2019](https://journals.sagepub.com/doi/10.1177/0049124119826158) generates the following: Brauer, Day, & Hammond 2019
It’s worth familiarizing yourself with the ICPSR webpage for a specific dataset, as it has some pretty standard things that can be useful. Here, we’ll use Wave 1 of the National Youth Survey from 1976. You’ll learn more about the NYS data in your next R assignment. For right now, I just want to walk you through a typical ICPSR landing page.
The “At A Glance” tab provides a basic overview of the data, including a summary, citation, funding sources, and restrictions, as well as some basics about the scope and methodology of the project. Another potentially useful thing for students is the “Data-related Publications” tab:
This provides a list of the publications that have used Wave 1 of the NYS data series. It can point you to studies that may be of interest for replicating or, if you are trying to use these data for original research, show you what has already been done with them. There is more to peruse on each dataset’s ICPSR page, but for this class, an important feature is the “Data & Documentation” tab. This is where you can actually download the data.
If you wanted to download the data, along with the codebook, setup files, and anything else associated with the dataset (this is not necessary for this assignment), you would click on the download icon.
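If you do eventually download the data, a minimal sketch for reading it into R might look like the following. It assumes you chose the R-format download and ended up with an .rda file; the function name load_icpsr_rda and the file path in the usage comments are hypothetical placeholders for illustration, not something this assignment or ICPSR provides.

```r
# Hypothetical helper (not part of the assignment): load an ".rda" file
# and return the first data frame stored inside it.
load_icpsr_rda <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  # Load into a separate environment so nothing in your workspace is overwritten.
  env <- new.env()
  loaded_names <- load(path, envir = env)  # load() returns the names of the restored objects
  objects <- mget(loaded_names, envir = env)
  # Keep only objects that are data frames and return the first one.
  is_df <- vapply(objects, is.data.frame, logical(1))
  if (!any(is_df)) {
    stop("No data frame found in: ", path)
  }
  objects[[which(is_df)[1]]]
}

# Example usage (the path below is a placeholder, not a real file name):
# nys_w1 <- load_icpsr_rda("path/to/downloaded-file.rda")
# dim(nys_w1)     # number of rows (respondents) and columns (variables)
# names(nys_w1)   # variable names, to check against the codebook
```

Loading into a separate environment is a deliberate choice here: it lets you assign the data to a name of your choosing rather than whatever object name happens to be stored in the file.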