Assumptions & Ground Rules

The primary purpose of this first project assignment is to familiarize yourself with the National Youth Survey (NYS) and to begin developing ideas for research topics or questions that you will eventually examine in a replication study.

  • Because I’m requiring you to use the NYS data for your replication and reproducibility project, I want you to identify topics that interest you and that can potentially be examined using those data. Once you know what you could do with the NYS, a subsequent assignment will ask you to identify specific research on your topic, including published studies that have used the NYS.

At this point, all I assume is that you are familiar with RStudio and with creating R Markdown (RMD) files. If not, please review R Assignments 1 & 2. Prior experience reviewing a codebook or variable information on ICPSR will be helpful but is not necessary, as I will try to walk you through those tasks in this assignment.

Part 1: Introducing the National Youth Survey

Goal: Provide a brief introduction to the NYS

The National Youth Survey, overseen by Delbert Elliott, was one of the first national-level longitudinal studies specifically designed to measure self-reported crime, and it has been a popular data source for many high-profile criminological studies. ICPSR currently lists 412 publications in total. The study was originally designed to examine youth between the ages of 11 and 17, and it followed them for a total of five years, from 1976 to 1980. Subsequent waves were collected at varying intervals (e.g., 1983, 1986, and 1992), and the study is now called the National Youth Survey - Family Study (also see here).

ICPSR makes the first seven waves of data publicly available, and any one of these waves, or a series of them, will serve as the data for your replication and reproducibility project for this class.

Here is the general description of the series from ICPSR:

For this series, parents and youth were interviewed about events and behavior of the preceding year to gain a better understanding of both conventional and deviant types of behavior by youths. Data were collected on demographic and socioeconomic status of respondents, disruptive events in the home, neighborhood problems, parental aspirations for youth, labeling, integration of family and peer contexts, attitudes toward deviance in adults and juveniles, parental discipline, community involvement, drug and alcohol use, victimization, pregnancy, depression, use of outpatient services, spouse violence by respondent and partner, and sexual activity. Demographic variables include sex, ethnicity, birth date, age, marital status, and employment of the youths, and information on the marital status and employment of the parents.

In what follows, I will try to walk you through the ICPSR website, from which you will ultimately download the data for your project. For the moment, you will only need to access the basic description, variable list, and codebook for the data.

Part 2: ICPSR and the NYS Data Series

Goal: Familiarize yourself with ICPSR and the NYS series landing page

It’s worth familiarizing yourself with the ICPSR webpage for a specific dataset or a series of datasets, as these pages contain a fairly standard set of elements that can be useful. Here, we’ll start with the NYS Series page and then review the specifics for Wave 1 of the National Youth Survey from 1976.

Here is the landing page for the NYS “series” of data: