The purpose of this fifth assignment is to help you use R to complete some of the Assignment 5 exercises adapted from the SPSS Exercises at the end of Chapter 4 in Bachman, Paternoster, & Wilson’s Statistics for Criminology & Criminal Justice, 5th Ed.

This chapter focused on measures of central tendency (e.g., mean, median, and mode) and their advantages and disadvantages as single statistical descriptions of a data distribution. Likewise, in this assignment, you will learn how to use R to calculate measures of central tendency and other statistics (e.g., skewness; kurtosis) that help us standardize and efficiently describe the shape of a data distribution. You will also get additional practice with creating frequency tables and simple graphs in R. As with previous assignments, you will be using R Markdown (with R & RStudio) to complete and submit your work.

- know how to use the base R `head()` function to quickly view a snapshot of your data
- know how to use the `glimpse()` function to quickly view all columns (variables) in your data
- be able to improve some knitted tables by piping a function’s results to `gt()` (e.g., `head(data) %>% gt()`)
- be able to calculate measures of central tendency for a variable distribution using base R functions `mean()` and `median()`
- know how to calculate central tendency and other basic descriptive statistics for specific variables in a dataset using the `summarytools::descr()` function
- have more practice with using the `$` operator to reference or call a named element from a list or data object, such as a specific variable in a data file (e.g., `mean(data$variable)`)
- have more practice with creating frequency tables using `sjmisc::frq()` and `summarytools::freq()`
- have more practice with generating simple histograms using `ggplot()`
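
As a minimal sketch of the `head()`, `$`, `mean()`, and `median()` objectives above, the chunk below uses only base R. The data frame `toy_data` and its `hours_studied` column are hypothetical stand-ins with made-up values, not part of any course data file.

```r
# Hypothetical data: a tiny stand-in for a real data file (values are made up).
toy_data <- data.frame(
  id = 1:7,
  hours_studied = c(2, 3, 3, 4, 5, 6, 19)
)

# head() shows the first rows (6 by default) for a quick look at the data.
head(toy_data)

# The $ operator pulls one named column (variable) out of the data object.
mean_hours <- mean(toy_data$hours_studied)
median_hours <- median(toy_data$hours_studied)

mean_hours    # 6
median_hours  # 4
```

Note that the single large value (19) pulls the mean above the median. If a variable contains `NA` values, `mean()` and `median()` return `NA` unless you add `na.rm = TRUE`.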

We are building on objectives from Assignments 1-4. By the start of this assignment, you should already know how to:

- create an R Markdown (RMD) file and add/modify text, level headers, and R code chunks within it
- install/load R packages and use hashtags (“#”) to comment out sections of R code so it does not run
- recognize when a function is being called from a specific package using a double colon with the `package::function()` format
- read in an SPSS data file in an R code chunk using `haven::read_spss()` and assign it to an R object using an assignment (`<-`) operator
- use the `$` symbol to call a specific element (e.g., a variable, row, or column) within an object (e.g., dataframe or tibble), such as with the format `dataobject$varname`
- use a tidyverse `%>%` pipe operator to perform a sequence of actions
- knit your RMD document into a Word file that you can then save and submit for course credit
- use `here()` for a simple and reproducible self-referential file directory method
- use `sjPlot::view_df()` to quickly browse variables in a data file
- use `attr()` to identify variable and attribute value labels
- recognize when missing values are coded as `NA` for variables in your data file
- select and recode variables using dplyr’s `select()`, `mutate()`, and `if_else()` functions
- use `summarytools::dfSummary()` to quickly describe one or more variables in a data file
- create frequency tables with `sjmisc::frq()` and `summarytools::freq()` functions
- sort frequency distributions (lowest to highest/highest to lowest) with `summarytools::freq()`
- create basic graphs using ggplot2’s `ggplot()` function
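
If a few of these skills feel rusty, the short base R sketch below may help as a refresher on `attr()`, `NA` values, and the `package::function()` format. The vector `delinquency` and its label text are hypothetical and made up for illustration; in a file read with `haven::read_spss()`, the variable label is stored in the `"label"` attribute in the same way.

```r
# Hypothetical variable with a label attribute; values and label are made up.
delinquency <- c(0, 2, NA, 5, 1)
attr(delinquency, "label") <- "Number of delinquent acts (hypothetical)"

# attr() retrieves a named attribute, such as a variable label.
attr(delinquency, "label")

# is.na() flags missing values; sum() counts how many there are.
n_missing <- sum(is.na(delinquency))

# package::function() calls a function from a named package.
# The stats package ships with R, so nothing needs to be installed here.
med <- stats::median(delinquency, na.rm = TRUE)
```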

*If you do not recall how to do these things, first review
Assignments 1-4.*

Additionally, you should have read the assigned book chapter and reviewed the SPSS questions that correspond to this assignment, and you should have completed any other course materials (e.g., videos; readings) assigned for this week before attempting this R assignment. In particular, for this week, I assume you understand:

- measures of central tendency: mean, median, and mode
- how to recognize a unimodal or a bimodal distribution and to identify the mode of a distribution
- how to calculate the median position from raw data or from grouped data
- how to calculate the arithmetic mean from raw data, grouped data, or a frequency distribution
- how skewness affects measures of central tendency
- comparative advantages and disadvantages of the mean and the median
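
To connect these ideas to R, here is a rough sketch that computes the median position, the mode, and the mean from a frequency distribution by hand. The `scores` vector is hypothetical (made-up values), and base R's `table()` is used only to keep the example self-contained.

```r
# Hypothetical raw scores (made-up values).
scores <- c(1, 2, 2, 3, 4, 5, 25)

# Median position for raw data: (n + 1) / 2.
n <- length(scores)
median_position <- (n + 1) / 2                  # 4, i.e., the 4th ordered value
median_value <- sort(scores)[median_position]   # 3

# Build a frequency distribution: x = score values, f = their frequencies.
freq_table <- table(scores)
x <- as.numeric(names(freq_table))
f <- as.numeric(freq_table)

# The mode is the value with the highest frequency.
mode_value <- x[which.max(f)]                   # 2

# Mean from a frequency distribution: sum(f * x) / sum(f).
mean_from_freq <- sum(f * x) / sum(f)           # 6, same as mean(scores)
```

Here the mean (6) sits well above the median (3) because one extreme score (25) drags it upward, which is the pattern you would expect in a positively skewed distribution. With an even number of cases, the median position falls between two ordered values, and the median is the average of those two values.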

Goal: Create a new RMD file for Assignment 5

*Note:* In the last assignment, you learned how to use `sjmisc::frq()` and `summarytools::freq()` functions to generate frequency tables for variables. You also learned about the `summarytools::dfSummary()` function for quickly summarizing all or a subset of the variables in a data object. Lastly, you learned how to select and recode variables using dplyr’s `mutate()` and `if_else()` functions as well as how to display data in graphs using `ggplot()`. In this assignment, you will decide which measure of central tendency is most appropriate for a given variable, then use frequency tables and R functions to calculate measures of central tendency and other univariate descriptive statistics.
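
As an illustration of using a frequency table to pick a measure of central tendency, the sketch below finds the modal category of a nominal variable. It swaps in base R's `table()` for `sjmisc::frq()` and `summarytools::freq()` so that it runs without extra packages, and the `supervision` variable is hypothetical with made-up responses.

```r
# Hypothetical nominal variable (made-up responses, one missing value).
supervision <- c("low", "high", "medium", "high", "high", "low", NA)

# table() counts each category; useNA = "ifany" also counts missing values.
counts <- table(supervision, useNA = "ifany")
counts

# Percentages based on valid (non-missing) cases only.
valid_counts <- table(supervision)
round(100 * prop.table(valid_counts), 1)

# For a nominal variable, the mode is the appropriate measure of central tendency.
modal_category <- names(valid_counts)[which.max(valid_counts)]
modal_category
```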

- Go to your CRIM5305_L folder, which should contain the R Markdown file you created for Assignment 4 (named **YEAR-MO-DY_LastName_CRIM5305_Assign04**). Click to open the R Markdown file.
  - Remember, we open RStudio in this way so the `here` package will automatically set our CRIM5305_L folder as the top-level directory.
- In RStudio, open a new R Markdown document. If you do not recall how to do this, refer to Assignment 1.

- The dialogue box asks for a **Title**, an **Author**, and a **Default Output Format** for your new R Markdown file.
  - In the **Title** box, enter *CRIM5305 Assignment 5*.
  - In the **Author** box, enter your First and Last Name (e.g., *Caitlin Ducate*).
  - In the **Default Output Format** box, ensure “Word” is selected.
- Remember that the new R Markdown file contains a simple
pre-populated template to show users how to do basic tasks like add
settings, create text headings and text, insert R code chunks, and
create plots. Be sure to delete this text before you begin
working.

- Create a second-level heading titled: “Part 1 (Assignment 5.1)”
  - Remember, a second-level heading starts with two hashtags followed by a space and the heading title, like this: `## Heading Title`
- This assignment must be completed by the student and the student
alone. To confirm that this is your work, please begin all assignments
with this text:
  - This R Markdown document contains *my work* for Assignment 5. It is **my work** and *only* my work.

Goal: Reading in Data and Creating Frequency Table from Youth Data

We will begin by reading in the Youth dataset and creating a frequency table of the “parnt2” variable. The frequency table will allow us to answer the first question on pages 100-101.

- Create a second-level header titled: “Part 2 (Assignment
5.2).”

- Create a third-level header in your R Markdown (hereafter, “RMD”) file titled: “Load Libraries”
  - Insert an R chunk.
  - Inside the new R code chunk, load the following packages: `tidyverse`, `haven`, `here`, `sjmisc`, `sjPlot`, `summarytools`, and `gt`.
  - **Note:** A new package, “gt”, is listed above. Before loading it, remember that you must first install the package. Also, recall that you only need to install packages one time, but you must load them each time you start a new R session.

- After your first code chunk, create another third-level header in RMD titled: “Read Data into R”
  - Insert another R code chunk.
  - In the new R code chunk, read and assign the “Youth_0.sav” SPSS datafile into an R data object named `YouthData`.
    - Forget how to do this? Refer to instructions in Assignment 1.
  - In the same code chunk, on a new line below your read data/assign object command, type the name of your new R data object: `YouthData`.
    - Remember, this will call the object and provide a brief view of the data.
    - Also, remember that you can get a similar but more visually appealing view by simply clicking on the object in the “Environment” window.
    - Your RStudio session should now look a lot like this:

`YouthData <- read_spss(here("Datasets", "Youth_0.sav"))`

`YouthData`