## The purpose of this fourth assignment is to help you use R to complete some of the SPSS Exercises from the end of Chapter 4 in Bachman, Paternoster, & Wilson’s Statistics for Criminology & Criminal Justice, 5th Ed.

This chapter focused on measures of central tendency (e.g., mean, median, and mode) and their advantages and disadvantages as single statistical descriptions of a data distribution. Likewise, in this assignment, you will learn how to use R to calculate measures of central tendency and other statistics (e.g., skewness; kurtosis) that help us standardize and efficiently describe the shape of a data distribution. You will also get additional practice with creating frequency tables and simple graphs in R. As with previous assignments, you will be using R Markdown (with R & RStudio) to complete and submit your work.

- know how to use the base R `head()` function to quickly view a snapshot of your data
- know how to use the `glimpse()` function to quickly view all columns (variables) in your data
- be able to improve some knitted tables by piping a function’s results to `gt()` (e.g., `head(data) %>% gt()`)
- be able to calculate measures of central tendency for a variable distribution using base R functions `mean()`, `median()`, and `mode()`
- know how to calculate central tendency and other basic descriptive statistics for specific variables in a dataset using `summarytools::descr()` and `psych::describe()` functions
- have more practice with using the `$` operator to reference or call a named element from a list or data object, such as a specific variable in a data file (e.g., `mean(data$variable)`)
- have more practice with creating frequency tables using `sjmisc::frq()` and `summarytools::freq()`
- have more practice with generating simple histograms using `ggplot()`
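
As a quick preview of the central tendency objectives, here is a minimal base R sketch. The `scores` vector is made up for illustration and is not taken from the Youth data. One caveat worth knowing: in base R, `mode()` reports how an object is stored (e.g., "numeric") rather than the most frequent value, so this sketch finds the modal value from a frequency table instead; the assignment itself may handle this step differently.

```r
# Made-up scores for illustration only (not from the Youth data)
scores <- c(2, 3, 3, 4, 5, 5, 5, 7, 9)

mean(scores)    # arithmetic mean: sum of scores divided by number of scores
median(scores)  # middle value of the sorted scores

mode(scores)    # storage mode of the vector ("numeric"), not the most frequent value

# Most frequent value(s), found from a frequency table
counts <- table(scores)
modal_value <- as.numeric(names(counts)[counts == max(counts)])
modal_value
```

If two or more values tie for the highest count, `modal_value` will contain all of them, which is one way to spot a bimodal distribution.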

## We are building on objectives from Assignments 1-3. By the start of this assignment, you should already know how to:

- create an R Markdown (RMD) file and add/modify text, level headers, and R code chunks within it
- install/load R packages and use hashtags (“#”) to comment out sections of R code so it does not run
- recognize when a function is being called from a specific package using a double colon with the `package::function()` format
- read in an SPSS data file in an R code chunk using `haven::read_spss()` and assign it to an R object using an assignment (`<-`) operator
- use the `$` symbol to call a specific element (e.g., a variable, row, or column) within an object (e.g., dataframe or tibble), such as with the format `dataobject$varname`
- use a tidyverse `%>%` pipe operator to perform a sequence of actions
- knit your RMD document into an HTML file that you can then save and submit for course credit
- use `here()` for a simple and reproducible self-referential file directory method
- use `groundhog.library()` as an optional but recommended reproducible alternative to `library()` for loading packages
- use `sjPlot::view_df()` to quickly browse variables in a data file
- use `attr()` to identify variable and attribute value labels
- recognize when missing values are coded as `NA` for variables in your data file
- select and recode variables using dplyr’s `select()`, `mutate()`, and `if_else()` functions
- use `summarytools::dfSummary()` to quickly describe one or more variables in a data file
- create frequency tables with `sjmisc::frq()` and `summarytools::freq()` functions
- sort frequency distributions (lowest to highest/highest to lowest) with `summarytools::freq()`
- create basic graphs using ggplot2’s `ggplot()` function

*If you do not recall how to do these things, first review
Assignments 1-3.*
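
As a brief refresher on three of these skills (the `$` operator, the `package::function()` format, and `NA` values), here is a small base R sketch. The `toy` data frame and its values are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data frame; names and values are invented for illustration
toy <- data.frame(
  id  = 1:5,
  age = c(15, 14, NA, 16, 15)
)

toy$age          # pull one column out of the data object by name
is.na(toy$age)   # TRUE marks the value coded as missing

# package::function() format, here with packages that ship with R
base::mean(toy$age)                    # returns NA because one value is missing
base::mean(toy$age, na.rm = TRUE)      # mean of the four observed ages
stats::median(toy$age, na.rm = TRUE)   # median of the four observed ages
```

The `na.rm = TRUE` argument tells the function to drop missing values before calculating; without it, a single `NA` makes the result `NA`.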

Additionally, you should have read the assigned book chapter and reviewed the SPSS questions that correspond to this assignment, and you should have completed any other course materials (e.g., videos; readings) assigned for this week before attempting this R assignment. In particular, for this week, I assume you understand:

- measures of central tendency: mean, median, and mode
- how to recognize a unimodal or a bimodal distribution and to identify the mode of a distribution
- how to calculate the median position from raw data or from grouped data
- how to calculate the arithmetic mean from raw data, grouped data, or a frequency distribution
- how skewness affects measures of central tendency
- comparative advantages and disadvantages of the mean and the median
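
To make a few of these ideas concrete, the sketch below works through a hypothetical frequency distribution in base R. The values in `x` and `f` are invented for illustration, and the median position uses the common textbook formula (n + 1) / 2, which I am assuming matches the version in your book.

```r
# Hypothetical frequency distribution (invented for illustration):
# x = score values, f = number of cases with each score
x <- c(0, 1, 2, 3, 10)
f <- c(4, 3, 2, 1, 1)

n <- sum(f)                        # total number of cases
mean_from_freq <- sum(x * f) / n   # sum of (score x frequency), divided by n

# Expand the table back into raw scores to check against built-in functions
raw <- rep(x, times = f)

median_position <- (n + 1) / 2                # position of the middle case
median_value <- sort(raw)[median_position]    # score found at that position

mean_from_freq   # mean calculated from the frequency distribution
mean(raw)        # same mean calculated from the raw scores
median_value     # median found by position
median(raw)      # same median from the built-in function

# Effect of the one extreme score: drop it and recalculate
mean(raw[raw < 10])
median(raw[raw < 10])
```

In this made-up example, the single extreme score of 10 pulls the mean above the median, which is the pattern you would expect in a positively skewed distribution. Dropping that one score moves the mean much more than it moves the median.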

As noted previously, for this and all future assignments, you MUST
type all commands in by hand. *Do not copy & paste except for
troubleshooting purposes (i.e., if you cannot figure out what you
mistyped).*

- Early on, you may have a lot of trouble getting your code to run due
to minor typos. This is normal.

- Remember, you are learning to read and write a new (coding) language. As with learning any new language, we learn from practice and from correcting our mistakes.

## Goal: Create a new RMD file for Assignment 4

*(**Note:** Remember that, when following instructions, always substitute your own last name for “LastName” and the actual date for YEAR_MO_DY. E.g., 2022_06_01_Fordham_K300Assign4)*

In the last assignment, you learned how to use `sjmisc::frq()` and `summarytools::freq()` functions to generate frequency tables for variables. You also learned about the `summarytools::dfSummary()` function for quickly summarizing all or a subset of the variables in a data object. Lastly, you learned how to select and recode variables using dplyr’s `select()`, `mutate()`, and `if_else()` functions as well as how to display data in graphs using `ggplot()`. In this assignment, you will decide which measure of central tendency is most appropriate for a given variable, then use frequency tables and R functions to calculate measures of central tendency and other univariate descriptive statistics.

- Go to your K300_L folder, which should contain the R Markdown file you created for Assignment 3 (named **YEAR_MO_DY_LastName_K300Assign3**). Click to open the R Markdown file.
  - Remember, we open RStudio in this way so the `here` package will automatically set our K300_L folder as the top-level directory.
- In RStudio, open a new R Markdown document. If you do not recall how
to do this, refer to Assignment 1.

- The dialogue box asks for a **Title**, an **Author**, and a **Default Output Format** for your new R Markdown file.
  - In the **Title** box, enter *K300 Assignment 4*.
  - In the **Author** box, enter your First and Last Name (e.g., *Tyeisha Fordham*).
  - Under **Default Output Format**, ensure “HTML” is selected (HTML is usually the default selection).
- Remember that the new R Markdown file contains a simple
pre-populated template to show users how to do basic tasks like add
settings, create text headings and text, insert R code chunks, and
create plots. Be sure to delete this text before you begin
working.

- This assignment must be completed by the student and the student alone. To confirm that this is your work, please begin all assignments with this text:
  - This R Markdown document contains *my work* for Assignment 4. It is **my work** and *only* my work.

## Goal: Reading in Data and Creating Frequency Table from Youth Data

We will begin by reading in the Youth dataset and creating a frequency table of the “parnt2” variable. The frequency table will allow us to answer the first question on pages 100-101.

- Create a second-level header titled: “Part 1 (Assignment
4.1).”

- Create a third-level header in your R Markdown (hereafter, “RMD”) file titled: “Load Libraries”
- Insert an R chunk.
- Inside the new R code chunk, load the following packages: `tidyverse`, `haven`, `here`, `sjmisc`, `sjPlot`, `summarytools`, and `gt`.
  - **Note:** A new package, “gt”, is listed above. Before loading it, remember that you must first install the package. Also, recall that you only need to install packages one time, but you must load them each time you start a new R session. Finally, remember that you can optionally use (and we recommend) `groundhog.library()` to improve the reproducibility of your script.

- After your first code chunk, create another third-level header in
RMD titled: “Read Data into R”
- Insert another R code chunk.
- In the new R code chunk, read and assign the “Youth_0.sav” SPSS datafile into an R data object named `YouthData`.
  - Forget how to do this? Refer to instructions in Assignment 1.

- In the same code chunk, on a new line below your read data/assign object command, type the name of your new R data object: `YouthData`.
  - Remember, this will call the object and provide a brief view of the data.
  - Also, remember that you can get a similar but more visually appealing view by simply clicking on the object in the “Environment” window.
  - Your RStudio session should now look a lot like this:

```
YouthData <- read_spss(here("Datasets", "Youth_0.sav"))
YouthData
```

```
## # A tibble: 1,272 × 23
## Gender v2 v21 v22 v63 v77 v79 v109 v119 parnt2
## <dbl+lbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl>
## 1 1 [male] 15 36 15 3 [usual… 1 [alw… 3 [som… 5 [a v… 1 [hur… 5
## 2 0 [female] 15 3 5 4 [alway… 1 [alw… 2 [usu… 4 [a b… 1 [hur… 8
## 3 1 [male] 15 20 6 4 [alway… 1 [alw… 1 [alw… 4 [a b… 2 [hur… 8
## 4 0 [female] 15 2 4 2 [somet… 1 [alw… 2 [usu… 5 [a v… 3 [hur… 4
## 5 0 [female] 14 12 2 4 [alway… 3 [som… 5 [nev… 5 [a v… 3 [hur… 7
## 6 1 [male] 15 1 3 4 [alway… 1 [alw… 1 [alw… 5 [a v… 3 [hur… 7
## 7 0 [female] 16 10 3 4 [alway… 3 [som… 2 [usu… 4 [a b… 1 [hur… 8
## 8 0 [female] 15 25 10 2 [somet… 3 [som… 5 [nev… 4 [a b… 3 [hur… 5
## 9 0 [female] 15 6 10 4 [alway… 2 [usu… 3 [som… 5 [a v… 1 [hur… 8
## 10 1 [male] 15 15 8 4 [alway… 3 [som… 3 [som… 4 [a b… 2 [hur… 7
## # … with 1,262 more rows, and 13 more variables: fropinon <dbl>,
## # frbehave <dbl>, certain <dbl>, moral <dbl>, delinquency <dbl>,
## # d1 <dbl+lbl>, hoursstudy <dbl>, `filter_$` <dbl>, heavytvwatcher <dbl+lbl>,
## # studyhard <dbl+lbl>, supervision <dbl+lbl>, drinkingnotbad <dbl+lbl>,
## # Lowcertain_bin <dbl>
```
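
Before building the full frequency table, it can help to see what such a table contains. The base R sketch below counts only the ten “parnt2” values visible in the printout above, so its counts and percentages describe that preview and not the full 1,272-row dataset. The object name `parnt2_preview` is my own, and I am assuming the same calls would work on `YouthData$parnt2` once your data are loaded.

```r
# The ten parnt2 values shown in the first rows of the printout above
# (a preview only; the full dataset has 1,272 rows)
parnt2_preview <- c(5, 8, 8, 4, 7, 7, 8, 5, 8, 7)

freq <- table(parnt2_preview)               # count of cases at each value
pct  <- round(100 * prop.table(freq), 1)    # percent of cases at each value

# Combine the pieces into one small frequency table
freq_table <- data.frame(
  value       = as.numeric(names(freq)),
  n           = as.vector(freq),
  percent     = as.vector(pct),
  cum_percent = cumsum(as.vector(pct))
)
freq_table
```

Functions such as `sjmisc::frq()` and `summarytools::freq()` report similar columns with nicer formatting, so a quick count like this is mainly useful as a cross-check.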