4  Data Quality Assessment

library(tidyverse)
library(easystats)
library(haven)
library(here)
library(modelsummary)
library(sjlabelled)
library(gt)
library(htmltools)
library(janitor)

4.0.1 Introduction

In our contract with Dr. Kotlaja, we mentioned performing data quality checks and, specifically, duplication analyses. Analyzing duplicate survey responses has been put forward as a potential tool for assessing data quality and identifying possible fraud. Although we have some reservations about the utility of near-duplication analyses for identifying fraudulent interviewer behavior (see Day, Brauer, and Kotlaja, 2023), we do think they are a useful way to learn about one’s data. And, given the unique features of the data collection process (e.g., using community members to administer the surveys), such analyses may serve as an entrée to examining interviewer effects on data quality.

Below we will walk through two approaches to analyzing duplication in survey data. First, we’ll check for exact duplicates. Second, we’ll examine near-duplication using the method proposed by Kuriakose and Robbins (2016).
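As a rough illustration of the two checks on toy data (not the survey data), the sketch below flags exact duplicates with base R's `duplicated()` and then computes, for each respondent, the largest share of items on which they match any other respondent, which is the quantity at the center of Kuriakose and Robbins' approach. The data frame `toy` and the helper `max_pct_match()` are hypothetical names introduced here, and the toy data contain no missing values, so the handling of missing responses is left aside.

```r
# Toy data: 4 respondents, 5 items (hypothetical values)
toy <- data.frame(
  q1 = c(1, 1, 2, 1),
  q2 = c(3, 3, 1, 3),
  q3 = c(2, 2, 4, 2),
  q4 = c(5, 5, 1, 4),
  q5 = c(1, 1, 3, 2)
)

# 1. Exact duplicates: flag every row that has an identical twin
exact_dup <- duplicated(toy) | duplicated(toy, fromLast = TRUE)

# 2. Near duplicates: for each row, the highest share of items on which
#    it matches any other row (assumes no missing values)
max_pct_match <- function(dat) {
  m <- as.matrix(dat)
  vapply(seq_len(nrow(m)), function(i) {
    others <- m[-i, , drop = FALSE]
    shares <- apply(others, 1, function(r) mean(r == m[i, ]))
    max(shares)
  }, numeric(1))
}

pct <- max_pct_match(toy)
```

In this toy example, respondents 1 and 2 are exact duplicates, while respondent 4 matches each of them on three of the five items (60%).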

In doing this, we’ll use the combined version of the community survey data we created in the previous data cleaning overview. However, we’ll trim the data so that it includes only substantive variables of interest and excludes administrative variables and, in this case, the variables that asked respondents to identify their neighborhood boundaries. We also removed basic demographic variables (e.g., sex, age, race/ethnicity), as these are often used in sampling designs and are thus less apt to be erroneously duplicated (see footnote 9 on p. 286 of Kuriakose and Robbins’ article). We also removed most of the variables where respondents entered text (often identified with a “TEXT” suffix in the variable name)1. Most of these were missing and unlikely to be directly duplicated. We also removed the variables (Q31_10 - Q31_18) related to neighborhood monitoring platforms (e.g., Next Door), as these included a lot of missing data and Q30 already asked a “yes/no” question about whether the respondent used such platforms. Finally, we converted the employment variables (respondents’ employment status and other household adult’s employment status) to single variables rather than the 10 indicator variables across which each potential employment status was recorded.
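The last step, collapsing a set of indicator columns into one categorical variable, can be sketched on toy data as follows. This is a minimal base R illustration, not the project's recode: the column names `emp_1` to `emp_3`, the helper `collapse_indicators()`, and the 1/`NA` coding of the indicators are all assumptions made for the example. It also shows why the coding matters: a comparison with `NA` returns `NA`, so a chain of `ifelse()` calls can erase an earlier match when a later indicator is missing.

```r
# Hypothetical indicators: 1 = status selected, NA = not selected
toy <- data.frame(
  emp_1 = c(1, NA, NA),
  emp_2 = c(NA, 1, NA),
  emp_3 = c(NA, NA, NA)
)

# NA-safe collapse: record the position of each selected indicator.
# If several are selected, the last one wins.
collapse_indicators <- function(dat) {
  out <- rep(NA_real_, nrow(dat))
  for (j in seq_along(dat)) {
    hit <- !is.na(dat[[j]]) & dat[[j]] == 1
    out[hit] <- j
  }
  out
}
safe <- collapse_indicators(toy)

# Chained ifelse() on the same data: each NA test overwrites the result
chained <- ifelse(toy$emp_1 == 1, 1, NA)
chained <- ifelse(toy$emp_2 == 1, 2, chained)
chained <- ifelse(toy$emp_3 == 1, 3, chained)
```

With this coding, `safe` keeps the first two respondents' statuses, whereas `chained` ends up entirely missing. If the indicators were instead coded 0/1 with no missing values, the two approaches would agree.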

# RUN THIS MANUALLY IN R CONSOLE BEFORE RENDERING:
# Identifies necessary packages for FULL project & install as needed 
# (This code is not executed during rendering to avoid hanging)

# Check and install renv if needed
if (!requireNamespace("renv", quietly = TRUE)) {
  cat("renv package not found. Installing...\n")
  install.packages("renv")
}

# Scan for dependencies
cat("Scanning project files for package dependencies...\n")
library(renv)
deps <- renv::dependencies()
unique_packages <- unique(deps$Package)

# Filter out base packages
base_packages <- c("base", "datasets", "graphics", "grDevices", "methods", 
                   "stats", "utils", "tools", "parallel", "compiler", "grid",
                   "splines", "tcltk", "stats4")
needed_packages <- setdiff(unique_packages, base_packages)

# Check what's actually missing (not already installed)
installed_packages <- installed.packages()[,"Package"]
missing_packages <- needed_packages[!needed_packages %in% installed_packages]

cat("Total dependencies found:", length(unique_packages), "\n")
cat("After removing base packages:", length(needed_packages), "\n")
cat("Already installed:", length(needed_packages) - length(missing_packages), "\n")
cat("Missing packages:", length(missing_packages), "\n")

if (length(missing_packages) > 0) {
  cat("Installing missing packages:", paste(missing_packages, collapse = ", "), "\n")
  install.packages(missing_packages)
  cat("Installation complete!\n")
} else {
  cat("All required packages are already installed!\n")
}
kc_combsurv_clean <- readRDS(here("Data", "kc_combsurv_clean.rds"))

kc_combsurv_noad <- kc_combsurv_clean %>%
  select(-c(survey, Q66, INTERVIEWER, NBHD, Q115, Q0_Id, Q0_Name, Q0_Size, Q0_Type,
            Q1, Q1_1_TEXT, Q2, Q116_Id, Q116_Name, Q116_Size, Q116_Type, 
            Q31_21, SurveyDate, Duration__in_seconds_, Finished, Q122,
            Q41, Q41_5_TEXT, Q42_1:Q42_7_TEXT, Q43, Q31_10:Q31_18_TEXT,
            Q51_10_TEXT, Q52.2_10_TEXT, Q53_4_TEXT,
            Q14, Q15, Q27, Q28, Q49, Q50, Q55)) %>%
  # Collapse the 10 employment indicators into single variables.
  # case_when() treats an NA condition as FALSE, so a missing indicator
  # cannot overwrite an earlier match. Conditions run from 10 down to 1
  # so that the highest-numbered selected status wins, as in a chained recode.
  mutate(employment = case_when(Q51_10 == 1 ~ 10,
                                Q51_9 == 1 ~ 9,
                                Q51_8 == 1 ~ 8,
                                Q51_7 == 1 ~ 7,
                                Q51_6 == 1 ~ 6,
                                Q51_5 == 1 ~ 5,
                                Q51_4 == 1 ~ 4,
                                Q51_3 == 1 ~ 3,
                                Q51_2 == 1 ~ 2,
                                Q51_1 == 1 ~ 1),
         othadult_employ = case_when(Q52.2_10 == 1 ~ 10,
                                     Q52.2_9 == 1 ~ 9,
                                     Q52.2_8 == 1 ~ 8,
                                     Q52.2_7 == 1 ~ 7,
                                     Q52.2_6 == 1 ~ 6,
                                     Q52.2_5 == 1 ~ 5,
                                     Q52.2_4 == 1 ~ 4,
                                     Q52.2_3 == 1 ~ 3,
                                     Q52.2_2 == 1 ~ 2,
                                     Q52.2_1 == 1 ~ 1)) %>%
  select(-c(Q51_1:Q51_10, Q52.2_1:Q52.2_10))

Kuriakose and Robbins (2016) also recommend removing any variable with 10% or more missing values and any observation with 25% or more missing values, as missing values can inflate near-duplication detection. The changes described above largely take care of this for the variables. Because this is simply an initial check, we will forgo removing other variables that may have 10% or more missing values as well as observations with 25% or more missing values. In this way, we are being conservative in our approach and allowing for a potentially higher rate of near duplication than actually exists. If we find potentially problematic near duplication, we can always go back to the more stringent standards outlined by Kuriakose and Robbins (2016).
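For reference, a minimal sketch of how the stricter screening could be applied, shown on toy data with base R. The object names (`toy`, `drop_vars`, `drop_obs`, `trimmed`) are hypothetical, and the 10% and 25% cutoffs follow the recommendation as summarized above.

```r
# Toy data: 8 respondents, 4 items (hypothetical values)
toy <- data.frame(
  q1 = c(1, 2, 3, 4, 5, 1, 2, 3),
  q2 = c(NA, 2, 3, 4, 5, 1, 2, 3),
  q3 = c(NA, NA, NA, 4, 5, 1, 2, 3),
  q4 = c(5, 4, 3, 2, 1, 5, 4, 3)
)

# Share of missing values per variable and per observation
var_missing <- colMeans(is.na(toy))
obs_missing <- rowMeans(is.na(toy))

# Variables with 10% or more missing; observations with 25% or more missing
drop_vars <- names(var_missing)[var_missing >= 0.10]
drop_obs  <- unname(which(obs_missing >= 0.25))

# Keep only the rows and columns that pass both screens
keep_rows <- setdiff(seq_len(nrow(toy)), drop_obs)
trimmed   <- toy[keep_rows, !(names(toy) %in% drop_vars), drop = FALSE]
```

Here `q2` and `q3` exceed the variable cutoff and the first three respondents exceed the observation cutoff, leaving five rows and two columns. Both shares are computed on the full toy data before anything is dropped; whether to screen variables first and then recompute the observation shares is a separate choice this sketch does not settle.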

Ultimately, the trimmed data set has 205 variables, 71 fewer than the raw version of the combined data.

Before moving on, let’s take a look at the basic descriptive and distributional features of the data to make sure there is nothing else we think should be removed prior to looking for duplicates.

library(kableExtra)

kc_combsurv_noad %>%
  haven::zap_labels() %>%
  select(-ID) %>%
  datasummary_skim(output = "kableExtra") %>%
  kable_styling(
    bootstrap_options = c("striped", "hover"),
    fixed_thead = TRUE,
    full_width = TRUE,
    html_font = "Arial",
    font_size = 12,
    position = "left",
    latex_options = "scale_down"
  ) %>%
  scroll_box(width = "100%", height = "500px")
Variable Unique Missing Pct. Mean SD Min Median Max
On the whole, do you like or dislike this neighborhood as a place to live? 5 3 1.7 0.7 1.0 2.0 4.0
Suppose that for some reason, you HAD to move away from this neighborhood. Would you miss the neighborhood very much, somewhat, not much, or not at all? 5 3 2.0 1.0 1.0 2.0 4.0
Q2.3 6 2 3.0 1.5 1.0 3.0 5.0
How frequently do you (or other household members) attend a church, mosque, or any other religious organization? 5 3 2.5 1.2 1.0 3.0 4.0
Is this in your neighborhood? 3 34 1.6 0.5 1.0 2.0 2.0
How frequently do you (or any household members) attend a block party, tenant/neighborhood association, or community council (i.e., Community Health Ambassadors, United NH Initiative, Block Captain, HOA)? 5 4 3.1 1.0 1.0 3.0 4.0
Is this in your neighborhood? 3 50 1.2 0.4 1.0 1.0 2.0
How often do you (or any household members) attend a business or civic group such as Chamber events, Masons, Elks, or Rotary Club? 5 4 3.6 0.9 1.0 4.0 4.0
Is this in your neighborhood? 3 77 1.4 0.5 1.0 1.0 2.0
How often do you (or any household members) attend an ethnic or nationality club (such as a Latino, Asian or Afro-American club) (sorority/fraternity, 100 Black Men, D9)? 5 8 3.6 0.9 1.0 4.0 4.0
Is this in your neighborhood? 3 80 1.6 0.5 1.0 2.0 2.0
How often do you (or any household members) attend a local political organizations (i.e., City Town Hall meetings, Election Events)? 5 3 3.3 1.0 1.0 4.0 4.0
Is this in your neighborhood? 3 62 1.4 0.5 1.0 1.0 2.0
Sometimes people in a neighborhood do things to take care of a local problem, or to make the neighborhood a better place to live. Please tell me (if you have/ or any member of your household has) been involved in the following activities since you lived in this neighborhood (nbhd): - Have you (or any member of your household) spoken with a local politician or an elected local official like your council district member about a neighborhood problem or neighborhood improvement? 3 1 1.7 0.5 1.0 2.0 2.0
Sometimes people in a neighborhood do things to take care of a local problem, or to make the neighborhood a better place to live. 3 1 1.7 0.5 1.0 2.0 2.0
Sometimes people in a neighborhood do things to take care of a local problem, or to make the neighborhood a better place to live. Please tell me (if you have/ or any member of your household has) been involved in the following activities since you lived in this neighborhood (nbhd): - Have you (or any member of your household) attended a meeting of a block or neighborhood group about a neighborhood problem or neighborhood improvement? 3 1 1.7 0.5 1.0 2.0 2.0
Sometimes people in a neighborhood do things to take care of a local problem, or to make the neighborhood a better place to live. Please tell me (if you have/ or any member of your household has) been involved in the following activities since you lived in this neighborhood (nbhd): - Have you (or any member of your household) talked to a local religious leader or minister to help with a neighborhood problem or with neighborhood improvement? 3 1 1.8 0.4 1.0 2.0 2.0
Sometimes people in a neighborhood do things to take care of a local problem, or to make the neighborhood a better place to live. Please tell me (if you have/ or any member of your household has) been involved in the following activities since you lived in this neighborhood (nbhd): - Have you (or any member of your household) gotten together with neighbors to do something about a nbhd problem or to organize nbhd improvement? 3 1 1.7 0.5 1.0 2.0 2.0
All things considered, what do you think this neighborhood will be like a few years from now? Will it be a better place to live, stay the same, or will it be worse? 4 4 1.7 0.7 1.0 2.0 3.0
How do you think your neighborhood compares with most other neighborhoods in the city? Is It better, the same, or worse? 4 5 1.6 0.7 1.0 1.0 3.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Safe neighborhood (i.e.., perception of safety, neighborhood watch/block captain) 6 2 2.5 1.1 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Sense of community (i.e., neighborly interactions, community events) 6 3 2.5 1.1 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Schools 6 23 2.8 1.3 1.0 3.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Access to Public Transportation 6 8 2.4 1.2 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Housing 6 3 2.4 1.1 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Health Care 6 6 2.5 1.1 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Parks and Green Spaces 6 2 2.4 1.1 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Walkability and Accessibility (i.e., ease of walking to shops, restaurants, and other amenities) 6 3 2.6 1.3 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Cleanliness and City Maintenance (trash collection, street cleaning, cleanliness of public space) 6 2 2.7 1.2 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Cultural Diversity (presence of cultural or ethnic communities) 6 3 2.4 1.0 1.0 2.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Public Infrastructure (i.e., quality sidewalks, street lighting, road conditions) 6 2 3.0 1.3 1.0 3.0 5.0
People like different things about their neighborhood. How do you feel about these different aspects of your neighborhood? - Local businesses (i.e., grocery stores, cafes, small businesses) 6 3 2.5 1.2 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - If there is a problem around here, the neighbors get together to deal with it. 6 5 2.8 1.3 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - This is a close-knit neighborhood. 6 3 2.7 1.3 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - When you get right down to it, no one in this neighborhood cares much about what happens to me. 6 4 3.4 1.4 1.0 4.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - There are adults in this neighborhood that children can look up to. 6 7 2.2 1.2 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - People around here are willing to help their neighbors. 6 2 2.0 1.1 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - People in this neighborhood generally don’t get along with each other. 6 4 3.9 1.2 1.0 4.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - You can count on adults in this neighborhood to watch out for children and ensure they are safe and don't get in trouble. 6 6 2.2 1.2 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - If I had to borrow $60 in an emergency, I could borrow it from a neighbor. 6 6 2.9 1.5 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - When I'm away from home, I know that my neighbors will keep their eyes open for possible trouble to my place. 6 2 1.9 1.2 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - In this neighborhood people mostly go their own way. 6 3 2.5 1.3 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - People in this neighborhood do not share the same values 6 8 3.2 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - If I were sick I could count on my neighbors to shop for groceries for me. 6 5 2.8 1.5 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - People in this neighborhood can be trusted. 6 3 2.3 1.2 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Children around here have no place to play but the street. 6 6 3.2 1.5 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Adults in this neighborhood know who the local children are. 6 9 2.7 1.3 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The equipment and buildings in the park or playground that is closest to where I live are well kept. 6 11 2.3 1.3 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The park or playground closest to where I live is safe during the day. 6 10 2.1 1.2 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The park or playground closest to where I live is safe at night. 6 12 3.4 1.4 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Parents in this neighborhood generally know each other. 6 12 2.5 1.3 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THE CHARACTERISTICS OF YOUR NEIGHBORHOOD AND THE PEOPLE IN IT. (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Parents in this neighborhood know who their children’s friends are 6 19 2.6 1.2 1.0 2.0 5.0
NOW I AM GOING TO ASK ABOUT SOME THINGS YOU MIGHT DO WITH PEOPLE IN YOUR NEIGHBORHOOD. (1. Often, 2. Sometimes, 3. Rarely, 4. Never, 5. Don’t Know, 6. Refused) - Do favors for each other? such as watching each other's children, helping with shopping, lending garden or house tools, and other small acts of kindness 5 9 2.2 1.0 1.0 2.0 4.0
NOW I AM GOING TO ASK ABOUT SOME THINGS YOU MIGHT DO WITH PEOPLE IN YOUR NEIGHBORHOOD. (1. Often, 2. Sometimes, 3. Rarely, 4. Never, 5. Don’t Know, 6. Refused) - When a neighbor is not at home, watch over their property? 5 5 1.8 1.0 1.0 2.0 4.0
NOW I AM GOING TO ASK ABOUT SOME THINGS YOU MIGHT DO WITH PEOPLE IN YOUR NEIGHBORHOOD. (1. Often, 2. Sometimes, 3. Rarely, 4. Never, 5. Don’t Know, 6. Refused) - Ask each other advice about personal things such as child rearing or job openings? 5 11 2.6 1.1 1.0 3.0 4.0
NOW I AM GOING TO ASK ABOUT SOME THINGS YOU MIGHT DO WITH PEOPLE IN YOUR NEIGHBORHOOD. (1. Often, 2. Sometimes, 3. Rarely, 4. Never, 5. Don’t Know, 6. Refused) - Have parties or other get-togethers where other people in the neighborhood are invited? 5 7 2.6 1.0 1.0 3.0 4.0
NOW I AM GOING TO ASK ABOUT SOME THINGS YOU MIGHT DO WITH PEOPLE IN YOUR NEIGHBORHOOD. (1. Often, 2. Sometimes, 3. Rarely, 4. Never, 5. Don’t Know, 6. Refused) - Talk in each other's homes or on the street? 5 6 1.9 1.0 1.0 2.0 4.0
The next questions are about the families who live in the neighborhood. - About how many families in this neighborhood know each other? 6 13 3.1 0.9 1.0 3.0 5.0
The next questions are about the families who live in the neighborhood. - About how many people in this neighborhood would you say are religious or attend church regularly? 6 34 2.7 0.9 1.0 3.0 5.0
The next questions are about the families who live in the neighborhood. - How many people in this neighborhood would you say make part or all of their income from side hustles or alternative means? 6 38 2.6 1.0 1.0 3.0 5.0
The next questions are about the families who live in the neighborhood. - About how many adults in this neighborhood would you say make some or part of their income with a regular full or part-time job? 6 20 3.6 0.9 1.0 4.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THINGS THAT PEOPLE IN YOUR NEIGHBORHOOD MAY OR MAY NOT DO. (1. Definitely, 2. Very likely, 3. Likely, 4. Unlikely, 5. Very unlikely, 6. Definitely not, 9. NR) - If a group of neighborhood children were skipping school and hanging out on a street corner, how likely is it that your neighbors would do something about it? 7 11 3.2 1.5 1.0 3.0 6.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THINGS THAT PEOPLE IN YOUR NEIGHBORHOOD MAY OR MAY NOT DO. (1. Definitely, 2. Very likely, 3. Likely, 4. Unlikely, 5. Very unlikely, 6. Definitely not, 9. NR) - If some children were spray-painting graffiti on a local building, how likely is it that your neighbors would do something about it? 7 7 2.5 1.4 1.0 2.0 6.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THINGS THAT PEOPLE IN YOUR NEIGHBORHOOD MAY OR MAY NOT DO. (1. Definitely, 2. Very likely, 3. Likely, 4. Unlikely, 5. Very unlikely, 6. Definitely not, 9. NR) - If a child was showing disrespect to an adult, how likely is It that people in your neighborhood would scold that child? 7 11 3.4 1.5 1.0 3.0 6.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THINGS THAT PEOPLE IN YOUR NEIGHBORHOOD MAY OR MAY NOT DO. (1. Definitely, 2. Very likely, 3. Likely, 4. Unlikely, 5. Very unlikely, 6. Definitely not, 9. NR) - If a well-known neighbor was short of cash to start a business in the area, how likely is it that he or she would be able to borrow money from people in this neighborhood? 7 17 4.3 1.4 1.0 4.0 6.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THINGS THAT PEOPLE IN YOUR NEIGHBORHOOD MAY OR MAY NOT DO. (1. Definitely, 2. Very likely, 3. Likely, 4. Unlikely, 5. Very unlikely, 6. Definitely not, 9. NR) - If there was a fight in front of your house and someone was being beaten or threatened, how likely is it that your neighbors would break it up or call the police? 7 4 2.1 1.3 1.0 2.0 6.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT THINGS THAT PEOPLE IN YOUR NEIGHBORHOOD MAY OR MAY NOT DO. (1. Definitely, 2. Very likely, 3. Likely, 4. Unlikely, 5. Very unlikely, 6. Definitely not, 9. NR) - Suppose that because of budget cuts the fire station closest to your home was going to be closed down by the city. How likely is it that neighborhood residents would organize to try to do something to keep the fire station open? 7 12 2.8 1.6 1.0 3.0 6.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem is litter, broken glass or trash on the sidewalks and streets? 6 3 2.4 1.2 1.0 2.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem is graffiti on buildings and walls? 6 4 1.8 1.1 1.0 1.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem are vacant or deserted houses or storefronts? 6 4 2.2 1.2 1.0 2.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem is drinking in public? 6 5 2.1 1.3 1.0 1.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem are people selling or using drugs? 6 6 2.3 1.4 1.0 2.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem are groups of teenagers or adults hanging out in the neighborhood and causing trouble? 6 5 1.9 1.2 1.0 1.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem are different social groups who do not get along with each other? 6 8 1.8 1.3 1.0 1.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem is police not patrolling the area or responding to calls from the area? 6 5 2.4 1.4 1.0 2.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem is excessive use of force by police? 6 10 2.1 1.5 1.0 1.0 5.0
I'm going to read a list of things that are problems in some neighborhoods. For each, please tell me how much of a problem it is in your neighborhood. (1. Not at all a problem, 2. A minor problem 3. A moderate problem, 4. A major problem, 5. Don’t Know, 6. Not applicable/NR) - How much of a problem is lack of trust between local businesses and residents? 6 9 2.0 1.4 1.0 1.0 5.0
Q13 3 5 1.6 0.5 1.0 2.0 2.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Laws were made to be broken. 6 4 4.2 1.2 1.0 5.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - It's okay to do anything you want as long as you don't hurt anyone. 6 4 4.0 1.3 1.0 4.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - To make money, there are no right and wrong ways anymore, only easy ways and hard ways. 6 6 4.1 1.3 1.0 5.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Fighting between friends and family is nobody else's business 6 4 3.1 1.4 1.0 3.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Fighting among rival gangs/groups should be ignored by the police 6 4 4.7 0.7 1.0 5.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Nowadays a person has to live pretty much for today and let tomorrow take care of itself. 6 4 3.5 1.5 1.0 4.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Many people in this neighborhood are afraid to go out at night. 6 6 3.4 1.5 1.0 4.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - There are areas of this neighborhood where everyone knows trouble is expected. 6 5 3.1 1.6 1.0 3.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - You’re taking a big chance if you walk in this neighborhood alone after dark. 6 4 3.4 1.5 1.0 4.0 5.0
I am going to read you some statements people sometimes make. For each, please tell me whether you strongly agree, agree, disagree, or strongly disagree with each.(1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Most people in this neighborhood are moral. 6 7 2.1 1.1 1.0 2.0 5.0
I am going to describe some events that may or may not have happened in this neighborhood. Which of the following responses most closely describes how often each of these events have occurred in your neighborhood during the past six months? - How often was there a fight in this neighborhood in which a weapon was used? 5 17 1.6 0.8 1.0 1.0 4.0
I am going to describe some events that may or may not have happened in this neighborhood. Which of the following responses most closely describes how often each of these events have occurred in your neighborhood during the past six months? - How often was there a violent argument between neighbors? 5 16 1.5 0.8 1.0 1.0 4.0
I am going to describe some events that may or may not have happened in this neighborhood. Which of the following responses most closely describes how often each of these events have occurred in your neighborhood during the past six months? - Gang or group fights? 5 18 1.3 0.7 1.0 1.0 4.0
I am going to describe some events that may or may not have happened in this neighborhood. Which of the following responses most closely describes how often each of these events have occurred in your neighborhood during the past six months? - A sexual assault or rape? 5 23 1.2 0.5 1.0 1.0 4.0
I am going to describe some events that may or may not have happened in this neighborhood. Which of the following responses most closely describes how often each of these events have occurred in your neighborhood during the past six months? - A robbery or mugging? 5 17 1.6 0.9 1.0 1.0 4.0
I am going to describe some events that may or may not have happened in this neighborhood. Which of the following responses most closely describes how often each of these events have occurred in your neighborhood during the past six months? - Sideshow and Street Racing? 5 16 1.8 1.0 1.0 1.0 4.0
While you have lived in this neighborhood, has anyone ever used violence, such as in a mugging, fight, or sexual assault, against you in your neighborhood? 3 6 1.9 0.3 1.0 2.0 2.0
Was it in the past six months? 3 89 1.4 0.5 1.0 1.0 2.0
While you have lived in this neighborhood, has your home ever been broken into? 3 5 1.8 0.4 1.0 2.0 2.0
Was it in the past six months? 3 83 1.9 0.3 1.0 2.0 2.0
While you have lived in this neighborhood, have you had anything stolen from your yard, porch, garage, or elsewhere outside your home (but on your property) 3 5 1.6 0.5 1.0 2.0 2.0
Was it in the past six months? 3 61 1.6 0.5 1.0 2.0 2.0
Have you had property damaged, including damage to vehicles parked in the street, to the outside of your home, or to other personal property? 3 5 1.7 0.5 1.0 2.0 2.0
Was it in the past six months? 3 71 1.6 0.5 1.0 2.0 2.0
Now I would like to find out what you think about policing in this neighborhood (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - Police in this neighborhood are responsive to local issues. 6 9 2.5 1.3 1.0 2.0 5.0
Now I would like to find out what you think about policing in this neighborhood (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The police are doing a good job in dealing with problems that really concern people in this neighborhood 6 10 2.7 1.3 1.0 3.0 5.0
Now I would like to find out what you think about policing in this neighborhood (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The police have positive interactions with people in this neighborhood. 6 11 2.5 1.2 1.0 2.0 5.0
Now I would like to find out what you think about policing in this neighborhood (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The police are fair in their treatment of people in this neighborhood. 6 13 2.4 1.2 1.0 2.0 5.0
Now I would like to find out what you think about policing in this neighborhood (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The police are not doing a good job in preventing crime in this neighborhood 6 10 3.1 1.4 1.0 3.0 5.0
Now I would like to find out what you think about policing in this neighborhood (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The police do a good job in responding to people in this neighborhood after they have been victims of crime. 6 15 2.7 1.3 1.0 3.0 5.0
Now I would like to find out what you think about policing in this neighborhood (1. Strongly agree, 2. Somewhat agree, 3. Neither agree nor disagree, 4. Somewhat disagree, 5. Strongly disagree) - The police are not able to maintain order on the streets and sidewalks in this neighborhood. 6 10 3.4 1.4 1.0 4.0 5.0
I am going to read you a list of services for children and adolescents. For each, please tell me if it is offered in your neighborhood. - Is there a youth center for children or adolescents in your neighborhood? 4 3 2.1 0.8 1.0 2.0 3.0
I am going to read you a list of services for children and adolescents. For each, please tell me if it is offered in your neighborhood. - Are there recreation programs other than those offered in school (are these offered in your neighborhood)? 4 2 2.1 0.9 1.0 2.0 3.0
I am going to read you a list of services for children and adolescents. For each, please tell me if it is offered in your neighborhood. - Do the neighborhood schools offer any after-school programs-academic and/or recreational? 4 2 1.8 0.9 1.0 1.0 3.0
I am going to read you a list of services for children and adolescents. For each, please tell me if it is offered in your neighborhood. - Are mentoring services offered, like a Big Brothers or Big Sisters program? 4 2 2.2 0.9 1.0 3.0 3.0
I am going to read you a list of services for children and adolescents. For each, please tell me if it is offered in your neighborhood. - Are mental health services offered for children and adolescents in your neighborhood? 4 2 2.2 0.9 1.0 3.0 3.0
I am going to read you a list of services for children and adolescents. For each, please tell me if it is offered in your neighborhood. - Are there any crisis intervention services offered to children and adolescents in your neighborhood? 4 2 2.3 0.9 1.0 3.0 3.0
To what degree do you believe each of the following is a cause of crime? - Lack of mental health services 4 9 2.5 0.7 1.0 3.0 3.0
To what degree do you believe each of the following is a cause of crime? - Lack of family structure 4 9 2.6 0.6 1.0 3.0 3.0
To what degree do you believe each of the following is a cause of crime? - Not enough support for police 4 12 2.1 0.8 1.0 2.0 3.0
To what degree do you believe each of the following is a cause of crime? - Lack of homeless shelters. 4 8 2.5 0.7 1.0 3.0 3.0
To what degree do you believe each of the following is a cause of crime? - Easy access to guns. 4 11 2.7 0.6 1.0 3.0 3.0
To what degree do you believe each of the following is a cause of crime? - Not enough officers. 4 13 2.3 0.8 1.0 2.0 3.0
To what degree do you believe each of the following is a cause of crime? - Lack of jail space. 4 23 2.0 0.9 1.0 2.0 3.0
To what degree do you believe each of the following is a cause of crime? - Lenient judges/prosecutors 4 24 2.1 0.8 1.0 2.0 3.0
What are the chances that crime negatively impacts the following? - Property values 6 7 1.7 1.0 1.0 1.0 5.0
What are the chances that crime negatively impacts the following? - Service industry success (i.e. hotel, restaurant) 6 10 2.1 1.2 1.0 2.0 5.0
What are the chances that crime negatively impacts the following? - New Businesses coming to Kansas City 6 8 2.2 1.3 1.0 2.0 5.0
What are the chances that crime negatively impacts the following? - Tourism 6 10 2.4 1.3 1.0 2.0 5.0
What are the chances that crime negatively impacts the following? - Resident exodus 6 13 2.2 1.3 1.0 2.0 5.0
What are the chances that crime negatively impacts the following? - Career opportunities 6 10 2.5 1.3 1.0 2.0 5.0
What are the chances that crime negatively impacts the following? - Business investment 6 11 2.2 1.2 1.0 2.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More funding for mental health services 6 5 1.7 1.0 1.0 1.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - Promoting new businesses in Kansas City 6 6 2.4 1.3 1.0 2.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More tourism 6 9 2.8 1.4 1.0 3.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More Affordable housing options 6 6 1.7 1.0 1.0 1.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More career opportunities 6 5 1.7 1.0 1.0 1.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More business investment 6 8 2.1 1.2 1.0 2.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More funding for police 6 8 2.3 1.4 1.0 2.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - Tougher Sentencing 6 11 2.5 1.4 1.0 2.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More family interventions 6 8 1.9 1.0 1.0 2.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More supervised activities for youth 6 6 1.6 0.9 1.0 1.0 5.0
What are the chances that each of the following initiatives would be effective solutions to crime reduction? - More gun legislation (e.g., ban assault weapons and high-capacity magazines, universal background checks on all gun sales). 6 7 1.9 1.4 1.0 1.0 5.0
Now I would like to find out what you think about police in general. - I feel a moral obligation to obey the police 6 7 1.9 1.2 1.0 2.0 5.0
Now I would like to find out what you think about police in general. - The police can be trusted to make decisions that are right for my community. 6 6 2.8 1.3 1.0 3.0 5.0
Now I would like to find out what you think about police in general. - I feel a moral duty to support the decisions of police officers, even when I disagree with them 6 7 2.9 1.5 1.0 3.0 5.0
Now I would like to find out what you think about police in general. - I feel a moral duty to obey the instructions of police officers, even if I do not understand the reasons behind them 6 7 2.4 1.3 1.0 2.0 5.0
Now I would like to find out what you think about police in general. - People like me have no choice but to obey the police 6 8 2.7 1.4 1.0 2.0 5.0
Now I would like to find out what you think about police in general. - If you do not do what the police tell you they will treat you badly 6 7 2.1 1.3 1.0 2.0 5.0
Now I would like to find out what you think about police in general. - I only obey the police because I am afraid of them 6 8 3.2 1.5 1.0 3.0 5.0
How likely is it that you would…. - Call the police to report a crime that you witnessed. 5 8 1.7 0.9 1.0 1.0 4.0
How likely is it that you would…. - Help the police to find someone suspected of a crime by providing information 5 8 1.9 1.0 1.0 2.0 4.0
How likely is it that you would…. - Report dangerous or suspicious activities to the police 5 8 1.7 0.8 1.0 2.0 4.0
How likely is it that you would…. - Provide the police with information that helps them solve a crime 5 8 1.7 0.9 1.0 1.0 4.0
How likely is it that you would…. - Take an inexpensive item from a store without paying for it 5 7 3.7 0.7 1.0 4.0 4.0
How likely is it that you would…. - Take prescription medication that was not prescribed to you 5 7 3.7 0.7 1.0 4.0 4.0
How likely is it that you would…. - Consume alcohol in a place where you weren’t supposed to 5 7 3.5 0.8 1.0 4.0 4.0
How likely is it that you would…. - Park a car in a place that were not supposed to 5 8 3.3 0.9 1.0 4.0 4.0
How likely is it that you would…. - Exceed the speed limit while driving a motor vehicle 5 10 2.7 1.1 1.0 3.0 4.0
Q26 3 3 1.6 0.5 1.0 2.0 2.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Overall quality of life in the city. 6 5 2.4 1.0 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Feeling of safety in your neighborhood. 6 5 2.4 1.2 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Physical appearance of your neighborhood. 6 5 2.6 1.3 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Overall image of the city 6 5 2.6 1.1 1.0 2.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Quality of services provided by KCMO 6 5 2.9 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Value received for city tax dollars and fees 6 8 3.4 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Overall feeling of safety in the city 6 5 3.0 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Overall quality of education system 6 14 3.3 1.3 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Quality of police services 6 8 3.1 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Quality of neighborhood services 6 11 2.9 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Relationship neighborhood & police between 6 11 2.9 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Effectiveness of local police protection 6 8 3.0 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Enforcement of local traffic laws 6 8 3.0 1.2 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Parking enforcement services 6 12 2.8 1.1 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - How quickly police respond to emergencies 6 13 3.2 1.3 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - Responsiveness of police dept to resident concerns. 6 12 3.1 1.3 1.0 3.0 5.0
NOW I AM GOING TO READ SOME STATEMENTS ABOUT YOUR PERCEPTIONS OF KCMO. - The city’s overall efforts to prevent crime 6 8 3.4 1.2 1.0 3.0 5.0
Do you use technology (e.g., neighborhood apps, social media groups) to stay informed about crime or disorder in your neighborhood? 3 3 1.4 0.5 1.0 1.0 2.0
Have you ever reported suspicious activity or a crime in your neighborhood using technology (e.g., apps, social media)? 3 45 1.7 0.5 1.0 2.0 2.0
Do you feel that technology-based platforms (e.g., apps, social media) have made your neighborhood safer? 6 45 2.5 1.1 1.0 2.0 5.0
Have you noticed an increase in neighborhood disorder (e.g., vandalism, theft) through updates or alerts from these technology platforms? 5 45 2.5 0.8 1.0 3.0 4.0
Do you believe that increased use of technology in your neighborhood (e.g., security cameras, apps) has contributed to a decrease in crime or disorder? 6 45 2.6 1.1 1.0 3.0 5.0
The city is prepared to host the 2026 FIFA World Cup. 6 10 3.0 1.4 1.0 3.0 5.0
Hosting the 2026 FIFA World Cup will improve safety and security in the city. 6 9 3.0 1.4 1.0 3.0 5.0
Did you attend the Chiefs parade on February 14, 2024? 3 5 1.7 0.4 1.0 2.0 2.0
If the Chiefs win the Superbowl again, will you attend the Chief’s parade in 2025? 3 7 1.7 0.5 1.0 2.0 2.0
Your Education Level: 10 4 5.7 2.2 1.0 6.0 9.0
Mother's Education Level: 10 13 4.6 2.2 1.0 4.0 9.0
Father's Education Level: 10 21 4.7 2.4 1.0 4.0 9.0
Family and Household Relationship Status: 9 8 3.0 2.1 1.0 2.0 8.0
Are you a parent, step-parent, or guardian? 3 3 1.4 0.5 1.0 1.0 2.0
Employment of Other Adults in the Household: 3 16 1.5 0.5 1.0 1.0 2.0
Home Ownership: - Selected Choice 5 7 1.5 0.6 1.0 1.0 4.0
Number of times moved in the past 5 years: 10 8 1.9 1.5 1.0 1.0 12.0
Imagine the most religious person you know. How religious would you say they are? 5 11 3.6 0.8 1.0 4.0 4.0
Compared to that person, how religious would you say that you are? 6 11 2.6 1.2 1.0 2.0 5.0
Imagine the most spiritual person you know. How spiritual would you say they are? 5 14 3.5 0.8 1.0 4.0 4.0
Compared to that person, how spiritual would you say that you are? 6 15 3.0 1.2 1.0 3.0 5.0
Are any firearms now kept in or around your home? 5 3 1.8 0.7 1.0 2.0 4.0
Are any of these firearms now loaded? 5 71 1.7 0.7 1.0 2.0 4.0
Are any of these loaded firearms also unlocked? 5 72 1.9 0.7 1.0 2.0 4.0
Which of these income brackets is closest to the total household income in your home? 18 5 10.3 4.8 1.0 11.0 17.0
How would you evaluate your current economic status relative to others at this time? - My economic status is... 6 12 3.0 1.1 1.0 3.0 5.0
employment 2 98 10.0 0.0 10.0 10.0 10.0
othadult_employ 2 87 10.0 0.0 10.0 10.0 10.0

Some variables clearly have a relatively large proportion of missing observations. Because shared missing values count as matches in the checks below, this should artificially inflate the likelihood of duplication. But, as we said above, we’re comfortable with this more conservative approach for our first pass at the data.
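To make the point about missingness concrete, here is a minimal sketch using made-up responses (not the survey data). It assumes that two missing answers on the same item are counted as a match, which is how we treat missing values in the checks below; the vectors `resp_a` and `resp_b` are purely hypothetical.

```r
# Toy illustration (hypothetical data, not the survey itself):
# two respondents who skipped most items and disagree on the rest.
resp_a <- c(1, 2, NA, NA, NA, NA, NA, NA, NA, NA)
resp_b <- c(2, 1, NA, NA, NA, NA, NA, NA, NA, NA)

# Proportion match when shared missingness counts as agreement
both_na  <- is.na(resp_a) & is.na(resp_b)
same_val <- !is.na(resp_a) & !is.na(resp_b) & resp_a == resp_b
match_with_na <- mean(both_na | same_val)

# Proportion match among only the items both respondents answered
answered <- !is.na(resp_a) & !is.na(resp_b)
match_without_na <- mean(resp_a[answered] == resp_b[answered])

match_with_na
match_without_na
```

In this toy case the two respondents disagree on every item they actually answered, yet they still match on most items once shared missing values are counted.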

4.0.2 Exact Duplication

A basic check most reputable survey research firms would perform is to look for exact duplicates in their responses. This is easily done in R with the get_dupes() function from the “janitor” package.

Show code
kc_exactdupe <- kc_combsurv_noad %>%
  get_dupes(-c(ID))

There were two “exact duplicates” found in our cleaned and trimmed data set. One useful feature of the get_dupes() function is that it (re)creates an ID variable that tells you which IDs from the original data set are exact duplicates. In this case, it was IDs 54 and 55. Let’s print those two observations from the original combined data to try to see what’s going on:

Show code
kc_combsurv_clean %>%
  filter(ID == 54 | ID == 55) %>%
  zap_label() %>%
  gt() 
ID survey Q66 INTERVIEWER NBHD Q115 Q0_Id Q0_Name Q0_Size Q0_Type Q1 Q1_1_TEXT Q2 Q116_Id Q116_Name Q116_Size Q116_Type Q2.1 Q2.2 Q2.3 Q3 Q3.1 Q3.2 Q3.3 Q3.4 Q3.5 Q3.6 Q3.7 Q3.8 Q3.9 Q4_1 Q4_2 Q4_3 Q4_4 Q4_5 Q5 Q6 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q7_8 Q7_9 Q7_10 Q7_11 Q7_12 Q8_1 Q8_2 Q8_3 Q8_4 Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q8_10 Q8_11 Q8_12 Q8_13 Q8_14 Q8_15 Q8_16 Q8_17 Q8_18 Q8_19 Q8_20 Q9_1 Q9_2 Q9_3 Q9_4 Q9_5 Q10_1 Q10_2 Q10_3 Q10_4 Q11_1 Q11_2 Q11_3 Q11_4 Q11_5 Q11_6 Q12_1 Q12_2 Q12_3 Q12_4 Q12_5 Q12_6 Q12_7 Q12_8 Q12_9 Q12_10 Q13 Q14 Q15 Q16_1 Q16_2 Q16_3 Q16_4 Q16_5 Q16_6 Q16_7 Q16_8 Q16_9 Q16_10 Q17_1 Q17_2 Q17_3 Q17_4 Q17_5 Q17_6 Q18 Q18.1 Q18.2 Q18.3 Q18.4 Q18.5 Q18.6 Q18.7 Q19_1 Q19_2 Q19_3 Q19_4 Q19_5 Q19_6 Q19_7 Q20_1 Q20_2 Q20_3 Q20_4 Q20_5 Q20_6 Q21_1 Q21_2 Q21_3 Q21_4 Q21_5 Q21_6 Q21_7 Q21_8 Q22_1 Q22_2 Q22_3 Q22_4 Q22_5 Q22_6 Q22_7 Q23_1 Q23_2 Q23_3 Q23_4 Q23_5 Q23_6 Q23_7 Q23_8 Q23_9 Q23_10 Q23_11 Q24_1 Q24_2 Q24_3 Q24_4 Q24_5 Q24_6 Q24_7 Q25_1 Q25_2 Q25_3 Q25_4 Q25_5 Q25_6 Q25_7 Q25_8 Q25_9 Q26 Q27 Q28 Q29_1 Q29_2 Q29_3 Q29_4 Q29_5 Q29_6 Q29_7 Q29_8 Q29_9 Q29_10 Q29_11 Q29_12 Q29_13 Q29_14 Q29_15 Q29_16 Q29_17 Q30 Q31_10 Q31_11 Q31_12 Q31_13 Q31_14 Q31_15 Q31_16 Q31_17 Q31_18 Q31_18_TEXT Q32 Q33 Q34 Q35 Q36 Q37 Q38 Q39 Q40 Q41 Q41_5_TEXT Q42_1 Q42_2 Q42_3 Q42_4 Q42_5 Q42_6 Q42_7 Q42_6_TEXT Q42_7_TEXT Q43 Q44 Q45 Q46 Q47 Q48 Q49 Q50 Q51_1 Q51_2 Q51_3 Q51_4 Q51_5 Q51_6 Q51_7 Q51_8 Q51_9 Q51_10 Q51_10_TEXT Q52 Q52.2_1 Q52.2_2 Q52.2_3 Q52.2_4 Q52.2_5 Q52.2_6 Q52.2_7 Q52.2_8 Q52.2_9 Q52.2_10 Q52.2_10_TEXT Q53 Q53_4_TEXT Q54 Q55 Q56 Q57 Q58 Q59 Q60 Q61 Q62 Q63 Q64_1 Q31_21 SurveyDate Duration__in_seconds_ Finished Q122
54 Main - English 1 14 29 F_117ofyoZMHZF3IW image.jpg 1473134 image/jpeg 1 Wornell Homestead Agrees with city NA NA NA NA 1 1 2 1 4 NA 1 2 2 1 2 2 2 2 2 1 1 2 2 NA 1 3 2 2 2 2 2 2 2 2 2 5 1 1 5 1 1 1 3 5 1 1 5 2 1 1 4 2 2 1 1 2 2 1 3 NA NA 4 2 2 3 NA 1 NA 1 1 1 1 1 1 1 1 2 1 2 5 4 5 3 5 5 5 5 5 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 1 1 1 5 1 5 1 1 1 1 1 1 2 3 2 2 3 3 3 1 2 4 3 4 4 3 4 3 3 2 1 1 1 1 3 2 1 1 1 1 1 2 2 4 5 1 1 1 1 4 4 4 3 4 2 1 1 1 1 1 1 2 NA 1 NA 1 1 2 1 2 2 2 2 NA NA NA NA NA NA NA NA NA NA NA NA NA 2 1 2 2 1945 2 1 NA NA NA NA NA NA 2 7 3 NA 5 2 English NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 1 23 years 4 3 4 3 2 NA NA 10 2 NA NA NA NA NA
55 Main - English 1 14 29 W 59th Street to the North, Wornall Rd to West, E 63rd St to the South, and East barrier is slightly past Brookside Blvd NA 1 Wornall Homestead "Agrees with city" NA NA NA NA 1 1 2 1 4 NA 1 2 2 1 2 2 2 2 2 1 1 2 2 NA 1 3 2 2 2 2 2 2 2 2 2 5 1 1 5 1 1 1 3 5 1 1 5 2 1 1 4 2 2 1 1 2 2 1 3 NA NA 4 2 2 3 NA 1 NA 1 1 1 1 1 1 1 1 2 1 2 5 4 5 3 5 5 5 5 5 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 1 1 1 5 1 5 1 1 1 1 1 1 2 3 2 2 3 3 3 1 2 4 3 4 4 3 4 3 3 2 1 1 1 1 3 2 1 1 1 1 1 2 2 4 5 1 1 1 1 4 4 4 3 4 2 1 1 1 1 1 1 2 NA 1 NA 1 1 2 1 2 2 2 2 NA NA NA NA NA NA NA NA NA NA NA NA NA 2 1 2 2 1945 2 1 NA NA NA NA NA NA 2 7 3 NA 5 2 English NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 1 23 yrs 4 3 4 3 2 NA NA 10 2 NA NA NA NA NA

Notice that only one of these two observations has a neighborhood image. A plausible explanation is that a single observation was unintentionally duplicated when the neighborhood image was added. We’re not sure exactly how this happened, but we’ll ask Dr. Kotlaja to look into it. We’ll also make note of the particular interviewer (#14) and see if there are any issues with “near duplication” in the next duplication check. But, before we move on, let’s make sure we didn’t induce this duplication in our initial data cleaning. To check this, we are simply going to pull the raw version of the main community survey and print the data for these two IDs.

Show code
read_sav(here("Data", "KC Community Survey_In-Person.sav")) %>%
  filter(ID == 54 | ID == 55) %>%
  select(1:20) %>%
  zap_label() %>%
  gt()
ID Q66 INTERVIEWER NBHD Q115 Q0_Id Q0_Name Q0_Size Q0_Type Q1 Q1_1_TEXT Q2 Q116_Id Q116_Name Q116_Size Q116_Type Q2.1 Q2.2 Q2.3 Q3
54 1 14 29 F_117ofyoZMHZF3IW image.jpg 1473134 image/jpeg 1 Wornell Homestead Agrees with city NA NA NA NA 1
55 1 14 29 W 59th Street to the North, Wornall Rd to West, E 63rd St to the South, and East barrier is slightly past Brookside Blvd NA 1 Wornall Homestead "Agrees with city" NA NA NA NA 1

Above, we only printed the first 20 columns so we can confirm that these are indeed the same duplicated observations as we identified above in our cleaned data.
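For readers who want to see the logic of an exact-duplicate check without the survey data, here is a minimal sketch on a hypothetical toy data frame. It uses base R’s duplicated() rather than get_dupes(), so it is an illustration of the idea and not the code we ran above; the object names (`toy`, `answers`, `toy_dupes`) are made up for the example.

```r
# Hypothetical toy data: rows 2 and 3 share every substantive answer
# but carry different IDs.
toy <- data.frame(
  ID = 1:4,
  q1 = c(1, 2, 2, 3),
  q2 = c(5, 4, 4, NA),
  q3 = c(NA, 1, 1, 2)
)

# Ignore ID when comparing rows, then flag every member of a duplicated set
# (duplicated() alone would flag only the second and later copies).
answers   <- toy[, setdiff(names(toy), "ID")]
is_dupe   <- duplicated(answers) | duplicated(answers, fromLast = TRUE)
toy_dupes <- toy[is_dupe, ]

toy_dupes$ID
```

As in our check above, the ID column is left out of the comparison, since otherwise no two rows could ever be identical.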

4.0.3 Near Duplication

A more nuanced method for detecting potentially problematic responses is to analyze all pairwise comparisons and identify, for each observation, the maximum percent of questions on which it matches any other observation in the data. Kuriakose and Robbins (2016) proposed this near duplication method and flagged a maximum match of 85% or more as potentially fraudulent and worthy of further investigation.
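The calculation just described can be sketched in a few lines of base R. This is a simplified illustration on hypothetical toy data, not the percentmatch() function we use below; the helper names `agree()` and `max_pct_match()` are made up, and the sketch assumes that two missing answers on the same item count as a match.

```r
# Proportion of items on which two response vectors agree
# (assumption: a shared NA counts as agreement).
agree <- function(x, y) {
  both_na <- is.na(x) & is.na(y)
  same    <- !is.na(x) & !is.na(y) & x == y
  mean(both_na | same)
}

# For each row, find the other row with the highest proportion match.
max_pct_match <- function(data, idvar = "ID") {
  ids   <- data[[idvar]]
  items <- as.matrix(data[, setdiff(names(data), idvar), drop = FALSE])
  n     <- nrow(items)
  out   <- data.frame(id = ids, pm = NA_real_, id_m = ids)
  for (i in seq_len(n)) {
    others <- setdiff(seq_len(n), i)
    props  <- vapply(others, function(j) agree(items[i, ], items[j, ]), numeric(1))
    best   <- which.max(props)  # first maximum if there are ties
    out$pm[i]   <- props[best]
    out$id_m[i] <- ids[others[best]]
  }
  out
}

# Hypothetical toy data with four respondents and four items
toy <- data.frame(
  ID = 1:4,
  q1 = c(1, 1, 2, 3),
  q2 = c(2, 2, 2, 1),
  q3 = c(NA, NA, 3, 3),
  q4 = c(4, 5, 5, 1)
)

toy_match <- max_pct_match(toy)
toy_match
```

Note that the maximum match is not necessarily reciprocal: in the toy data, respondent 3’s closest match is respondent 2, but respondent 2’s closest match is respondent 1.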

In a blog post (Day et al., 2023) we created a function to perform the maximum percent match calculation. We’ll use that function here.

Show code
source("Day_percentmatch-function.R")

kc_combsurv_pctmatch <- percentmatch(kc_combsurv_noad, idvar = "ID", include_na = TRUE, timing = FALSE) %>%
  rename(ID = id,
         pct_match = pm,
         match_id = id_m) %>%
  left_join(kc_combsurv_noad) %>%
  relocate(pct_match, match_id, .after = last_col())

One of the first things we can do with this information is to simply calculate descriptive statistics and plot the distribution of the maximum percent/proportion match metric.

Show code
library(patchwork)
library(tinytable)

kc_pctmatch_skim <- kc_combsurv_pctmatch %>%
  select(pct_match) %>%
  datasummary_skim()
kc_pctmatch_skim 
Unique Missing Pct. Mean SD Min Median Max Histogram
pct_match 81 0 0.5 0.1 0.3 0.5 1.0
Show code
kc_pctmatch_hist <- ggplot(data = kc_combsurv_pctmatch, aes(x = pct_match)) + 
  geom_histogram(aes(y = after_stat(density)), 
                 color = "black", fill = "#CA2430", binwidth = 0.02)  +
  geom_vline(aes(xintercept = .85), linetype = "dashed") +
  scale_x_continuous(limits = c(0.2, 1), breaks = seq(.2, 1, .1)) + 
  labs(subtitle = "Kansas City Community Violence Survey",
       y = "Density", x = "Maximum Proportion Match") + 
  theme_minimal() +
  theme(panel.grid = element_blank())
kc_pctmatch_hist

A couple of things jump out from the above plot and descriptive statistics. First, the bulk of the distribution is roughly symmetric, with most of the maximum proportion match values clustered around the median (and mean) of 0.5. Second, the distribution also has a right tail, with multiple outlying cases at the higher end of maximum proportion match. Specifically, multiple observations have maximum proportion match values above the 0.85 cutoff that Kuriakose and Robbins (2016) identified as “likely fraud”. This means that each of these observations matches another observation on at least 174 items on the survey. Of course, we know from the exact duplicates analysis above that at least two of the observations are identical matches.
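As a quick check on the arithmetic behind that item count, here is a small sketch. The value of `n_items` is an assumption (204 compared items, based on counting the non-ID columns printed for the trimmed data later in this section); with the data loaded, one would compute it directly rather than hard-code it.

```r
# Assumption: 204 items are compared (all non-ID columns in the trimmed data).
# With the data loaded, this would be something like ncol(kc_combsurv_noad) - 1.
n_items <- 204
cutoff  <- 0.85

# Smallest whole number of matching items that reaches the cutoff
min_matches <- ceiling(cutoff * n_items)
min_matches
```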

We can identify these individuals with a relatively simple function.

Show code
### Function to identify maximum proportion match above a particular cutpoint
percentmatch_cut <- function(data, idvar = ID , pmatchvar = pct_match, idmatchvar = match_id, 
                             cut = .85, decimal = 2, tabdecimal = 3,
                             tabtitle = '**Observations with "Near Duplicates"**'){
  
  df <- data %>%
    arrange(desc({{ pmatchvar }})) %>%
    filter(round({{ pmatchvar }}, digits = decimal) >= cut)

  tab <- df %>%
        select({{ idvar }}, {{ pmatchvar }}, {{ idmatchvar }}) %>%
    gt() %>%
    cols_label(
      {{ pmatchvar }} := "Prop. Match", 
      {{ idmatchvar }} := "Matching ID") %>%
    fmt_number(
      columns = {{ pmatchvar }},
      decimals = tabdecimal) %>%
    tab_header(
      title = md(tabtitle))
  
  pmatchcut_results <- lst(df, tab)
  return(pmatchcut_results) 
}
Show code
kc_pctmatch_cut <- percentmatch_cut(data = kc_combsurv_pctmatch)
kc_pctmatch_cut$tab
Observations with “Near Duplicates”
ID Prop. Match Matching ID
54 1.000 55
55 1.000 54
272 0.970 274
274 0.970 272
1 0.901 274
247 0.882 274
3 0.852 271
271 0.852 3

As the table above shows, we have 8 observations with maximum proportion match values above the 0.85 cutoff. Essentially, these form five pairs with relatively large maximum match values. One pair consists of the exact duplicates we identified above (ID #54 & #55). Of the remaining four pairs, three are relatively close matches between a web-based survey and the main community survey (ID #1 & ID #274; ID #247 & ID #274; ID #3 & ID #271), and only one is a close match between two web-based surveys (ID #272 & ID #274). Also note that ID #274 serves as the maximum match for three other observations. To start getting a sense of what’s causing these large maximum match values, let’s simply print the data for each of these observation pairs.

Show code
kc_combsurv_noad %>%
  filter(ID == 54 | ID == 55) %>%
  zap_label() %>%
  gt()
ID Q2.1 Q2.2 Q2.3 Q3 Q3.1 Q3.2 Q3.3 Q3.4 Q3.5 Q3.6 Q3.7 Q3.8 Q3.9 Q4_1 Q4_2 Q4_3 Q4_4 Q4_5 Q5 Q6 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q7_8 Q7_9 Q7_10 Q7_11 Q7_12 Q8_1 Q8_2 Q8_3 Q8_4 Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q8_10 Q8_11 Q8_12 Q8_13 Q8_14 Q8_15 Q8_16 Q8_17 Q8_18 Q8_19 Q8_20 Q9_1 Q9_2 Q9_3 Q9_4 Q9_5 Q10_1 Q10_2 Q10_3 Q10_4 Q11_1 Q11_2 Q11_3 Q11_4 Q11_5 Q11_6 Q12_1 Q12_2 Q12_3 Q12_4 Q12_5 Q12_6 Q12_7 Q12_8 Q12_9 Q12_10 Q13 Q16_1 Q16_2 Q16_3 Q16_4 Q16_5 Q16_6 Q16_7 Q16_8 Q16_9 Q16_10 Q17_1 Q17_2 Q17_3 Q17_4 Q17_5 Q17_6 Q18 Q18.1 Q18.2 Q18.3 Q18.4 Q18.5 Q18.6 Q18.7 Q19_1 Q19_2 Q19_3 Q19_4 Q19_5 Q19_6 Q19_7 Q20_1 Q20_2 Q20_3 Q20_4 Q20_5 Q20_6 Q21_1 Q21_2 Q21_3 Q21_4 Q21_5 Q21_6 Q21_7 Q21_8 Q22_1 Q22_2 Q22_3 Q22_4 Q22_5 Q22_6 Q22_7 Q23_1 Q23_2 Q23_3 Q23_4 Q23_5 Q23_6 Q23_7 Q23_8 Q23_9 Q23_10 Q23_11 Q24_1 Q24_2 Q24_3 Q24_4 Q24_5 Q24_6 Q24_7 Q25_1 Q25_2 Q25_3 Q25_4 Q25_5 Q25_6 Q25_7 Q25_8 Q25_9 Q26 Q29_1 Q29_2 Q29_3 Q29_4 Q29_5 Q29_6 Q29_7 Q29_8 Q29_9 Q29_10 Q29_11 Q29_12 Q29_13 Q29_14 Q29_15 Q29_16 Q29_17 Q30 Q32 Q33 Q34 Q35 Q36 Q37 Q38 Q39 Q40 Q44 Q45 Q46 Q47 Q48 Q52 Q53 Q54 Q56 Q57 Q58 Q59 Q60 Q61 Q62 Q63 Q64_1 employment othadult_employ
54 NA NA NA 1 1 2 1 4 NA 1 2 2 1 2 2 2 2 2 1 1 2 2 NA 1 3 2 2 2 2 2 2 2 2 2 5 1 1 5 1 1 1 3 5 1 1 5 2 1 1 4 2 2 1 1 2 2 1 3 NA NA 4 2 2 3 NA 1 NA 1 1 1 1 1 1 1 1 2 1 2 5 4 5 3 5 5 5 5 5 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 1 1 1 5 1 5 1 1 1 1 1 1 2 3 2 2 3 3 3 1 2 4 3 4 4 3 4 3 3 2 1 1 1 1 3 2 1 1 1 1 1 2 2 4 5 1 1 1 1 4 4 4 3 4 2 1 1 1 1 1 1 2 NA 1 NA 1 1 2 1 2 2 2 2 NA NA NA NA 2 1 2 2 1945 7 3 NA 5 2 NA 1 1 4 3 4 3 2 NA NA 10 2 NA NA
55 NA NA NA 1 1 2 1 4 NA 1 2 2 1 2 2 2 2 2 1 1 2 2 NA 1 3 2 2 2 2 2 2 2 2 2 5 1 1 5 1 1 1 3 5 1 1 5 2 1 1 4 2 2 1 1 2 2 1 3 NA NA 4 2 2 3 NA 1 NA 1 1 1 1 1 1 1 1 2 1 2 5 4 5 3 5 5 5 5 5 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 1 1 1 5 1 5 1 1 1 1 1 1 2 3 2 2 3 3 3 1 2 4 3 4 4 3 4 3 3 2 1 1 1 1 3 2 1 1 1 1 1 2 2 4 5 1 1 1 1 4 4 4 3 4 2 1 1 1 1 1 1 2 NA 1 NA 1 1 2 1 2 2 2 2 NA NA NA NA 2 1 2 2 1945 7 3 NA 5 2 NA 1 1 4 3 4 3 2 NA NA 10 2 NA NA
Show code
kc_combsurv_noad %>%
  filter(ID == 272 | ID == 274) %>%
  zap_label() %>%
  gt()
ID Q2.1 Q2.2 Q2.3 Q3 Q3.1 Q3.2 Q3.3 Q3.4 Q3.5 Q3.6 Q3.7 Q3.8 Q3.9 Q4_1 Q4_2 Q4_3 Q4_4 Q4_5 Q5 Q6 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q7_8 Q7_9 Q7_10 Q7_11 Q7_12 Q8_1 Q8_2 Q8_3 Q8_4 Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q8_10 Q8_11 Q8_12 Q8_13 Q8_14 Q8_15 Q8_16 Q8_17 Q8_18 Q8_19 Q8_20 Q9_1 Q9_2 Q9_3 Q9_4 Q9_5 Q10_1 Q10_2 Q10_3 Q10_4 Q11_1 Q11_2 Q11_3 Q11_4 Q11_5 Q11_6 Q12_1 Q12_2 Q12_3 Q12_4 Q12_5 Q12_6 Q12_7 Q12_8 Q12_9 Q12_10 Q13 Q16_1 Q16_2 Q16_3 Q16_4 Q16_5 Q16_6 Q16_7 Q16_8 Q16_9 Q16_10 Q17_1 Q17_2 Q17_3 Q17_4 Q17_5 Q17_6 Q18 Q18.1 Q18.2 Q18.3 Q18.4 Q18.5 Q18.6 Q18.7 Q19_1 Q19_2 Q19_3 Q19_4 Q19_5 Q19_6 Q19_7 Q20_1 Q20_2 Q20_3 Q20_4 Q20_5 Q20_6 Q21_1 Q21_2 Q21_3 Q21_4 Q21_5 Q21_6 Q21_7 Q21_8 Q22_1 Q22_2 Q22_3 Q22_4 Q22_5 Q22_6 Q22_7 Q23_1 Q23_2 Q23_3 Q23_4 Q23_5 Q23_6 Q23_7 Q23_8 Q23_9 Q23_10 Q23_11 Q24_1 Q24_2 Q24_3 Q24_4 Q24_5 Q24_6 Q24_7 Q25_1 Q25_2 Q25_3 Q25_4 Q25_5 Q25_6 Q25_7 Q25_8 Q25_9 Q26 Q29_1 Q29_2 Q29_3 Q29_4 Q29_5 Q29_6 Q29_7 Q29_8 Q29_9 Q29_10 Q29_11 Q29_12 Q29_13 Q29_14 Q29_15 Q29_16 Q29_17 Q30 Q32 Q33 Q34 Q35 Q36 Q37 Q38 Q39 Q40 Q44 Q45 Q46 Q47 Q48 Q52 Q53 Q54 Q56 Q57 Q58 Q59 Q60 Q61 Q62 Q63 Q64_1 employment othadult_employ
272 2 2 2 3 2 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
274 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Show code
kc_combsurv_noad %>%
  filter(ID == 1 | ID == 274) %>%
  zap_label() %>%
  gt()
ID Q2.1 Q2.2 Q2.3 Q3 Q3.1 Q3.2 Q3.3 Q3.4 Q3.5 Q3.6 Q3.7 Q3.8 Q3.9 Q4_1 Q4_2 Q4_3 Q4_4 Q4_5 Q5 Q6 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q7_8 Q7_9 Q7_10 Q7_11 Q7_12 Q8_1 Q8_2 Q8_3 Q8_4 Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q8_10 Q8_11 Q8_12 Q8_13 Q8_14 Q8_15 Q8_16 Q8_17 Q8_18 Q8_19 Q8_20 Q9_1 Q9_2 Q9_3 Q9_4 Q9_5 Q10_1 Q10_2 Q10_3 Q10_4 Q11_1 Q11_2 Q11_3 Q11_4 Q11_5 Q11_6 Q12_1 Q12_2 Q12_3 Q12_4 Q12_5 Q12_6 Q12_7 Q12_8 Q12_9 Q12_10 Q13 Q16_1 Q16_2 Q16_3 Q16_4 Q16_5 Q16_6 Q16_7 Q16_8 Q16_9 Q16_10 Q17_1 Q17_2 Q17_3 Q17_4 Q17_5 Q17_6 Q18 Q18.1 Q18.2 Q18.3 Q18.4 Q18.5 Q18.6 Q18.7 Q19_1 Q19_2 Q19_3 Q19_4 Q19_5 Q19_6 Q19_7 Q20_1 Q20_2 Q20_3 Q20_4 Q20_5 Q20_6 Q21_1 Q21_2 Q21_3 Q21_4 Q21_5 Q21_6 Q21_7 Q21_8 Q22_1 Q22_2 Q22_3 Q22_4 Q22_5 Q22_6 Q22_7 Q23_1 Q23_2 Q23_3 Q23_4 Q23_5 Q23_6 Q23_7 Q23_8 Q23_9 Q23_10 Q23_11 Q24_1 Q24_2 Q24_3 Q24_4 Q24_5 Q24_6 Q24_7 Q25_1 Q25_2 Q25_3 Q25_4 Q25_5 Q25_6 Q25_7 Q25_8 Q25_9 Q26 Q29_1 Q29_2 Q29_3 Q29_4 Q29_5 Q29_6 Q29_7 Q29_8 Q29_9 Q29_10 Q29_11 Q29_12 Q29_13 Q29_14 Q29_15 Q29_16 Q29_17 Q30 Q32 Q33 Q34 Q35 Q36 Q37 Q38 Q39 Q40 Q44 Q45 Q46 Q47 Q48 Q52 Q53 Q54 Q56 Q57 Q58 Q59 Q60 Q61 Q62 Q63 Q64_1 employment othadult_employ
1 3 4 3 NA NA NA NA 4 NA 4 NA 4 NA 2 1 1 1 1 1 1 2 1 1 2 2 2 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
274 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Show code
kc_combsurv_noad %>%
  filter(ID == 247 | ID == 274) %>%
  zap_label() %>%
  gt()
ID Q2.1 Q2.2 Q2.3 Q3 Q3.1 Q3.2 Q3.3 Q3.4 Q3.5 Q3.6 Q3.7 Q3.8 Q3.9 Q4_1 Q4_2 Q4_3 Q4_4 Q4_5 Q5 Q6 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q7_8 Q7_9 Q7_10 Q7_11 Q7_12 Q8_1 Q8_2 Q8_3 Q8_4 Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q8_10 Q8_11 Q8_12 Q8_13 Q8_14 Q8_15 Q8_16 Q8_17 Q8_18 Q8_19 Q8_20 Q9_1 Q9_2 Q9_3 Q9_4 Q9_5 Q10_1 Q10_2 Q10_3 Q10_4 Q11_1 Q11_2 Q11_3 Q11_4 Q11_5 Q11_6 Q12_1 Q12_2 Q12_3 Q12_4 Q12_5 Q12_6 Q12_7 Q12_8 Q12_9 Q12_10 Q13 Q16_1 Q16_2 Q16_3 Q16_4 Q16_5 Q16_6 Q16_7 Q16_8 Q16_9 Q16_10 Q17_1 Q17_2 Q17_3 Q17_4 Q17_5 Q17_6 Q18 Q18.1 Q18.2 Q18.3 Q18.4 Q18.5 Q18.6 Q18.7 Q19_1 Q19_2 Q19_3 Q19_4 Q19_5 Q19_6 Q19_7 Q20_1 Q20_2 Q20_3 Q20_4 Q20_5 Q20_6 Q21_1 Q21_2 Q21_3 Q21_4 Q21_5 Q21_6 Q21_7 Q21_8 Q22_1 Q22_2 Q22_3 Q22_4 Q22_5 Q22_6 Q22_7 Q23_1 Q23_2 Q23_3 Q23_4 Q23_5 Q23_6 Q23_7 Q23_8 Q23_9 Q23_10 Q23_11 Q24_1 Q24_2 Q24_3 Q24_4 Q24_5 Q24_6 Q24_7 Q25_1 Q25_2 Q25_3 Q25_4 Q25_5 Q25_6 Q25_7 Q25_8 Q25_9 Q26 Q29_1 Q29_2 Q29_3 Q29_4 Q29_5 Q29_6 Q29_7 Q29_8 Q29_9 Q29_10 Q29_11 Q29_12 Q29_13 Q29_14 Q29_15 Q29_16 Q29_17 Q30 Q32 Q33 Q34 Q35 Q36 Q37 Q38 Q39 Q40 Q44 Q45 Q46 Q47 Q48 Q52 Q53 Q54 Q56 Q57 Q58 Q59 Q60 Q61 Q62 Q63 Q64_1 employment othadult_employ
247 NA NA NA NA NA NA NA NA NA NA NA NA NA 2 2 2 2 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2 NA NA NA NA NA 4 NA NA NA NA NA NA NA NA NA NA NA 2 2 NA 2 NA 2 NA NA NA NA NA NA NA NA 3 3 3 3 3 3 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2 NA NA NA NA NA NA NA NA 1 NA NA NA 2 NA NA 1 NA NA NA NA 3 NA NA 16 NA NA NA
274 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Show code
kc_combsurv_noad %>%
  filter(ID == 3 | ID == 271) %>%
  zap_label() %>%
  gt()
ID Q2.1 Q2.2 Q2.3 Q3 Q3.1 Q3.2 Q3.3 Q3.4 Q3.5 Q3.6 Q3.7 Q3.8 Q3.9 Q4_1 Q4_2 Q4_3 Q4_4 Q4_5 Q5 Q6 Q7_1 Q7_2 Q7_3 Q7_4 Q7_5 Q7_6 Q7_7 Q7_8 Q7_9 Q7_10 Q7_11 Q7_12 Q8_1 Q8_2 Q8_3 Q8_4 Q8_5 Q8_6 Q8_7 Q8_8 Q8_9 Q8_10 Q8_11 Q8_12 Q8_13 Q8_14 Q8_15 Q8_16 Q8_17 Q8_18 Q8_19 Q8_20 Q9_1 Q9_2 Q9_3 Q9_4 Q9_5 Q10_1 Q10_2 Q10_3 Q10_4 Q11_1 Q11_2 Q11_3 Q11_4 Q11_5 Q11_6 Q12_1 Q12_2 Q12_3 Q12_4 Q12_5 Q12_6 Q12_7 Q12_8 Q12_9 Q12_10 Q13 Q16_1 Q16_2 Q16_3 Q16_4 Q16_5 Q16_6 Q16_7 Q16_8 Q16_9 Q16_10 Q17_1 Q17_2 Q17_3 Q17_4 Q17_5 Q17_6 Q18 Q18.1 Q18.2 Q18.3 Q18.4 Q18.5 Q18.6 Q18.7 Q19_1 Q19_2 Q19_3 Q19_4 Q19_5 Q19_6 Q19_7 Q20_1 Q20_2 Q20_3 Q20_4 Q20_5 Q20_6 Q21_1 Q21_2 Q21_3 Q21_4 Q21_5 Q21_6 Q21_7 Q21_8 Q22_1 Q22_2 Q22_3 Q22_4 Q22_5 Q22_6 Q22_7 Q23_1 Q23_2 Q23_3 Q23_4 Q23_5 Q23_6 Q23_7 Q23_8 Q23_9 Q23_10 Q23_11 Q24_1 Q24_2 Q24_3 Q24_4 Q24_5 Q24_6 Q24_7 Q25_1 Q25_2 Q25_3 Q25_4 Q25_5 Q25_6 Q25_7 Q25_8 Q25_9 Q26 Q29_1 Q29_2 Q29_3 Q29_4 Q29_5 Q29_6 Q29_7 Q29_8 Q29_9 Q29_10 Q29_11 Q29_12 Q29_13 Q29_14 Q29_15 Q29_16 Q29_17 Q30 Q32 Q33 Q34 Q35 Q36 Q37 Q38 Q39 Q40 Q44 Q45 Q46 Q47 Q48 Q52 Q53 Q54 Q56 Q57 Q58 Q59 Q60 Q61 Q62 Q63 Q64_1 employment othadult_employ
3 2 2 2 2 1 4 NA 4 NA 4 NA 4 NA 2 2 2 2 2 2 NA 4 2 4 2 3 NA 3 5 3 2 5 2 2 5 4 2 2 5 2 4 4 2 3 NA 3 1 3 5 3 4 2 4 4 1 NA 3 NA 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
271 2 2 1 2 2 4 NA 4 NA 4 NA 4 NA 2 2 2 2 2 3 3 4 2 2 2 3 4 4 5 4 3 4 2 3 3 5 2 5 1 3 4 5 1 1 5 3 2 2 1 3 5 5 5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

As is evident in the above comparisons, all of the flagged pairs other than the exact duplicate pair are driven by observations with lots of missing data. ID #274 is a close match to three other observations because it is a completely missing observation. This is why Kuriakose and Robbins (2016) recommend removing variables with 10% or more missing observations and observations with 25% or more missing variables.
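To see why a mostly empty record can look like a near duplicate, here is a minimal sketch in base R. It assumes the percent-match statistic counts two missing values as agreement; how the routine used earlier actually treats `NA` is not shown in this section, so treat that as an assumption. The data frame `toy` and the helpers `prop_match()` and `row_vals()` are hypothetical and exist only for illustration.

```r
# Toy responses (hypothetical values, not the survey data)
toy <- data.frame(
  ID = c(1, 2, 3, 4),
  q1 = c(3, NA, NA, 2),
  q2 = c(NA, NA, NA, 4),
  q3 = c(NA, NA, NA, 1),
  q4 = c(NA, NA, 2, 5)
)

# Share of variables on which two response vectors agree.
# na_matches = TRUE counts NA vs. NA as agreement (assumption);
# na_matches = FALSE compares only variables observed in both rows.
prop_match <- function(a, b, na_matches = TRUE) {
  both_na <- is.na(a) & is.na(b)
  same    <- !is.na(a) & !is.na(b) & a == b
  if (na_matches) {
    mean(same | both_na)
  } else {
    observed <- !is.na(a) & !is.na(b)
    if (!any(observed)) return(NA_real_)  # nothing to compare
    sum(same) / sum(observed)
  }
}

# Pull one respondent's answers as a plain vector (ID column dropped)
row_vals <- function(id) unlist(toy[toy$ID == id, -1])

# ID 2 is completely missing; ID 1 answered a single item
m_na_counts  <- prop_match(row_vals(1), row_vals(2))
m_na_ignored <- prop_match(row_vals(1), row_vals(2), na_matches = FALSE)
m_complete   <- prop_match(row_vals(1), row_vals(4))
```

In this toy example, the empty record and the nearly empty record agree on three of four items when shared missingness counts as a match, even though they share no substantive answers. Once missing values are set aside there is nothing left to compare, which is the situation the missing-data thresholds are meant to avoid.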

Returning to where we perhaps should have started, we can identify the observations with large amounts of missing data.

Show code
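# Drop Q40 (a character variable), then compute row-wise missingness
kc_combsurv_rowmiss <- kc_combsurv_noad %>%
  select(-Q40) %>%
  mutate(
    # number of missing values per observation, excluding the ID column
    rowmiss_n = rowSums(is.na(select(., -ID))),
    # proportion missing: divide by the number of non-ID variables
    rowmiss = rowmiss_n / (ncol(.) - 1)
  )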


kc_combsurv_rowmiss %>%
  zap_label() %>%
  select(ID, rowmiss) %>%
  arrange(desc(rowmiss)) %>%
  filter(rowmiss > .25) %>%
  gt() %>%
  cols_label(
    rowmiss = "% Missing") %>%
  fmt_percent(
    columns = rowmiss,
    decimals = 1) %>%
  tab_header(
    title = md("**Observations by % Missing**")) %>%
  tab_options(
    container.height = px(500),
    container.overflow.y = TRUE
  )
Observations by % Missing
ID % Missing
274 100.0%
272 97.0%
1 90.1%
247 88.2%
271 76.4%
3 75.9%
276 69.0%
270 63.1%
2 53.7%
178 46.8%
5 45.8%
273 44.3%
254 41.9%
261 39.9%
22 39.4%
317 35.0%
166 34.5%
121 33.5%
44 32.5%
256 31.5%
167 31.0%
24 29.6%
259 29.6%
85 28.6%
156 28.6%
370 28.6%
255 27.6%
252 27.1%
92 25.1%
152 25.1%
165 25.1%

As you can see above, there are 31 observations with more than 25% missing data. If you look at the IDs with relatively large maximum proportion match values, you’ll see them concentrated at the extreme end of missing data:

Show code
kc_combsurv_rowmiss %>%
  zap_label() %>%
  select(ID, rowmiss) %>%
  arrange(desc(rowmiss)) %>%
  filter(ID %in% c(54, 55, 272, 274, 1, 247, 3, 271)) %>%
  gt() %>%
  cols_label(
    rowmiss = "% Missing") %>%
  fmt_percent(
    columns = rowmiss,
    decimals = 1) %>%
  tab_header(
    title = md("**Observations by % Missing**"))
Observations by % Missing
ID % Missing
274 100.0%
272 97.0%
1 90.1%
247 88.2%
271 76.4%
3 75.9%
54 17.2%
55 17.2%

With the exception of the exact duplicates, all of the other observations with potentially problematic maximum proportion match values appear to be driven by missing data.

There are other procedures we could run to more fully check for duplication (see Andrew Wheeler’s blog post). But we are going to stop here for now, as we are confident that, with the exception of two observations that were exact duplicates (ID #54 & #55), duplicate results do not appear to be a problem in these data. Missing data, on the other hand, may be a problem.

4.0.4 Conclusion

This duplication analysis of the Kansas City Community Survey data provides reassuring evidence about the overall integrity of the data collection process while highlighting the importance of systematic quality checks in survey research. Our comprehensive examination using both exact duplication detection and the Kuriakose and Robbins (2016) near-duplication methodology reveals that potential data quality issues are minimal and largely attributable to technical rather than methodological problems.

4.0.4.1 Key Findings

The exact duplication analysis identified only one problematic case—two observations (ID #54 and #55) that were completely identical across all substantive variables. This represents less than 1% of the total sample and appears to be a technical duplication rather than fraudulent data entry, as evidenced by the fact that this duplication existed in the original raw data file. The involvement of interviewer #14 in this case warrants follow-up investigation, but the isolated nature of this incident suggests it was an anomaly rather than a systematic problem.

The near-duplication analysis initially flagged eight observations with maximum proportion matches above the 85% threshold suggested by Kuriakose and Robbins (2016) as potentially fraudulent. However, detailed examination revealed that six of these cases were artifacts of missing data patterns rather than genuine duplication concerns. Specifically, observations with extensive missing data (particularly ID #274, which was completely missing) artificially inflated similarity scores with other observations that also had substantial missing data.

4.0.4.2 Methodological Insights

This analysis reinforces the importance of Kuriakose and Robbins’ recommendations for preprocessing data before conducting near-duplication analyses. The concentration of flagged observations among those with more than 25% missing data demonstrates how missing values can create false positives in duplication detection.
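A minimal base R sketch of that preprocessing step follows. The 10% and 25% thresholds come from the recommendation described above; the order of the two steps (variables first, then observations) is an assumption, and the data frame `toy` with its columns `q1` to `q4` is hypothetical.

```r
# Toy data: 20 hypothetical respondents, 4 hypothetical items
toy <- data.frame(
  ID = 1:20,
  q1 = rep(1:5, 4),
  q2 = rep(c(2, 4), 10),
  q3 = rep(1:4, 5),
  q4 = rep(c(1, 3, 5, 2, 4), 4)
)
toy$q3[1:6] <- NA                           # q3 is mostly well above 10% missing
toy[20, c("q1", "q2", "q3", "q4")] <- NA    # respondent 20 answered nothing

items <- setdiff(names(toy), "ID")

# Step 1: drop variables with 10% or more missing observations
var_miss  <- colMeans(is.na(toy[, items, drop = FALSE]))
keep_vars <- names(var_miss)[var_miss < 0.10]
trimmed   <- toy[, c("ID", keep_vars), drop = FALSE]

# Step 2: drop observations missing on 25% or more of the remaining variables
row_miss <- rowMeans(is.na(trimmed[, keep_vars, drop = FALSE]))
trimmed  <- trimmed[row_miss < 0.25, , drop = FALSE]
```

Here `q3` is removed in the first step and the empty respondent in the second, so the near-duplication statistic would be computed only on records and items with enough observed data to compare. Reversing the order of the two steps can change which variables survive, so the order is worth reporting.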

The fact that three of the near-duplicate pairs involved comparisons between the web-based survey and the main community survey (ID #1 vs. #274, ID #247 vs. #274, and ID #3 vs. #271) suggests that missing data patterns may differ systematically between survey administration modes. This could reflect differences in question skip patterns, respondent engagement, or technical issues specific to the online platform.

4.0.4.3 Implications for Data Quality

The results provide strong evidence that the community-based interviewing approach did not introduce systematic data quality problems related to fraudulent or duplicated responses. The use of community members as interviewers—a methodological choice that could theoretically increase risks of data fabrication—appears to have been successful in maintaining data integrity. This finding supports the viability of community-based data collection approaches for reaching underrepresented populations.

However, the analysis does highlight missing data as a more significant concern than duplication. The identification of 31 observations with more than 25% missing data suggests that response burden, survey length, or administration issues may have affected data completeness for some respondents. This missing data pattern warrants further investigation to understand whether it reflects systematic differences across neighborhoods, interviewers, or respondent characteristics.
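One simple starting point for that investigation is to summarize row-level missingness within groups. The sketch below uses base R on toy data; the grouping column `interviewer` and the items `q1` to `q4` are hypothetical names, and the same pattern would apply to a neighborhood or survey-mode variable.

```r
# Toy data with a hypothetical grouping variable
toy <- data.frame(
  ID = 1:6,
  interviewer = c("A", "A", "A", "B", "B", "B"),
  q1 = c(1, 2, NA, NA, NA, 3),
  q2 = c(1, NA, 2, NA, NA, NA),
  q3 = c(4, 4, 4, NA, 2, NA),
  q4 = c(5, 5, 5, NA, NA, 1)
)

items <- c("q1", "q2", "q3", "q4")

# Proportion of items missing for each respondent
toy$rowmiss <- rowMeans(is.na(toy[, items]))

# Average missingness and share of respondents above 25% missing, by group
mean_miss     <- tapply(toy$rowmiss, toy$interviewer, mean)
share_over_25 <- tapply(toy$rowmiss > 0.25, toy$interviewer, mean)

by_group <- data.frame(
  interviewer   = names(mean_miss),
  mean_miss     = as.numeric(mean_miss),
  share_over_25 = as.numeric(share_over_25)
)
```

Large gaps between groups in a table like `by_group` would not establish a cause on their own, but they would indicate where to look more closely.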

4.0.4.4 Recommendations

Based on these findings, we recommend several actions for current and future survey efforts:

  1. Immediate follow-up: Investigate the technical cause of the exact duplication (ID #54 and #55) to prevent similar issues in future data collection efforts.

  2. Missing data analysis: Conduct a comprehensive examination of missing data patterns to identify whether missingness is related to interviewer effects, neighborhood characteristics, question sensitivity, or survey administration mode.

  3. Protocol refinement: Consider implementing additional quality control measures for future surveys, such as real-time data validation and systematic reviews of interview completion patterns.

  4. Methodological documentation: Document the successful use of community-based interviewing as a viable approach for high-quality data collection in underrepresented communities.

4.0.4.5 Final Assessment

This duplication analysis demonstrates that the Kansas City Community Survey achieved its data quality objectives. The minimal occurrence of true duplication, combined with the absence of systematic patterns that might suggest fraudulent data entry, validates both the community-based interviewing methodology and the overall survey implementation. While missing data emerges as an area for improvement, the core integrity of the collected data appears sound.

The successful completion of this quality assessment provides confidence in using these data for substantive analyses of community perceptions and police-community relations. More broadly, this analysis contributes to the growing evidence base supporting community-engaged research methodologies as viable alternatives to traditional survey approaches, particularly when seeking to engage populations that are typically underrepresented in municipal research and policy development.


  1. One character variable (Q40), which asked about the year in which the respondent was born, was answered relatively consistently and had relatively few missing observations. This variable was left in the data.