Experiential Learning and Student Comfort with Forensic Experiences
Author
Jon Brauer
1 Introduction
This file is a data analysis supplement for the paper “Grossed out to Engrossed: Experiential Learning Shifts Student Attitudes on Forensic Entomology” (Vanessa R. Cooper, Krystal R. Hans, & Jonathan R. Brauer). The primary aim of the analysis is to estimate and visualize potential effects of traditional and experiential learning classroom experiences on students’ reported comfort with specific aspects of forensic work, such as insect collection or decomposition. This is accomplished by contrasting pre- and post-class ordinal responses to Likert-type survey questions about comfort with forensic tasks, administered to students at the beginning and end of forensic courses offered with and without an experiential learning lab component.
2 Load Libraries & Settings
I will start by loading various libraries that I often use for such projects.
Show code
# Packages I need
library(here)
library(tidyverse)
library(gt)
library(janitor)
library(vtable)
library(modelsummary)
library(easystats)
library(marginaleffects)
library(ggplot2)
library(GGally)
library(ggeffects)
library(ggdist)
library(patchwork)
library(parallelly)
library(bayesplot)
library(correlation)
library(sjmisc)
library(forcats)
library(ggtext)
library(ggmosaic)
library(gtExtras)
library(tidybayes)
library(extrafont)
library(stringr)
library(ragg)

# to use brms, do the following:
# install rtools 4.4
# install rstan
#   https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started
# test ability to install packages from sources using Rtools:
#   install.packages("Rcpp", type = "source")
#   it works
# remove any existing rstan installs
#   remove.packages("rstan")
#   if (file.exists(".RData")) file.remove(".RData")
# run the next line if you already have rstan installed
#   remove.packages(c("StanHeaders", "rstan"))
# install rstan
#   install.packages("rstan", repos = c('https://stan-dev.r-universe.dev', getOption("repos")))
# test rstan install
#   example(stan_model, package = "rstan", run.dontrun = TRUE)
#   it works
library(rstan)

# install cmdstanr (alternative brms backend for interfacing w/ stan)
#   install.packages("cmdstanr", repos = c('https://stan-dev.r-universe.dev', getOption("repos")))
# newest developmental version
#   install.packages("remotes")
#   remotes::install_github("stan-dev/cmdstanr", force = TRUE)
#   Sys.getenv("RTOOLS44_HOME")
#   Sys.getenv("PATH")
#   readLines("~/.Renviron")
# had to delete old .Renviron file pointing to Rtools40 from Documents folder
#   cmdstanr::install_cmdstan()
#   cmdstanr::check_cmdstan_toolchain(fix = TRUE)
#   library(cmdstanr)
# cannot get cmdstanr to work - apparently do not have appropriate permissions
library(brms)

print("T/F: Root 'here()' folder contains subfolder 'Models'")
# check for Output folder to save tables & figures, create if one does not exist
ifelse(dir.exists(here("Output")), TRUE, dir.create(here("Output")))
[1] TRUE
I will also set some default settings.
Show code
# Fonts
# First time setup (run this once in console, not in your document):
# Import and register fonts (only needed once per system)
# Uncomment if you haven't imported fonts before
#   extrafont::font_import()
# If you just want Times New Roman, run this once instead
#   extrafont::font_import(pattern = "Times New Roman")

# Load the fonts into the R session
extrafont::loadfonts(device = "win", quiet = TRUE)            # For Windows
# extrafont::loadfonts(device = "pdf", quiet = TRUE)          # For PDF output
# extrafont::loadfonts(device = "postscript", quiet = TRUE)   # For PostScript output

# If you only imported Times New Roman, load this instead:
# Load the specific font you need for this session
# if (!("Times New Roman" %in% windowsFonts())) {
#   windowsFonts(`Times New Roman` = windowsFont("Times New Roman"))
# }

# To improve font rendering in viewer:
# Configure knitr to use the ragg device, which handles fonts better
knitr::opts_chunk$set(dev = "ragg_png", dpi = 300)

# Set the default theme for all ggplot2 plots
theme_set(theme_minimal(base_family = "Times New Roman"))

# set default knit & scientific notation options
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE, fig.align = "center")
options(scipen = 999, digits = 2)  # set global option to two decimals

# identify & set # of available cores
# replaced parallel::detectCores()
# https://www.jottr.org/2022/12/05/avoid-detectcores/
nCoresphys <- parallelly::availableCores(logical = FALSE)
options(mc.cores = parallelly::availableCores(logical = FALSE))
# options(mc.cores = 4, logical = FALSE)
# alt: can specify exact number of cores desired here &/or in models/ppchecks
Now we are ready to load and skim the data.
3 Data
3.1 Load data
Show code
dat <- readxl::read_xlsx(here("Data/fi_outcomes.xlsx"))
Based on the data structure and descriptive statistics, it appears we have pre/post data from 330 unique individuals on five different outcome questions. N=200 observations were from a course without a lab component and N=463 were from a course with a lab. Surveys were administered to students in classes across three different years (2022, 2023, 2024).
One initial issue is that there are N=663 observations, including N=332 “post” observations and N=331 “pre” observations, but we only have N=330 individuals. Let’s investigate.
Show code
# Count observations per ID
observation_counts <- dat %>% count(ID)
# print(observation_counts)

# Identify IDs with more or fewer than two observations
problem_ids <- observation_counts %>% filter(n != 2)

dat %>%
  dplyr::filter(ID %in% c("765333", "765499")) %>%
  gt()
ID       Year   Lab?   Survey   Insect Collection   Maggot Collection   Decomposition Fluid   Decomposition Smell   Death
765499   2022   Yes    Pre      1                   1                   1                     1                     1
765499   2022   Yes    Post     2                   3                   3                     2                     3
765333   2023   Yes    Pre      4                   3                   2                     2                     5
765333   2023   Yes    Post     3                   3                   3                     2                     4
765333   2023   Yes    Post     3                   3                   3                     3                     5
765499   2023   Yes    Pre      3                   3                   3                     3                     3
765499   2023   Yes    Post     3                   2                   2                     2                     2
There are two problematic student IDs. One ID has two non-duplicate “post” observations; I will drop this case for now as it is not clear which (if either) is valid.
The other problematic ID has what seems to be valid pre/post data from taking the class in 2022 and then taking it again in 2023. I will drop the second (2023) set of observations, as the treatment, if effective, presumably would have taken effect during/after the first time taking the course.
While wrangling data, I will also rename variables (e.g., simplify and remove spaces from outcomes).
Show code
dat <- dat %>%
  dplyr::filter(ID != 765333) %>%        # Drop 765333
  dplyr::filter(
    (ID != 765499) |                     # Keep all other IDs
    (ID == 765499 & Year == 2022)        # OR keep 765499 only if Year == 2022
  )

outcome_vars <- c("Insect Collection", "Maggot Collection", "Decomposition Fluid",
                  "Decomposition Smell", "Death")
outcomes_new <- c("insect", "maggot", "fluid", "smell", "death")

dat <- dat %>%
  setNames(., ifelse(names(.) %in% outcome_vars,
                     outcomes_new[match(names(.), outcome_vars)],
                     names(.))) %>%
  rename(lab = `Lab?`, survey = `Survey`)
Skim filtered data
Show code
dat %>% datasummary_skim()
Variable   Unique   Missing Pct.   Mean       SD      Min        Median     Max
ID         329      0              765548.8   136.7   765306.0   765552.0   765763.0
Year       3        0              2023.0     0.8     2022.0     2023.0     2024.0
insect     5        0              3.0        1.3     1.0        3.0        5.0
maggot     5        0              2.8        1.3     1.0        3.0        5.0
fluid      5        0              2.9        1.2     1.0        3.0        5.0
smell      5        0              2.9        1.3     1.0        3.0        5.0
death      5        0              3.4        1.1     1.0        3.0        5.0

Variable   Level   N     %
lab        No      200   30.4
           Yes     458   69.6
survey     Post    329   50.0
           Pre     329   50.0
4 Crosstab
Let’s start by generating a table of proportions for each response category by pre/post and lab/no lab conditions.
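The rendered crosstab itself is not reproduced in this excerpt. A minimal sketch (my own illustration, not the exact chunk) of one way to compute such proportions for a single outcome (maggot) from the cleaned data:

# Proportion of students in each response category, by lab and survey condition
dat %>%
  count(lab, survey, maggot) %>%
  group_by(lab, survey) %>%
  mutate(prop = n / sum(n)) %>%
  select(-n) %>%
  pivot_wider(names_from = maggot, values_from = prop)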
With these plots, we can see an effect of being in class (post versus pre) in the “No Lab” condition and a larger effect in the “Lab” condition. However, we want a clearer, more precise way to describe and estimate these effects. In this case, it might be especially helpful to know how the class and lab components affect the predicted probability that students will report a “4” or a “5” - that is, the effect on the probability of reporting that they are “very” or “extremely” comfortable with each outcome (e.g., insect collection; decomposition smell).
To estimate these particular effects, we will need to build regression models of each outcome with survey (post = 1) and lab (=1) interacting as predictors.
Ultimately, we will rely on Bayesian ordinal regression models; ordinal models are appropriate for the outcome distributions and will improve the accuracy of our probability estimates, while Bayesian estimation will allow us to combine posterior predictions so we can visualize the specific effects of interest (e.g., the effects on reporting “4” or “5” on each outcome). Before we get there, though, let’s start with a simple linear “intercept only” regression model to show why it is important to use ordinal models.
6 Modeling
Before estimating a model, it is a good idea to view distributions of our key variables to ensure we specify appropriate link functions.
Some of these items exhibit roughly symmetric distributions around the mean, so treating them as metric continuous outcomes might not introduce too much bias. However, there is little downside to treating them as ordinal outcome variables instead. Let’s use “maggot” as an example below.
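The distribution plots themselves are not reproduced in this excerpt; a quick sketch of how they might be generated from the cleaned data:

# Bar charts of the five ordinal outcomes
dat %>%
  pivot_longer(c(insect, maggot, fluid, smell, death),
               names_to = "outcome", values_to = "response") %>%
  ggplot(aes(x = response)) +
  geom_bar() +
  facet_wrap(~ outcome)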
6.1 Ordinal: So What?
Building a regression model that treats ordinal variables as if they are metric continuous variables can generate model predictions that exhibit poor fit to the data and may result in substantial inference errors (e.g., see Liddell & Kruschke 2018).
In the first couple of models below, I present linear and ordinal Bayesian “intercept only” models to illustrate the poor fit of linear modeling to ordinal data and the substantial improvement in predictive fit to the underlying data distribution that we gain by using ordinal (e.g., cumulative probit) modeling instead.
6.2 Intercept Mod: Lin~1 model (poor fit)
I start by estimating a simple linear intercept model of maggot with no predictors to illustrate the poor fit of a linear model to the ordinal outcome. Here, I use Bayesian estimation with default priors; again, Bayesian models will be beneficial for model building and visualization later.
Show code
library(rstanarm)

dat_mod <- dat %>%
  mutate(
    survey01 = if_else(survey == "Post", 1, 0),
    lab01 = if_else(lab == "Yes", 1, 0)
  )

# Manual caching for stan_glm - file & file_refit are built into brms
mod1 <- stan_glm(
  maggot ~ 1,
  family = gaussian(),
  data = dat,
  chains = 4, cores = nCoresphys,
  seed = 1138
)
Model Info:
function: stan_glm
family: gaussian [identity]
formula: maggot ~ 1
algorithm: sampling
sample: 4000 (posterior sample size)
priors: see help('prior_summary')
observations: 658
predictors: 1
Estimates:
mean sd 2.5% 97.5%
(Intercept) 2.84 0.05 2.74 2.94
sigma 1.32 0.04 1.25 1.40
Fit Diagnostics:
mean sd 2.5% 97.5%
mean_PPD 2.84 0.07 2.69 2.98
The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
MCMC diagnostics
mcse Rhat n_eff
(Intercept) 0.00 1.00 2816
sigma 0.00 1.00 2809
mean_PPD 0.00 1.00 3499
log-posterior 0.03 1.00 1611
For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
Show code
prior_summary(mod1) #check priors
Priors for model 'mod1'
------
Intercept (after predictors centered)
Specified prior:
~ normal(location = 2.8, scale = 2.5)
Adjusted prior:
~ normal(location = 2.8, scale = 3.3)
Auxiliary (sigma)
Specified prior:
~ exponential(rate = 1)
Adjusted prior:
~ exponential(rate = 0.76)
------
See help('prior_summary.stanreg') for more details
The linear (metric normal) model above recovers the mean of maggot (=2.8), which is what it is designed to do. However, the posterior predictive check plots (e.g., density; histogram; box) show that the underlying normal distribution assumption results in model predictions that fail to recover the observed ordinal item values.
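The posterior predictive check plots themselves are not reproduced in this excerpt; with rstanarm they could be generated along these lines (a sketch, assuming mod1 from above):

# Density, histogram, and boxplot posterior predictive checks for the linear model
pp_check(mod1, plotfun = "dens_overlay", nreps = 50)
pp_check(mod1, plotfun = "hist", nreps = 5)
pp_check(mod1, plotfun = "boxplot", nreps = 10)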
6.3 Intercept Mod: Ord~1 (better fit)
Now, let’s compare how a basic cumulative probit model without predictors fits the ordinal outcome data. A cumulative probit model relaxes the equal-intervals assumption found in linear regression, allowing the treatment to have effects of different sizes as we move up the ordinal categories. For instance, imagine that classroom + lab experience improves the comfort of those in the “not comfortable” (=1) category, making them more likely to report being “somewhat comfortable,” but does not substantially improve comfort among those already in the “somewhat comfortable” category. In that case, the linear regression model would average these effects and downwardly bias the overall estimated effect on our outcomes.
In contrast, a cumulative probit model maintains the monotonicity assumption (the probability of higher responses can only increase, not decrease) found in linear regression while allowing the size of treatment effects to vary across the ordinal scale. This is particularly important for comfort ratings where the psychological distance between “not comfortable” and “somewhat comfortable” may differ from the distance between “very comfortable” and “extremely comfortable.”
The Bayesian implementation offers additional advantages. It provides direct probability statements about effects (e.g., probability that lab experience increases comfort) and handles uncertainty in both thresholds and effects simultaneously. This is especially valuable when sample sizes differ across conditions or when exploring interaction effects between lab participation and time (pre/post). The posterior distributions also allow us to estimate the probability of any response pattern, which is particularly useful for understanding the practical significance of the lab intervention’s impact on student comfort levels.
6.3.1 Priors on ordinal response distribution
Instead of using program default prior distributions for our ordinal models, I will explicitly specify what is essentially a flat prior distribution on the response thresholds in the cumulative probit model. I do this by specifying values that equate to equal probabilities (=1/5 or 0.20) for each of the five response categories. Transforming these to cumulative normal distribution thresholds results in values of c(-.84, -.25, .25, .84).
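For reference, these threshold values follow directly from the equal-probability assumption:

# Cumulative probabilities 0.2, 0.4, 0.6, 0.8 mapped to standard normal thresholds
qnorm(cumsum(rep(0.20, 4)))   # -0.84 -0.25  0.25  0.84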
I could specify somewhat more informative priors if we had good a priori information about typical distributions on these outcomes. One advantage (or drawback, depending on your perspective and aims) of specifying flat priors is that results will typically converge with those generated by traditional frequentist methods.
Show code
# Ordinal cumulative probit model of maggot
# maggot is a 5-category ordinal variable (1 to 5)
# table(dat$maggot)

# develop formula
bf.form = brms::bf(maggot ~ 1)

# brms default prior is a wide Student's t dist
get_prior(bf.form, data = dat_mod, family = cumulative('probit'))
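The chunk above only inspects the default priors; the summary printed below corresponds to the fitted intercept-only cumulative probit model (mod2), whose fitting code is not shown in this excerpt. It was presumably estimated along these lines, using the flat threshold priors described above (a hedged sketch; the priors1 object name is my own):

# Flat (equal-probability) priors on the four response thresholds
priors1 <- c(
  prior(normal(-0.84, 1), class = Intercept, coef = 1),
  prior(normal(-0.25, 1), class = Intercept, coef = 2),
  prior(normal(0.25, 1), class = Intercept, coef = 3),
  prior(normal(0.84, 1), class = Intercept, coef = 4)
)

# Intercept-only cumulative probit model of maggot
mod2 <- brm(
  data = dat_mod,
  family = cumulative("probit"),
  maggot ~ 1,
  prior = priors1,
  iter = 2000, warmup = 1000, chains = 4, cores = nCoresphys,
  seed = 1138,
  file = "Models/mod2_fit",          # hypothetical cache file name
  file_refit = "on_change"
)
summary(mod2)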
Family: cumulative
Links: mu = probit; disc = identity
Formula: maggot ~ 1
Data: dat_mod (Number of observations: 658)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] -0.82 0.06 -0.93 -0.71 1.00 2984 2651
Intercept[2] -0.24 0.05 -0.34 -0.14 1.00 4416 3512
Intercept[3] 0.47 0.05 0.38 0.57 1.00 5111 3577
Intercept[4] 1.10 0.06 0.98 1.22 1.00 5043 3704
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 NA NA NA
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Show code
prior_summary(mod2) #check priors
prior class coef group resp dpar nlpar lb ub source
student_t(3, 0, 2.5) Intercept default
normal(-0.84, 1) Intercept 1 user
normal(-0.25, 1) Intercept 2 user
normal(0.25, 1) Intercept 3 user
normal(0.84, 1) Intercept 4 user
As expected, posterior predictions generated by the cumulative probit model are a much better fit to the ordinal response data.
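The posterior predictive check plots for the ordinal model are likewise not reproduced here; a minimal way to generate one with brms (a sketch, assuming mod2 from above):

# Observed vs. predicted category counts for the cumulative probit model
pp_check(mod2, type = "bars", ndraws = 100)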
Now, let’s add survey and lab as interacting predictors.
Ultimately, I will illustrate what I think are preferable ways of interpreting and visualizing results of ordinal models. For now, let’s examine default results of a cumulative probit model of maggot with our two key predictors.
6.4 Prediction Mod: Ord~Lin (outcome by survey*lab)
Show code
# Ordinal cumulative probit model
# set priors:
#   equal likelihood for all outcome thresholds
#   weakly informative normal(0, 1) for default beta
priors2 = c(
  prior(normal(-0.84, 1), class = Intercept, coef = 1),
  prior(normal(-0.25, 1), class = Intercept, coef = 2),
  prior(normal(0.25, 1), class = Intercept, coef = 3),
  prior(normal(0.84, 1), class = Intercept, coef = 4),
  prior(normal(0, 1), class = b)
)

# maggot - bivariate maggot by survey*lab:
mod3 <- brm(
  data = dat_mod,
  family = cumulative("probit"),
  maggot ~ 1 + survey01 * lab01,
  prior = priors2,
  iter = 2000, warmup = 1000, chains = 4, cores = nCoresphys,
  seed = 1138,
  file = "Models/mod3_fit",
  file_refit = "on_change"
)
Family: cumulative
Links: mu = probit; disc = identity
Formula: maggot ~ 1 + survey01 * lab01
Data: dat_mod (Number of observations: 658)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] -0.47 0.11 -0.68 -0.26 1.00 2685 2955
Intercept[2] 0.14 0.11 -0.07 0.35 1.00 2806 2999
Intercept[3] 0.92 0.11 0.70 1.13 1.00 2752 3112
Intercept[4] 1.59 0.12 1.36 1.82 1.00 3175 3067
survey01 0.09 0.14 -0.19 0.37 1.00 2359 2852
lab01 0.23 0.12 -0.00 0.46 1.00 2770 2980
survey01:lab01 0.58 0.17 0.25 0.92 1.00 2459 2711
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc 1.00 0.00 1.00 1.00 NA NA NA
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Show code
prior_summary(mod3) #check priors
                prior     class           coef group resp dpar nlpar lb ub       source
         normal(0, 1)         b                                                    user
         normal(0, 1)         b          lab01                             (vectorized)
         normal(0, 1)         b       survey01                             (vectorized)
         normal(0, 1)         b survey01:lab01                             (vectorized)
 student_t(3, 0, 2.5) Intercept                                                 default
     normal(-0.84, 1) Intercept              1                                     user
     normal(-0.25, 1) Intercept              2                                     user
      normal(0.25, 1) Intercept              3                                     user
      normal(0.84, 1) Intercept              4                                     user
Let’s plot the classroom effect (post-pre) for lab and no lab conditions on the latent scale.
6.5 Plot: Latent scale effects
We will start with one outcome (maggot) and then iterate the process across all outcomes.
Show code
draws_df <- as_draws_df(mod3) %>%
  mutate(
    survey_effect_nolab = b_survey01,
    survey_effect_lab = b_survey01 + `b_survey01:lab01`
  )

contrasts <- bind_rows(
  draws_df %>%
    transmute(effect = survey_effect_nolab) %>%
    median_qi(.width = .95) %>%
    mutate(condition = "No Lab"),
  draws_df %>%
    transmute(effect = survey_effect_lab) %>%
    median_qi(.width = .95) %>%
    mutate(condition = "Lab")
)

# Plot
ggplot(contrasts, aes(x = effect, y = condition, color = condition)) +
  geom_linerange(aes(xmin = .lower, xmax = .upper), linewidth = 1) +
  geom_point(aes(color = condition), shape = 124, size = 4) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  scale_color_viridis_d(option = "D", begin = 0, end = 1) +
  labs(
    x = "Effect of Post vs Pre on Latent Scale (est + 95% CrI)",
    y = ""
  ) +
  theme_minimal() +
  theme(
    legend.position = "none",
    panel.grid = element_blank(),
    axis.line = element_line(color = "black"),
    axis.ticks = element_line(color = "black")
  )
It appears the class is particularly helpful for increasing student comfort with maggot collection when it is coupled with a lab component, but may not be as helpful without that component.
Now, let’s iterate this process across all outcomes.
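The per-outcome fitting and plotting code is not reproduced in this excerpt. The `models` list referenced in later chunks was presumably built along these lines (a hedged sketch reusing priors2 from above; the cache file names are hypothetical):

# One cumulative probit model per outcome, same priors and settings as mod3
outcomes <- c("insect", "maggot", "fluid", "smell", "death")

models <- map(set_names(outcomes), function(y) {
  brm(
    data = dat_mod,
    family = cumulative("probit"),
    formula = as.formula(paste(y, "~ 1 + survey01 * lab01")),
    prior = priors2,
    iter = 2000, warmup = 1000, chains = 4, cores = nCoresphys,
    seed = 1138,
    file = paste0("Models/mod_", y, "_fit"),   # hypothetical cache file names
    file_refit = "on_change"
  )
})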
6.5.1 Plot: Latent y scale contrasts, all outcomes
We see similar patterns for all five outcomes, and the plot represents a conditional average treatment effect of class experience (conditional on lab or no lab condition) on latent continuous outcome scales.
We can also pull the key information from regression output into a single table.
Table 2. Regression Coefficients from Bayesian Ordinal Models
Estimated effects of classroom experience (post-pre) and lab on latent continuous response scale. Cells show posterior estimates (Est.) with 95% credible intervals (CrI).

Coefficient              Insect Collection      Maggot Collection      Decomposition Fluid     Decomposition Smell     Death
Effects
  Post (vs. Pre)         0.21 [−0.08, 0.49]     0.09 [−0.19, 0.37]     0.11 [−0.17, 0.39]      0.10 [−0.19, 0.37]      0.09 [−0.19, 0.38]
  Lab (vs. No Lab)       0.37 [0.12, 0.61]      0.23 [0.00, 0.46]      0.14 [−0.09, 0.38]      0.29 [0.05, 0.53]       0.00 [−0.25, 0.25]
  Post*Lab Interaction   0.44 [0.10, 0.78]      0.58 [0.25, 0.92]      0.57 [0.24, 0.91]       0.50 [0.18, 0.83]       0.40 [0.06, 0.75]
Threshold
  Intercept[1]           −0.49 [−0.70, −0.27]   −0.47 [−0.68, −0.26]   −0.74 [−0.97, −0.53]    −0.61 [−0.81, −0.39]    −1.34 [−1.57, −1.11]
  Intercept[2]           0.15 [−0.06, 0.37]     0.14 [−0.07, 0.35]     −0.02 [−0.23, 0.19]     0.14 [−0.07, 0.34]      −0.65 [−0.86, −0.44]
  Intercept[3]           0.88 [0.66, 1.10]      0.92 [0.70, 1.13]      0.86 [0.64, 1.07]       0.91 [0.69, 1.12]       0.29 [0.09, 0.51]
  Intercept[4]           1.65 [1.42, 1.89]      1.59 [1.36, 1.82]      1.61 [1.38, 1.84]       1.68 [1.45, 1.92]       1.11 [0.89, 1.33]
If you are used to traditional (frequentist) regression approaches, you might be surprised not to see p-values in the output or in tables like the one above. This is because the Bayesian regression approach used here does not rely on null hypothesis significance testing to generate interval bounds. However, since we used flat prior distributions, the results will typically converge with those generated using traditional approaches; likewise, if a 95% credible interval excludes zero, we can interpret it as evidence of a plausibly positive (or negative) effect. In fact, the interpretation of a credible interval is quite straightforward: given our assumptions and the data, there is a 95% probability that the effect is between [lower interval] and [upper interval]. Also, instead of saying an effect is “statistically significant,” you could use the following phrases to better reflect our Bayesian approach’s focus on estimating probabilities of effects rather than testing null hypotheses:
“Results show credible evidence of…”
“The model indicates high probability of a positive effect…”
“We estimate with XX% probability that…”
“The 95% credibility interval excludes zero, suggesting a reliable effect…”
“The posterior distribution supports a meaningful difference between conditions…”
Now, how big are these effects - what do the beta estimates we just plotted mean substantively? Well, one benefit of the latent response being expressed on a standard normal scale in a cumulative probit model is that the regression coefficients (betas) can be interpreted like a latent Cohen’s d. Hence, the beta coefficients themselves in these models are useful for visualizing effect estimates.
Yet, just how interpretable is a latent Cohen’s D? We moved from easily interpretable student proportions or percentages to some latent effect size. With ordinal (and many other) models, it can be even more helpful to estimate and visualize effects as contrasts on the predicted probability scale instead. This gets us back to talking about differences in terms of proportions or percentages of students predicted to give different response options across conditions. With ordinal models, we can estimate contrasts of these predicted probabilities separately for each ordinal response category. However, recall I suggested it might be helpful to simplify our estimand by estimating effects on the probability of reporting high comfort (“very” or “extremely” comfortable; y = 4 or y = 5). Let’s revisit this idea by working with the posterior outputs to specifically estimate the effects of class and lab on predicted probabilities of these specific responses.
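To build intuition before turning to the full posterior calculations, here is a rough plug-in check (my own illustration, using the maggot point estimates reported in Table 2 above): in a cumulative probit model, P(y >= 4) = 1 - pnorm(threshold_3 - eta), where eta is the linear predictor on the latent scale.

# Back-of-envelope check using the maggot point estimates from Table 2 above
thr3 <- 0.92                          # Intercept[3] (maggot)
eta_pre_nolab <- 0                    # survey01 = 0, lab01 = 0
eta_post_lab  <- 0.09 + 0.23 + 0.58   # survey01 + lab01 + survey01:lab01
1 - pnorm(thr3 - eta_pre_nolab)       # ~0.18
1 - pnorm(thr3 - eta_post_lab)        # ~0.49

These plug-in values approximately match the posterior predicted probabilities reported in the tables below.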
6.6 Plot: Predicted probability scale effects (y>=4)
As before, I will start with one outcome (maggot) and then iterate across the remaining outcomes.
Show code
# Get posterior draws for maggot model
post_draws <- as_draws_df(models[[2]])  # 2 for maggot

# Create prediction grid
newdata <- expand_grid(
  survey01 = c(0, 1),
  lab01 = c(0, 1)
)

# Calculate probabilities for each draw
preds <- posterior_epred(models[[2]], newdata = newdata)

# Get P(response >= 4)
p_high <- 1 - (preds[,,1] + preds[,,2] + preds[,,3])

# Calculate contrasts (post - pre)
contrasts_high <- data.frame(
  nolab = p_high[,2] - p_high[,1],
  lab = p_high[,4] - p_high[,3]
) %>%
  pivot_longer(everything(), names_to = "condition", values_to = "diff") %>%
  group_by(condition) %>%
  summarize(
    estimate = median(diff),
    lower = quantile(diff, 0.025),
    upper = quantile(diff, 0.975)
  )

# Plot
ggplot(contrasts_high, aes(x = estimate, y = condition, color = condition)) +
  geom_linerange(aes(xmin = lower, xmax = upper), linewidth = 1) +
  geom_point(shape = 124, size = 4) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  scale_color_viridis_d(option = "D", begin = 0, end = 1) +
  labs(x = "Effect of Post vs Pre on P(response >= 4)", y = "") +
  scale_x_continuous(limits = c(-0.1, .6), breaks = seq(-.1, .6, by = .1)) +
  theme_minimal() +
  theme(
    legend.position = "none",
    panel.grid = element_blank(),
    axis.line = element_line(color = "black"),
    axis.ticks = element_line(color = "black")
  )
It appears the class without a lab increased the probability of reporting being “very” or “extremely” comfortable by approximately 7 percentage points (95% CrI ~ 0.0, 0.13), while the class with a lab increased that probability by approximately 29 percentage points (95% CrI ~ 0.21, 0.37). I will pull the precise point and interval estimates into a table later. First, let’s iterate this plotting across all five outcomes.
6.6.1 FIGURE-2: Predicted probability scale (y>=4) contrasts, all outcomes
We can also pull the predicted probabilities used to generate these contrasts into a table, as well as pull the effect estimates (predicted probability contrasts) themselves into a table.
6.7 Tables: Predicted probabilities
First, the predicted probabilities used to generate effect contrasts.
6.7.1 TABLE-S1. Posterior pred. probs. by condition (y>=4)
Show code
# Function to get predicted probabilities for one outcome
get_pred_probs <- function(model, outcome_name) {
  newdata <- expand_grid(
    survey01 = c(0, 1),
    lab01 = c(0, 1)
  )
  preds <- posterior_epred(model, newdata = newdata)
  p_high <- 1 - (preds[,,1] + preds[,,2] + preds[,,3])

  # Get probabilities for each condition
  data.frame(
    condition = c("No Lab Pre", "No Lab Post", "Lab Pre", "Lab Post"),
    estimate = colMeans(p_high),
    lower = apply(p_high, 2, quantile, 0.025),
    upper = apply(p_high, 2, quantile, 0.975)
  ) %>%
    mutate(outcome = outcome_name)
}

# Get probabilities for all outcomes
all_probs <- map2_dfr(models, outcomes, get_pred_probs) %>%
  mutate(
    outcome = case_when(
      outcome == "insect" ~ "Insect Collection",
      outcome == "maggot" ~ "Maggot Collection",
      outcome == "fluid" ~ "Decomposition Fluid",
      outcome == "smell" ~ "Decomposition Smell",
      outcome == "death" ~ "Death"
    )
  )

# Create table
all_probs %>%
  gt(groupname_col = "outcome") %>%
  fmt_number(columns = c(estimate, lower, upper), decimals = 3) %>%
  cols_merge(
    columns = c(lower, upper),
    pattern = "[{1}, {2}]"
  ) %>%
  cols_move(columns = lower, after = estimate) %>%
  cols_label(
    condition ~ "Condition",
    estimate ~ "P(response >= 4)",
    lower ~ "95% CI"
  ) %>%
  tab_header(
    title = md("**Table S1. Predicted Probabilities of High Comfort Responses**")
  ) %>%
  opt_align_table_header(align = "left") %>%
  opt_vertical_padding(scale = 0.5) %>%
  opt_table_font(font = "serif") %>%
  tab_style(
    style = cell_text(style = "italic"),
    locations = cells_row_groups()
  )
Table S1. Predicted Probabilities of High Comfort Responses

Condition       P(response >= 4)   95% CI
Insect Collection
  No Lab Pre    0.191              [0.136, 0.255]
  No Lab Post   0.306              [0.255, 0.360]
  Lab Pre       0.253              [0.187, 0.325]
  Lab Post      0.555              [0.497, 0.614]
Maggot Collection
  No Lab Pre    0.181              [0.129, 0.241]
  No Lab Post   0.248              [0.203, 0.296]
  Lab Pre       0.205              [0.149, 0.268]
  Lab Post      0.494              [0.436, 0.553]
Decomposition Fluid
  No Lab Pre    0.197              [0.142, 0.261]
  No Lab Post   0.239              [0.194, 0.287]
  Lab Pre       0.229              [0.168, 0.296]
  Lab Post      0.487              [0.429, 0.547]
Decomposition Smell
  No Lab Pre    0.184              [0.131, 0.245]
  No Lab Post   0.269              [0.224, 0.318]
  Lab Pre       0.211              [0.152, 0.275]
  Lab Post      0.490              [0.432, 0.549]
Death
  No Lab Pre    0.385              [0.306, 0.465]
  No Lab Post   0.384              [0.328, 0.442]
  Lab Pre       0.420              [0.340, 0.502]
  Lab Post      0.579              [0.521, 0.636]
Next, let’s pull the effect estimates themselves - the post-pre predicted probability contrasts within the “lab” and “no lab” conditions - into a table.
Table S2. Estimated Changes in Probability of High Comfort Responses
Post-Pre differences in P(response >= 4)

Condition   Post-Pre Difference   95% CrI
Insect Collection
  Lab       0.303                 [0.216, 0.385]
  No Lab    0.115                 [0.039, 0.184]
Maggot Collection
  Lab       0.290                 [0.209, 0.366]
  No Lab    0.067                 [−0.001, 0.129]
Decomposition Fluid
  Lab       0.259                 [0.172, 0.339]
  No Lab    0.042                 [−0.028, 0.108]
Decomposition Smell
  Lab       0.280                 [0.200, 0.357]
  No Lab    0.085                 [0.014, 0.152]
Death
  Lab       0.160                 [0.061, 0.256]
  No Lab    −0.002                [−0.095, 0.093]
These are the estimates and 95% credibility intervals plotted in the figure above.
Overall, results indicate that both classroom experience and lab participation increased students’ comfort with forensic activities, with stronger effects observed in the lab condition. For example, the probability of reporting high comfort (4 or higher on the 5-point scale) with insect collection increased by 0.30 [95% CrI: 0.22, 0.39] in the lab condition compared to 0.12 [0.04, 0.18] in the no-lab condition. Similar patterns emerged across all activities, with lab participation consistently showing larger comfort gains. The strongest effects appeared for activities involving insects and maggots, while effects on death-related comfort were more modest. Notably, some activities showed minimal changes in the no-lab condition - for example, decomposition fluid comfort increased by only 0.04 [-0.03, 0.11] without lab experience. For death-related comfort, the no-lab condition showed essentially no change (0.00 [-0.10, 0.09]), while lab participation produced a credible increase of 0.16 [0.06, 0.26].
6.8 Plot: Contrasts & probs, maggot only
The results above are simplified to compare predicted probabilities and probability contrasts (for P[y>=4]) for each outcome. It might help to step back and see where these come from by calculating and plotting predicted probabilities and contrasts for all five outcome response values. We will do this for one outcome - in this case, maggot, as before.
First, let’s extract and then check the pre and post posterior predictions from the maggot model for the “no lab” and “lab” conditions.
Show code
# For maggot model, first get predictions
nd <- tibble(
  survey01 = c(0, 1),
  lab01 = 0  # No-lab condition
)

# Get full posterior predictions
nolab_preds <- nd %>%
  add_epred_draws(models[[2]]) %>%
  ungroup()

# Calculate predicted probabilities
pred_probs <- nolab_preds %>%
  group_by(survey01, .category) %>%
  summarise(
    p = mean(.epred),
    ll = quantile(.epred, .025),
    ul = quantile(.epred, .975)
  )

pred_probs %>% gt()
survey01   .category   p       ll      ul
0 (Pre)    1           0.321   0.248   0.396
           2           0.235   0.200   0.270
           3           0.263   0.224   0.306
           4           0.124   0.090   0.160
           5           0.058   0.035   0.087
1 (Post)   1           0.290   0.221   0.364
           2           0.231   0.196   0.267
           3           0.274   0.235   0.314
           4           0.137   0.102   0.176
           5           0.068   0.042   0.102
Looks good. Now, let’s calculate and check the post-pre contrasts for the “no lab” condition from the maggot model.
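That contrast chunk is not reproduced in this excerpt; a sketch of how the category-specific post-pre contrasts could be computed from nolab_preds above:

# Post - pre contrast in predicted probability for each response category (no-lab)
nolab_contrasts <- nolab_preds %>%
  select(survey01, .category, .draw, .epred) %>%
  pivot_wider(names_from = survey01, values_from = .epred, names_prefix = "s") %>%
  mutate(diff = s1 - s0) %>%
  group_by(.category) %>%
  median_qi(diff, .width = .95)

nolab_contrasts %>% gt()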
What if we assume responses to each outcome represent indicators of a single underlying latent comfort with forensic experiences? If so, we could presumably “stack” the probabilities from all models together. Let’s see what happens.
First, you might be able to get an intuitive sense of what I am trying to do here if I first show you the estimates from all five outcome models individually (but overlapping on plots).
Note how each individual (overlapping) model interval estimate is a little different, but patterns are quite similar overall. One thing that becomes clear here is that uncertainty is reduced in the “post” survey among those with a lab component - irrespective of specific outcome, we are quite certain that less than 10% who experienced the class with a lab will report being “not comfortable” at the end of that class.
Now, let’s ignore the fact that these estimates are generated by models of different outcomes and use the full posterior distributions from all five models to generate a single interval probability estimate for each response category that effectively represents our best inference about the probabilities of underlying “latent” comfort levels.
In essence, for each response category, we are not entirely certain where the underlying latent probability is - and with multiple interval estimates (one from each of 5 models), there should be more uncertainty than one would find from any individual model…
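A hedged sketch of this stacking idea (my own illustration, not the exact chunk), pooling posterior draws across the five outcome models for one condition (here: lab, pre and post):

# Pool predicted category probabilities across all five outcome models
stacked_preds <- map_dfr(models, function(m) {
  tibble(survey01 = c(0, 1), lab01 = 1) %>%
    add_epred_draws(m) %>%
    ungroup()
}, .id = "outcome")

# Single interval estimate per response category, pooling draws across outcomes
stacked_summary <- stacked_preds %>%
  group_by(survey01, .category) %>%
  median_qi(.epred, .width = .95)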
In the plots above, I “stacked” the posterior predicted probabilities, which essentially treats the various outcomes as equal indicators of a latent “comfort” construct: each outcome is modeled individually, and then the predicted probabilities and contrasts are aggregated by condition. Since this approach ignores outcome-specific differences in comfort levels (e.g., fewer students are “not comfortable” with death than with other outcomes), it produces estimates with substantially greater uncertainty (i.e., wider credible intervals) compared to a comparable plot for a single outcome (e.g., the maggot plot above). Yet, even with this increased uncertainty, it is clear that the lab experience substantially reduces the predicted probability of reporting low latent comfort (“not at all” = 1; “somewhat” = 2) and substantially increases the probability of reporting high latent comfort (“very” = 4; “extremely” = 5).
Though this might be somewhat intuitive, it is not a traditional or principled approach to measuring or modeling latent constructs. We can extend the idea, though, by building a multilevel model in which the five outcome items are stacked in long format and treated as indicators of a latent comfort variable that is then predicted by survey01*lab01. Let’s try it.
6.9.3 Multilevel model of effects on latent comfort
First, I will fit a hierarchical model with varying slopes.
Show code
# Data prep - needs to be in long format for hierarchical model
dat_long <- dat_mod %>%
  pivot_longer(
    c(insect, maggot, fluid, smell, death),
    names_to = "outcome",
    values_to = "comfort"
  )

# Formula for hierarchical ordinal model
bf_comfort <- bf(comfort | thres(4) ~ survey01 * lab01 + (survey01 * lab01 | outcome))

# Priors (same thresholds/effects as before)
priors3 <- c(
  prior(normal(-0.84, 1), class = "Intercept", coef = "1"),
  prior(normal(-0.25, 1), class = "Intercept", coef = "2"),
  prior(normal(0.25, 1), class = "Intercept", coef = "3"),
  prior(normal(0.84, 1), class = "Intercept", coef = "4"),
  prior(normal(0, 1), class = "b"),
  # Add prior for random effect SD
  prior(exponential(1), class = "sd")
)

# Fit hierarchical model
hier_mod <- brm(
  bf_comfort,
  data = dat_long,
  family = cumulative("probit"),
  prior = priors3,
  cores = 4,
  chains = 4,
  iter = 4000,
  warmup = 2000,
  seed = 1138,
  control = list(adapt_delta = 0.95),  # Increased from default 0.8
  file = "Models/hier_comfort_fit"
)
There were a few divergent transitions, but that is not too surprising. We can revisit our priors and do some detailed checks later. For now, let’s see if we can extract predictions and plot.
Specifically, I will extract population-level predictions using re_formula = NA to get predictions that marginalize over the random effects of outcomes, effectively giving us estimates for the latent comfort construct. Then, I calculate post-pre contrasts and posterior probabilities for each condition.
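The extraction and plotting chunk itself is not reproduced in this excerpt; a sketch of how these population-level contrasts could be computed (assuming hier_mod from above and the tidybayes helpers already loaded):

# Population-level predictions, marginalizing over outcome-specific effects
nd <- expand_grid(survey01 = c(0, 1), lab01 = c(0, 1))

latent_preds <- nd %>%
  add_epred_draws(hier_mod, re_formula = NA) %>%
  ungroup()

# Post - pre contrasts in predicted category probabilities, by lab condition
latent_contrasts <- latent_preds %>%
  select(survey01, lab01, .category, .draw, .epred) %>%
  pivot_wider(names_from = survey01, values_from = .epred, names_prefix = "s") %>%
  mutate(diff = s1 - s0) %>%
  group_by(lab01, .category) %>%
  median_qi(diff, .width = .95)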
This plot is pretty similar to the one generated by the less conventional stacking approach, but it is generated using a more principled multilevel modeling approach. It efficiently shows the overall effects of classroom experiences on students’ latent comfort levels with forensics aspects.
Let’s pull the plotted estimates into tables to help with reporting in the text.