Show code
library(tidyverse)
library(ggplot2)
library(patchwork)
library(psych)
# library(devtools)
# install_github("jtextor/dagitty/r")
library(dagitty)
library(CMAverse)
Jon Brauer
May 13, 2024
At times, we find ourselves wanting to publish a brief or “quick hitting” entry with a brief comment or coding example but do not have the time for a formal blog post. So, we decided to pilot a new type of entry. For lack of a better term, we’ll call this a “Reluctantly Rapid Response” (RRR for short). We are generally reluctant to post anything and doubly reluctant to post anything rapidly, so the name seems fitting to us.
This RRR entry provides code and output from a simple simulation motivated by Brier Gallihugh’s question on X about a mediation issue.
The following simulation, plots, and analysis provide one example situation in which changes in a mediator (ChgM) are positively correlated with changes in an outcome (ChgY) within two groups A & B, yet Group A membership is estimated as having a negative indirect effect on changes in the outcome (ChgY) via changes in the mediator (ChgM). In this example, the total negative indirect effect is generated by a small negative “pure” indirect (PNIE) effect combined with a small “mediated interaction” (INTmed) effect.
I am not claiming that this is what is happening in the original poster’s (BG’s) data. In fact, one apparent and important difference is that BG stated in a later tweet that “Group A shows higher rates of their mediator compared to Group B AND lower rates of the outcome compared to Group B as well.”
If “rates” refer to initial status at Time 1 (not changes), then perhaps we could generate comparable data. Yet, if “rates” refers to the change scores themselves, then it seems we may still have an unsolved problem on our hands. In the simulation below, changes in the mediator are positively correlated with changes in the outcome in both Groups A and B, and Group A has lower outcome change scores on average; however, specification of marginally lower mediator change scores on average for Group A contribute to the negative indirect effect estimates.
Please feel free to reach out on X or via email if you see errors or have other suggestions!
# A tibble: 1,000 × 3
GroupA ChgM ChgY
<int> <dbl> <dbl>
1 0 0.358 0.687
2 0 0.490 2.07
3 0 0.115 0.895
4 1 0.451 1.65
5 1 0.588 1.31
6 0 0.961 0.371
7 1 0.732 1.51
8 1 0.294 0.344
9 0 0.249 0.0797
10 1 2.42 2.25
# ℹ 990 more rows
GroupAplot < simdata %>% ggplot() +
geom_bar(aes(x=GroupA),
color="#1fa187", fill="#4ac16d", alpha=.7,
position="identity") +
theme_minimal() +
theme(axis.text.y=element_blank(),
axis.ticks.y=element_blank()
) +
labs(x="GroupA", y=NULL)
ChgMplot < simdata %>% ggplot() +
geom_histogram(aes(x=ChgM, y=..density..),
color="#1fa187", fill="#4ac16d", alpha=.7,
position="identity",
breaks = seq(4, 4, by = 1)) +
theme_minimal() +
theme(axis.text.y=element_blank(),
axis.ticks.y=element_blank()
) +
labs(x="ChgM", y=NULL)
ChgYplot < simdata %>% ggplot() +
geom_histogram(aes(x=ChgY, y=..density..),
color="#1fa187", fill="#4ac16d", alpha=.7,
position="identity") +
theme_minimal() +
theme(axis.text.y=element_blank(),
axis.ticks.y=element_blank()
) +
labs(x="ChgY", y=NULL)
GroupAplot + ChgMplot + ChgYplot
Positive correlation between ChgM and ChgY in full sample.
pairs.panels(simdata,
smooth = FALSE, # If TRUE, draws loess smooths
scale = FALSE, # If TRUE, scales the correlation text font
density = TRUE, # If TRUE, adds density plots and histograms
ellipses = FALSE, # If TRUE, draws ellipses
method = "pearson", # Correlation method (also "spearman" or "kendall")
pch = 21, # pch symbol
lm = TRUE, # If TRUE, plots linear fit rather than the LOESS (smoothed) fit
cor = TRUE, # If TRUE, reports correlations
jiggle = FALSE, # If TRUE, data points are jittered
factor = 2, # Jittering factor
hist.col = 3, # Histograms color
stars = TRUE, # If TRUE, adds significance level with stars
ci = TRUE) # If TRUE, adds confidence intervals
Positive correlation between ChgM and ChgY within Group A.
Positive correlation between ChgM and ChgY within Group B.
Notice inconsistent or heterogeneous effect of ChgM on ChgY across Groups.
Negative indirect effect of GroupA on ChgY via ChgM.
Mediation/Moderation Analysis
Call: mediate(y = ChgY ~ GroupA + (ChgM), data = simdata, n.iter = 10000)
The DV (Y) was ChgY . The IV (X) was GroupA . The mediating variable(s) = ChgM .
Total effect(c) of GroupA on ChgY = 0.62 S.E. = 0.08 t = 8.07 df= 998 with p = 2e15
Direct effect (c') of GroupA on ChgY removing ChgM = 0.53 S.E. = 0.06 t = 8.47 df= 997 with p = 8.5e17
Indirect effect (ab) of GroupA on ChgY through ChgM = 0.08
Mean bootstrapped indirect effect = 0.08 with standard error = 0.04 Lower CI = 0.17 Upper CI = 0
R = 0.61 R2 = 0.37 F = 289.17 on 2 and 997 DF pvalue: 5.13e135
To see the longer output, specify short = FALSE in the print statement or ask for the summary
Total negative indirect effect (PNIE + INTmed) estimate is larger after accounting for exposuremediator interaction.

  0%

====  5%

=======  10%

==========  15%

==============  20%

==================  25%

=====================  30%

========================  35%

============================  40%

================================  45%

===================================  50%

======================================  55%

==========================================  60%

==============================================  65%

=================================================  70%

====================================================  75%

========================================================  80%

============================================================  85%

===============================================================  90%

==================================================================  95%

====================================================================== 100%
Causal Mediation Analysis
# Outcome regression:
Call:
glm(formula = ChgY ~ GroupA + ChgM + GroupA * ChgM, family = gaussian(),
data = getCall(x$reg.output$yreg)$data, weights = getCall(x$reg.output$yreg)$weights)
Coefficients:
Estimate Std. Error t value Pr(>t)
(Intercept) 0.03308 0.04337 0.763 0.446
GroupA 0.51010 0.06229 8.190 7.97e16 ***
ChgM 0.53021 0.04329 12.248 < 2e16 ***
GroupA:ChgM 0.34764 0.06259 5.554 3.58e08 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 0.9610836)
Null deviance: 1559.35 on 999 degrees of freedom
Residual deviance: 957.24 on 996 degrees of freedom
AIC: 2804.2
Number of Fisher Scoring iterations: 2
# Mediator regressions:
Call:
glm(formula = ChgM ~ GroupA, family = gaussian(), data = getCall(x$reg.output$mreg[[1L]])$data,
weights = getCall(x$reg.output$mreg[[1L]])$weights)
Coefficients:
Estimate Std. Error t value Pr(>t)
(Intercept) 0.006556 0.043905 0.149 0.8813
GroupA 0.120841 0.062786 1.925 0.0546 .

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 0.9850333)
Null deviance: 986.71 on 999 degrees of freedom
Residual deviance: 983.06 on 998 degrees of freedom
AIC: 2826.8
Number of Fisher Scoring iterations: 2
# Effect decomposition on the mean difference scale via the regressionbased approach
Direct counterfactual imputation estimation with
bootstrap standard errors, percentile confidence intervals and pvalues
Estimate Std.error 95% CIL 95% CIU P.val
cde 0.510101 0.073753 0.651063 0.410 <2e16 ***
pnde 0.514441 0.074196 0.638106 0.410 <2e16 ***
tnde 0.556449 0.076406 0.692366 0.431 <2e16 ***
pnie 0.064071 0.034397 0.126845 0.011 <2e16 ***
tnie 0.106080 0.056820 0.194148 0.019 <2e16 ***
te 0.620520 0.089392 0.778236 0.465 <2e16 ***
intref 0.004339 0.015854 0.029367 0.025 0.8
intmed 0.042009 0.025189 0.095535 0.008 <2e16 ***
cde(prop) 0.822054 0.068527 0.730253 0.935 <2e16 ***
intref(prop) 0.006993 0.025298 0.042487 0.047 0.8
intmed(prop) 0.067700 0.035646 0.017009 0.143 <2e16 ***
pnie(prop) 0.103254 0.047652 0.021586 0.183 <2e16 ***
pm 0.170953 0.078667 0.038596 0.283 <2e16 ***
int 0.074693 0.031196 0.022878 0.120 <2e16 ***
pe 0.177946 0.068527 0.064943 0.270 <2e16 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(cde: controlled direct effect; pnde: pure natural direct effect; tnde: total natural direct effect; pnie: pure natural indirect effect; tnie: total natural indirect effect; te: total effect; intref: reference interaction; intmed: mediated interaction; cde(prop): proportion cde; intref(prop): proportion intref; intmed(prop): proportion intmed; pnie(prop): proportion pnie; pm: overall proportion mediated; int: overall proportion attributable to interaction; pe: overall proportion eliminated)
Relevant variable values:
$a
[1] 1
$astar
[1] 0
$mval
$mval[[1]]
[1] 0
@online{brauer2024,
author = {Brauer, Jon},
title = {RRR: {Mediation} {Makes} {My} {Head} {Hurt!}},
date = {20240513},
url = {https://reluctantcriminologists.com/blogposts/[4]/causationnocorr.html},
langid = {en}
}