PanelMatch
R package for the PanelMatch DiD estimator (Imai, Kim, and Wang 2023)
PanelMatch is a package that implements the PanelMatch estimator of Imai, Kim, and Wang (2023), which addresses the bias of the two-way fixed effects (TWFE) estimator in staggered and non-absorbing difference-in-differences (DiD) designs. Documentation can be found here.
Install the package as follows:
install.packages('PanelMatch')
Sample code
Start by loading packages and the data:
library(PanelMatch)
library(readr) # for importing data
library(ggplot2) # for plotting
df = read_csv('df.csv')
PanelMatch requires us to pre-process the data with the PanelData() function:
Note: the id variable should be transformed to an integer variable before starting.
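If the unit identifier is stored as text (state names, firm codes, and so on), one way to produce an integer id is to map each distinct value to a number. The sketch below uses base R on a made-up data frame; the names toy and unit_name are illustrative and not part of PanelMatch.

```r
# Toy panel with a character unit identifier (hypothetical data)
toy = data.frame(
  unit_name = c("A", "A", "B", "B", "C", "C"),
  time      = c(1L, 2L, 1L, 2L, 1L, 2L)
)

# Map each distinct unit name to an integer code (A -> 1, B -> 2, C -> 3)
toy$id = as.integer(factor(toy$unit_name))

# Confirm the result is an integer column with one code per unit
stopifnot(is.integer(toy$id), length(unique(toy$id)) == 3)
```

Keeping the original unit_name column alongside id makes it easy to map results back to readable labels later.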
# PanelMatch dislikes tidyverse data frames (tibbles), so convert to a base data.frame:
df = df |> as.data.frame()

df.panel = PanelData(
  panel.data = df, # your data
  unit.id = "id", # your unit var (integer only)
  time.id = "time", # your time period var (integer only)
  treatment = "treat", # your treatment var
  outcome = "outcome" # your outcome var
)
Now we can run the matching process with PanelMatch(), which matches on treatment history over the lag pre-treatment periods:
lag is the number of periods before treatment to match on. lead gives the periods after treatment for which effects are estimated.
match = PanelMatch(
  lag = 3, # number of pre-periods to match treat history
  panel.data = df.panel, # PanelData generated data
  lead = c(0:6), # how many post-treat dynamic effects to estimate
  qoi = "att",
  refinement.method = "mahalanobis", # set to "none" if no covariates
  match.missing = TRUE,
  covs.formula = ~ covar, # (optional, can exclude)
  placebo.test = TRUE # (optional, but may cause issues)
)
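To make the lag and lead arguments concrete, the base R sketch below lists the calendar periods they refer to for a single treated unit. The treatment period treat.time = 10 is made up for illustration; lag = 3 and lead = 0:6 are the values used in the call above.

```r
treat.time = 10  # hypothetical period in which a unit is treated
lag = 3
lead = 0:6

# Pre-treatment periods whose treatment history is matched on: t-3, t-2, t-1
match.periods = treat.time - (lag:1)

# Periods for which effects are estimated: t+0 (the treatment period) through t+6
effect.periods = treat.time + lead

match.periods   # 7 8 9
effect.periods  # 10 11 12 13 14 15 16
```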
To aggregate all the matched comparisons into a single ATT, we use the PanelEstimate() function.
match |>
  PanelEstimate(
    panel.data = df.panel, # PanelData object
    pooled = TRUE, # tells R to calculate ATT
    moderator = NULL # optional. character string for var to calculate heterogeneous effects
  ) |>
  print()
#> Point estimates:
#> [1] 1.563779
#>
#> Standard errors:
#> [1] 1.189761
#>
#> Estimates produced with 5 observations (non-empty matched sets)
The point estimate is the estimated ATT, reported alongside its standard error.
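As a rough reading of this output, a normal-approximation 95% confidence interval can be computed by hand from the two printed numbers. This is only a sketch using the same 1.96 multiplier as the plotting code later in this README; it is not necessarily the interval PanelMatch itself would report.

```r
estimate = 1.563779  # point estimate printed above
se = 1.189761        # standard error printed above

# Normal-approximation 95% confidence interval
ci = c(lower = estimate - 1.96 * se, upper = estimate + 1.96 * se)
round(ci, 3)

# TRUE when the interval contains zero, i.e. the ATT is not
# distinguishable from zero at the 5% level under this approximation
contains.zero = unname(ci["lower"] < 0 & ci["upper"] > 0)
contains.zero
```

With an estimate of about 1.56 and a standard error of about 1.19, the interval spans zero, which is unsurprising given that only 5 matched sets contribute to the estimate.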
We can estimate event-study effects with the PanelEstimate() function for post-treatment effects, and the placebo_test() function for pre-treatment effects.
# estimate post-treatment effects
post = match |>
  PanelEstimate(
    panel.data = df.panel, # PanelData object
    pooled = FALSE # tells R to calculate dynamic effects
  )

# estimate pre-treatment effects
pre = match |>
  placebo_test(
    panel.data = df.panel, # PanelData object
    lag.in = 3, # should equal lag in PanelMatch()
    plot = FALSE
  )
The built-in plotting functions are limited, so we will build the plot manually with ggplot2.
# combine pre and post estimates
effects = c(pre$estimate, 0, post$estimate) # no effect is estimated at t = -1, so insert 0 there
se = c(pre$standard.error, 0, post$standard.error)

# create rel.time variable for pre/post periods
rel.time = c(-3:6)

# create results df
results.df = data.frame(rel.time, effects, se)

# lower and upper bounds of the 95% confidence interval
results.df$se.lwr = results.df$effects - 1.96*results.df$se
results.df$se.upr = results.df$effects + 1.96*results.df$se
results.df |>
  ggplot(aes(x = rel.time, y = effects)) +
  geom_point() +
  geom_linerange(aes(ymin = se.lwr, ymax = se.upr)) +
  geom_hline(yintercept = 0, color = "gray") +
  geom_vline(xintercept = -0.5, color = "gray") +
  labs(title = "Event-Study Estimates (PanelMatch)") +
  xlab("Time to initial treatment period") +
  ylab("Estimate") +
  theme_light()
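The results data frame only lines up if the combined estimates have exactly as many entries as rel.time: with rel.time running from -3 to 6 (10 values), one inserted zero, and seven post-treatment estimates, the pre-treatment step has to contribute two. A small sketch of that check, using stand-in numbers rather than real PanelMatch output (pre.est and post.est are hypothetical names):

```r
# Stand-in values (hypothetical, not real PanelMatch output)
pre.est  = c(0.10, -0.05)                        # two placebo estimates
post.est = c(0.8, 1.1, 1.4, 1.6, 1.9, 2.0, 2.2)  # one estimate per lead in 0:6

effects = c(pre.est, 0, post.est)  # 0 fills the t = -1 reference period
rel.time = -3:6

# data.frame() errors or silently recycles when lengths differ,
# so an explicit check gives a clearer failure
stopifnot(length(effects) == length(rel.time))

results.df = data.frame(rel.time, effects)
results.df[results.df$rel.time == -1, ]  # the inserted reference row
```

If lag or lead is changed in the PanelMatch() call, rel.time needs to be adjusted to match, and a check like this flags the mismatch before plotting.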