POL269 Political Research
2024-08-04
The dataset comes from a randomized experiment conducted in Tennessee, where students were randomly assigned to attend either a small class or a regular-size class from kindergarten until 3rd grade.
Given that we are analyzing experimental data, what do we need to compute to estimate the average causal effect? __________________
Although we could compute it directly, let’s compute it by fitting a linear model so that ______ is equivalent to it
Open RStudio
Open exercise_5.R from within RStudio
Run step 0
Update setwd() for your computer
  classtype reading math graduated small
1     small     578  610         1     1
2   regular     612  612         1     0
3   regular     583  606         1     0
4     small     661  648         1     1
5     small     614  636         1     1
6   regular     610  603         0     0
Fit a linear model so that the estimated slope coefficient is equivalent to the difference-in-means estimator. In this case, the fitted line is: \(\widehat{\textrm{\textit{graduated}}} = \widehat{\alpha} + \widehat{\beta} \,\textrm{\textit{small}}\)
Store the fitted model in an object called fit and then ask R to provide the contents of fit
R code to fit and store linear model?
##
## Call:
## lm(formula = graduated ~ small, data = star)
##
## Coefficients:
## (Intercept)        small
##    0.866473     0.007031
\(\widehat{\beta}\) = 0.007
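The equivalence between the slope coefficient and the difference-in-means estimator can be checked by hand. The sketch below is only an illustration: it rebuilds the six rows printed earlier into a small data frame (the names `star_head`, `fit_head`, `diff_means`, and `slope` are made up for this example) instead of loading the full dataset, so its numbers will not match the full-sample estimate of 0.007.

```r
# Toy data: only the six rows printed earlier in this exercise,
# NOT the full dataset (star_head is a hypothetical object name)
star_head <- data.frame(
  classtype = c("small", "regular", "regular", "small", "small", "regular"),
  reading   = c(578, 612, 583, 661, 614, 610),
  math      = c(610, 612, 606, 648, 636, 603),
  graduated = c(1, 1, 1, 1, 1, 0),
  small     = c(1, 0, 0, 1, 1, 0)
)

# Difference-in-means estimator:
# mean outcome in the treatment group minus mean outcome in the control group
diff_means <- mean(star_head$graduated[star_head$small == 1]) -
  mean(star_head$graduated[star_head$small == 0])

# Slope coefficient from the linear model graduated ~ small
fit_head <- lm(graduated ~ small, data = star_head)
slope <- unname(coef(fit_head)["small"])

# The two quantities agree up to floating-point error
all.equal(diff_means, slope)
```

Because small is a binary variable, the fitted line passes through the two group means, which is why the slope is the difference between them.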
Direction, size, and unit of measurement of the effect?
What’s the estimated average treatment effect? (Make sure to mention all the key elements: the assumption, why the assumption is reasonable, the treatment, the outcome, as well as the direction, size, and unit of measurement of the average treatment effect)
Answer: Assuming that students who attended a small class were comparable to students who attended a regular-size class (a reasonable assumption because the data come from a randomized experiment), we estimate that attending a small class increases the probability of graduating from high school by about 0.7 percentage points, on average.
Why?
That is, is the average treatment effect distinguishable from zero at the population level, statistically speaking?
Specify null and alternative hypotheses
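One common way to write the two hypotheses, using the \(\beta\) notation of the fitted line above. This is a sketch of the standard two-sided setup, which is the test that the p-value reported by summary() corresponds to:
\(H_0: \beta = 0\) (at the population level, attending a small class has no average causal effect on the probability of graduating)
\(H_1: \beta \neq 0\) (at the population level, attending a small class has a non-zero average causal effect on the probability of graduating)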
Run summary(), where we specify inside the parentheses the name of the object where we stored the output of the lm() function (here, fit)
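A minimal sketch of the call, for illustration only. So that it runs on its own, it refits the model on the six rows printed earlier rather than on the full dataset; `toy`, `fit_toy`, `fit_summary`, and `coef_table` are hypothetical names, and the numbers it produces will differ from the full-sample output.

```r
# Toy data: the six rows printed earlier, not the full dataset
toy <- data.frame(
  graduated = c(1, 1, 1, 1, 1, 0),
  small     = c(1, 0, 0, 1, 1, 0)
)

# Fit and store the linear model
fit_toy <- lm(graduated ~ small, data = toy)

# summary() takes the name of the object that holds the lm() output
fit_summary <- summary(fit_toy)

# The coefficient table holds the estimates, standard errors,
# t-statistics, and two-sided p-values
coef_table <- coef(fit_summary)
coef_table["small", ]
```

With the full dataset, the same pattern would be summary(fit), where fit is the object created in the earlier step.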
Call:
lm(formula = graduated ~ small, data = star)
Residuals:
    Min      1Q  Median      3Q     Max
-0.8735  0.1265  0.1265  0.1335  0.1335
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.866473   0.012834  67.514   <2e-16 ***
small       0.007031   0.018940   0.371    0.711
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3369 on 1272 degrees of freedom
Multiple R-squared: 0.0001083, Adjusted R-squared: -0.0006777
F-statistic: 0.1378 on 1 and 1272 DF, p-value: 0.7105
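The t value and p-value in the small row can be reproduced from the estimate and its standard error. The sketch below copies the three numbers it needs from the output above; the variable names are made up for this example, and because the inputs are rounded, the results should be close to, but not necessarily identical to, the reported 0.371 and 0.711.

```r
# Values copied from the summary() output above
estimate  <- 0.007031  # estimated coefficient on small
std_error <- 0.018940  # standard error of that estimate
df_resid  <- 1272      # residual degrees of freedom

# Test statistic under the null hypothesis that the true coefficient is zero
t_stat <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = df_resid)

t_stat   # should be close to the reported t value
p_value  # should be close to the reported Pr(>|t|)
```

A p-value this far above 0.05 is what leads us to fail to reject the null hypothesis at the 5% level.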
Do we reject or fail to reject the null hypothesis? We fail to reject the null hypothesis because …
Is the effect statistically significant at the 5% level?
Answer: No, the effect is not statistically significant at the 5% level. We do not have enough evidence to state that attending a small class is likely to have a non-zero average causal effect on the probability of graduating from high school, at the population level.
How strong is the internal validity of this study? Have the researchers accurately measured the average causal effect on the sample of students who were part of the study?
Answer: Yes, we can interpret the effect as causal. The internal validity of this study is strong because the treatment (attending a small class) was assigned at random. Random treatment assignment should have eliminated all confounding variables. Students who were assigned to attend a small class should be comparable to students who were assigned to attend a regular-size class.
How strong is the external validity of this study? To what population can the findings be generalized?
Answer: Given the characteristics of the study, only students from large schools in Tennessee were able to participate in the experiment. As a result, the sample of participating students was not perfectly representative of all students in Tennessee or of all students in the U.S. Consequently, we can conclude that, although we do get to observe the treatment of interest in the real world, the analysis has relatively weak external validity, especially if one wishes to generalize the study’s conclusions to all schools and students in Tennessee or in the entire United States. (See subsection 5.5.4 of the book.)