Untangling Cause and Effect Without Experiments

The following is a piece I wrote for the LMH News, based on a general interest webinar that I gave in November of 2020. If this post inspires you to learn more about causal inference, you may enjoy browsing my teaching materials on treatment effects.

Will earning a PPE degree from Oxford increase your lifetime earnings? Does eating bacon sandwiches cause cancer? Does watching Fox News make you vote Republican? Will owning a dog increase your lifespan? Each of these questions concerns the causal effect of a treatment on an outcome. In social science, a “treatment” is any factor whose causal effect we hope to learn. As far as I know, there has never been an experiment that compelled people to study a particular subject at university, watch Fox News, or own a dog: nonetheless, papers have been written and published that use data to estimate the causal effects of each of these treatments. Datasets in which the treatment of interest is “naturally occurring” rather than randomly assigned as part of an experiment are called observational. Many of the most interesting and important treatments in social science cannot be randomly assigned. Social scientists have therefore developed a set of tools for studying treatment effects using observational data. By introducing you to some of these tools and briefly summarising the ways in which researchers have used them, I’ll shed some light on that age-old question: how much is your education worth?

Alice read PPE at Oxford and currently earns £75,000 a year. Would she have earned as much if she had studied at Oxford Brookes instead? The fundamental problem of causal inference is that we can never observe a person’s counterfactual outcome. In other words, we can never know what her outcome would have been if her treatment had been different. A counterfactual is fundamentally a “within-person” comparison, asking us to imagine two parallel universes, one in which Alice attends Oxford and another in which she attends Brookes. The causal question of interest is how much the Alice in our world earns compared to the Alice who resides through the looking glass. Of course, this comparison can never be more than a thought experiment. To learn about treatment effects in the real world, we develop methods and assumptions that allow us to substitute the idealized within-person comparison with a between-person comparison.

According to recent data from the Department for Education, UCAS and the ONS, the median salary of Oxford graduates is nearly £15,000 higher than that of Brookes graduates.1 Does this mean that the treatment effect of attending Oxford rather than Brookes is £15,000 a year? Almost certainly not! This is not an apples-to-apples comparison. One of the crucial differences between the two universities is entry requirements: Oxford requires A*AA for Economics and Management applicants, whereas Brookes asks for BCC for a similar degree. Oxford students on average have higher levels of academic preparation and ability upon entering university: accordingly, it’s possible that attending Oxford has no causal effect on wage, but earning high grades at A level does. In statistical parlance, we would say that ability confounds the relationship between university attended and wage.

So how can we solve the problem of confounding in observational datasets? One approach is matching, which compares treated and untreated people with the same values of any confounders. For example, we might compare Oxford Economics students with three A-stars at A level to Brookes Economics students with the same A-level results. Repeating this for every combination of subject and A-level results and averaging the differences gives an estimate of the overall causal effect of attending Oxford. A recent report from the IFS used a closely related approach to estimate the relative returns to different undergraduate degrees in the UK.2 Their findings suggest that confounding is a very serious problem when comparing raw wages of students across universities. For example, women who graduate from LSE earn over 70% more than the average female graduate. After adjusting for differences in student characteristics, however, this wage premium falls dramatically: female graduates of LSE earn only a little over 35% more than similar women who attended different universities. The same story applies to other elite UK institutions such as Oxford, Cambridge, and UCL.
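To make the matching idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the eight graduates, their grades and their salaries are invented, and the function names are hypothetical. It is not the IFS method, only the simplest version of "compare like with like, then average".

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical records: (A-level grades, attended_oxford, salary in £1000s).
# All numbers are invented purely for illustration.
graduates = [
    ("A*A*A*", True, 60), ("A*A*A*", True, 62), ("A*A*A*", True, 64),
    ("A*A*A*", False, 57),
    ("A*AA", True, 40),
    ("A*AA", False, 34), ("A*AA", False, 35), ("A*AA", False, 36),
]

def naive_difference(records):
    """Raw gap in mean salary between treated and untreated people."""
    treated = [salary for _, attended, salary in records if attended]
    untreated = [salary for _, attended, salary in records if not attended]
    return fmean(treated) - fmean(untreated)

def matched_difference(records):
    """Compare people within the same grade cell, then average the
    within-cell gaps, weighting each cell by the number of people in it."""
    cells = defaultdict(lambda: {True: [], False: []})
    for grades, attended, salary in records:
        cells[grades][attended].append(salary)

    weighted_sum, total = 0.0, 0
    for groups in cells.values():
        if not groups[True] or not groups[False]:
            continue  # no like-for-like comparison is possible in this cell
        n = len(groups[True]) + len(groups[False])
        weighted_sum += n * (fmean(groups[True]) - fmean(groups[False]))
        total += n
    if total == 0:
        raise ValueError("no cell contains both treated and untreated people")
    return weighted_sum / total

naive = naive_difference(graduates)      # gap ignoring grades
matched = matched_difference(graduates)  # gap among people with equal grades
print(naive, matched)
```

In this made-up dataset the raw gap is £16,000 while the matched gap is £5,000, because most of the high-grade students happen to be in the treated group. The matched figure is only credible if grades are the sole confounder, which is exactly the assumption discussed next.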

For matching methods to be effective, we need to observe all important confounders. In some settings this is a reasonable assumption, but in others it clearly isn’t. For this reason, researchers have developed a number of techniques to address the problem of unobserved confounding. Much of my own research focuses on the use of so-called “instrumental variables.” An instrumental variable, or instrument for short, is something that affects the treatment of interest but is unrelated to any unobserved confounders. To understand this idea, we’ll examine one of the most famous papers to use the instrumental variables approach: a 1991 article by Josh Angrist and Alan Krueger studying the impact of compulsory school attendance on later-life earnings.3 The paper begins with a striking observation: in the US, people born in the first quarter of the year tend to complete fewer years of education. Why might this be the case? According to Angrist and Krueger: “children born in different months of the year start school at different ages, while compulsory schooling laws generally require students to remain in school until their sixteenth or seventeenth birthday. In effect, the interaction of school-entry requirements and compulsory schooling laws compels students born in certain months to attend school longer than students born in other months.”

Angrist and Krueger use quarter of birth as an instrumental variable to estimate the causal effect of schooling on wage. Quarter of birth is indeed related to the treatment of interest, years of schooling. But there are many unobserved factors that influence both how many years of education a person attains and her later-life outcomes, such as demographics and family background. Is quarter of birth unrelated to these? Angrist and Krueger argue in the affirmative: “one’s birthday is unlikely to be correlated with personal attributes other than age at school entry.” If this is correct, then we can estimate the causal effect of education on wages as follows. First we calculate the difference in average wages between men born in the first quarter and those born in the rest of the year. Those born in the first quarter earn less on average, so this difference is negative. Next we calculate the corresponding difference in years of education for these two groups. Those born in the first quarter have fewer years of education on average, so this difference is also negative. Dividing the difference in wages by the difference in years of education tells us how much wages change per additional year of education, provided that quarter of birth affects wages only through schooling. Since both differences are negative, the ratio is positive. Angrist and Krueger find that an extra year of education causes an increase in wages of between 5% and 15%.
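The ratio of differences described above is often called the Wald estimator. Here is a minimal sketch of the arithmetic in Python. The eight men, their schooling and their wages are invented, and the function name is hypothetical. I use log wages so that the ratio can be read as an approximate proportional return to a year of schooling; that choice is an assumption of this sketch.

```python
from statistics import fmean

def wald_estimate(outcome, treatment, instrument):
    """Difference in mean outcome divided by difference in mean treatment,
    comparing people with instrument == True against everyone else."""
    y1 = [y for y, z in zip(outcome, instrument) if z]
    y0 = [y for y, z in zip(outcome, instrument) if not z]
    d1 = [d for d, z in zip(treatment, instrument) if z]
    d0 = [d for d, z in zip(treatment, instrument) if not z]

    wage_gap = fmean(y1) - fmean(y0)       # effect of the instrument on wages
    schooling_gap = fmean(d1) - fmean(d0)  # effect of the instrument on schooling
    if schooling_gap == 0:
        raise ValueError("the instrument does not shift the treatment")
    return wage_gap / schooling_gap

# Invented data: four men born in the first quarter, four born later.
born_q1 = [True, True, True, True, False, False, False, False]
years_of_schooling = [11, 12, 12, 13, 12, 12, 13, 13]
log_wage = [2.86, 2.96, 2.96, 3.06, 2.95, 2.95, 3.05, 3.05]

estimated_return = wald_estimate(log_wage, years_of_schooling, born_q1)
print(round(estimated_return, 3))
```

With these made-up numbers, men born in the first quarter have half a year less schooling and log wages that are 0.04 lower, so the ratio is roughly 0.08, or about 8% per year of schooling. The calculation is only as good as the assumption that quarter of birth affects wages through schooling alone.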

But is it really true that a person’s birthday is uncorrelated with “personal attributes other than age at school entry”? About seven years ago, Buckles and Hungerman revisited this question, examining US data that includes information on both birth dates and family background.4 In the years since Angrist and Krueger published their original paper, there have been more than 20 other published papers using season of birth as an instrumental variable. Across these studies, US children born in the first quarter, or more generally in the winter months, earn less, pursue less education, and have lower measured intelligence on average compared with those born in other parts of the year. At the same time, researchers have found a correlation between season of birth and schizophrenia, autism, dyslexia, extreme shyness, and even suicide risk.

What’s going on here? Buckles and Hungerman propose a simple explanation: “children born in different seasons are not initially similar but rather are conceived by different groups of women.” Mothers who give birth in the winter months are disproportionately likely to be teenagers. They are also less educated, and less likely to be married. Buckles and Hungerman conclude that: “The well-known relationship between season of birth and later outcomes is largely driven by differences in fertility patterns across socioeconomic groups, and not merely natural phenomena or schooling laws that intervene after conception.” In other words, quarter of birth is indeed related to confounders that were unobserved by Angrist and Krueger in their original paper.
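To see why this matters for the ratio-of-differences calculation, here is a rough sketch in Python of how the estimate moves when the instrument is related to a confounder. It assumes a simple linear world in which log wages depend on schooling and on a family background index. The function name and all of the numbers are hypothetical, and this is not Buckles and Hungerman's analysis.

```python
def wald_with_confounded_instrument(true_return, schooling_gap,
                                    background_gap, background_effect):
    """Value of the ratio-of-differences estimate in a linear model where
    the instrument also shifts a confounder that affects wages directly.

    true_return       -- causal effect of one year of schooling on log wages
    schooling_gap     -- mean schooling, first quarter minus rest of year
    background_gap    -- mean background index, first quarter minus rest
    background_effect -- effect of one unit of the index on log wages
    """
    if schooling_gap == 0:
        raise ValueError("the instrument does not shift schooling")
    # The wage gap now has two sources: schooling and family background.
    wage_gap = true_return * schooling_gap + background_effect * background_gap
    return wage_gap / schooling_gap

# If birthdays were unrelated to background, the ratio recovers the truth.
valid = wald_with_confounded_instrument(0.05, -0.1, 0.0, 0.1)

# If first-quarter children come from less advantaged families on average,
# part of their wage gap is wrongly attributed to schooling.
confounded = wald_with_confounded_instrument(0.05, -0.1, -0.05, 0.1)
print(round(valid, 3), round(confounded, 3))
```

In this invented example the true return is 5% per year, but a small background gap is enough to push the estimate to about 10%. Because the schooling gap between quarters of birth is small, even a modest link between the instrument and a confounder is divided by a small number and can distort the answer a great deal.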

So where does all of this leave us? Untangling cause and effect is extremely challenging, and always relies upon assumptions. Social scientists have a powerful toolbox for studying treatment effects in settings where randomized experimentation is impossible, impractical, or unethical. But like any tools, matching, instrumental variables, and related methods depend for their success on the care with which they are used. We can indeed learn about cause-and-effect from observational data, but doing so requires knowledge of the problem we’re studying, a willingness to question our assumptions, and some good old-fashioned intellectual humility.