The era of causality and the decline of correlation
Attribution models and causal inference [2/5]
Does X cause Y? If X causes Y, how large is the effect of X on Y? Is this effect large relative to the effects of other causes of Y? These are the questions that empirical work in the social sciences tries to answer with the scientific and statistical methods of causal inference.
To understand the processes behind what appears to be “pulling a rabbit out of a hat,” it is necessary to lay clear methodological foundations for the broad world of causal inference.
An interest in causes and explanations permeates our lives. We wonder why the car won’t start, why corn grows better in one field than another, why a friend seemed particularly happy or gloomy yesterday. Scientists wonder why elementary particles have the mass they have, why there are so many noncoding regions in the human genome, or why dinosaurs went extinct.
Given the centrality of these interests, it is not surprising that there are many attempts to theorize about causation and explanation, both within and outside of philosophy. The philosophical concern with these issues goes back to Plato and Aristotle, as David-Hillel Ruben reminds us.
Claims about causality play a central role in the doctrines of virtually all philosophers, from Descartes and Locke to Hume and Kant. More recently, the development of Carl Hempel’s deductive-nomological (DN) model of explanation, and the elaboration of detailed alternatives to it by writers such as Wesley Salmon and Philip Kitcher, have made explanation a central theme in the philosophy of science.
Outside of philosophy, one finds fewer explicit theories of explanation, but there are extensive literatures in statistics, econometrics, cognitive psychology, and computer science on problems of causal inference and on how best to understand the notion of causality.
Despite this discussion, it is fair to say that there is less consensus on explanation and causation in philosophy today than there was three or four decades ago, when the DN model was widely accepted. In fact, as in other parts of philosophy, recent decades of work on causation and explanation have been characterized by a proliferation of separate schools with surprisingly little influence on one another.
Consider, for example, the role of counterfactuals in characterizing causality and explanation. Although the counterfactual analyses of causality developed by David Lewis and his students (Lewis, 1973, 1986c, 2000) have been influential in some areas of philosophy, such as metaphysics, they have had relatively little impact on the philosophy of science.
Furthermore, the Lewisian tradition has ignored related work in statistics and econometrics that is also based on ideas about the connection between causality and counterfactuals.
Many philosophers of science, in turn, have dismissed causal and explanatory treatments that rely on counterfactuals as unclear or unscientific, despite the existence of a mathematically sophisticated literature outside of philosophy that takes exactly this form.
Probabilistic theories of causality
Within the philosophy of science, we find writers working on what they call probabilistic theories of causality, writers who think causation involves the transmission of a physical quantity such as energy, and writers who propose to analyze the notion of causality in terms of laws of nature, again with relatively little interaction among them. This entire discussion has had a surprisingly small impact on philosophers who are not specialists in causation or explanation but who draw on ideas about these issues in their own work.
In statistics and econometrics we find a parallel distinction between “descriptive statistics”, which includes information about correlations, on the one hand, and information about causal and explanatory relationships, on the other.
Problems of “inductive” inference from correlations in samples to correlations in populations are considered very different from problems of causal inference. One assumption is that an adequate theory of causation and explanation should make such distinctions meaningful and should clarify how causal and explanatory information differs from mere description.
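The gap between description and causal information can be illustrated with a small simulation. The sketch below is not taken from any of the authors discussed; it assumes a hypothetical world in which a hidden variable Z drives both X and Y, while X has no effect on Y at all. All names, coefficients, and sample sizes are illustrative assumptions.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys on xs: a purely descriptive summary."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate(n, randomized, seed=0):
    """Hypothetical world: Z drives both X and Y; X has NO effect on Y."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)              # hidden common cause
        if randomized:
            x = rng.gauss(0, 1)          # X set by the experimenter, independent of Z
        else:
            x = z + rng.gauss(0, 1)      # X as observed "in the wild", driven by Z
        y = 2.0 * z + rng.gauss(0, 1)    # Y depends on Z only, never on X
        xs.append(x)
        ys.append(y)
    return xs, ys

observational_slope = slope(*simulate(20000, randomized=False))
experimental_slope = slope(*simulate(20000, randomized=True))
print(f"observational slope: {observational_slope:.2f}")
print(f"experimental slope:  {experimental_slope:.2f}")
```

In the observational data the slope is clearly positive even though changing X would do nothing to Y; when X is assigned independently of Z, the slope is close to zero. The descriptive summary is the same calculation in both cases; only the way the data were generated differs.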
Some writers argue that any explanation (or at least any explanation of why some outcome occurs) must be causal; others deny this, arguing that there are non-causal forms of why-explanation. Writers also differ about what counts as a “causal explanation”. Wesley Salmon (1984) adopts a notion on which causal explanation involves tracing spatiotemporally continuous causal processes and their intersections, and he argues that any genuine explanation must be causal in this sense.
According to Salmon, an account that traces the subsequent motion of two billiard balls back to their previous collision would count as a causal explanation, while a derivation of the equilibrium pressure of a gas from the ideal gas law and initial conditions would not count as explanatory, because it fails to trace individual causal processes.
Graham Nerlich (1979), by contrast, is in rough agreement with Salmon about what counts as a causal explanation, but maintains that there is an important form of non-causal explanation, which he calls geometric explanation. He offers as an example the explanation of the trajectories of free particles in a gravitational field by reference to the affine structure of space-time. Salmon, presumably, would deny that such appeals to the structure of space-time are explanatory.
Another distinction between causal and non-causal forms of explanation is due to Elliott Sober (1983). He contrasts explanations that trace the actual sequence of events leading to some outcome, which he regards as causal, with what he calls equilibrium explanations. In an equilibrium explanation, an outcome is explained by showing that a very large number of initial states of a system would evolve in such a way that the system ends up in the outcome state we wish to explain, with no attempt to trace the actual sequence of events that led to that outcome.
Thus an explanation that traces the actual sequence of molecular collisions leading to the current thermodynamic state of a gas, as characterized by macroscopic variables such as temperature and pressure, counts as a causal explanation, while a demonstration that almost all molecular configurations compatible with the gas’s initial temperature and pressure would evolve into its current macroscopic state counts as a non-causal equilibrium explanation.
What is needed is a broad notion of causal explanation, according to which any explanation that proceeds by showing how an outcome depends on other variables or factors counts as causal.
The distinctive feature of causal explanations, thus conceived, is that they provide information that is potentially relevant to manipulation and control: they tell us how, if we could change the value of one or more variables, we could change the value of other variables. According to this conception, both derivations involving the ideal gas law and Sober’s equilibrium explanations count as causal explanations.
This “manipulative” conception of causal explanation has the advantage of fitting a wide range of scientific contexts, especially in the behavioral and social sciences, where researchers take themselves to be discovering causal relationships and constructing causal explanations, but where narrower notions of causal explanation such as Salmon’s fit poorly.
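As a rough illustration of why a derivation from the ideal gas law counts as causal on this view, the sketch below treats the law as a rule that answers "what would happen to pressure if we set volume to a new value?". The quantities chosen (one mole, 300 K, 25 litres) are hypothetical and serve only as an example.

```python
R = 8.314  # molar gas constant, J/(mol*K)

def pressure(n_moles, temperature_k, volume_m3):
    """Ideal gas law solved for pressure: P = nRT / V, in pascals."""
    return n_moles * R * temperature_k / volume_m3

# Baseline state of a hypothetical gas sample.
baseline = pressure(n_moles=1.0, temperature_k=300.0, volume_m3=0.025)

# Manipulation: compress the gas to half its volume, holding n and T fixed.
after_compression = pressure(n_moles=1.0, temperature_k=300.0, volume_m3=0.0125)

print(f"baseline pressure:          {baseline:.0f} Pa")
print(f"pressure after compression: {after_compression:.0f} Pa")
```

The law says nothing about individual molecular collisions, yet it tells us how pressure would change under an intervention on volume, which is exactly the information the manipulative conception treats as causal.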
Formal and statistical principles
Drawing strong causal inferences from observational data is a central goal in the social sciences. Technical approaches based on statistical models (graphical models, nonparametric structural equation models, instrumental variables estimators, hierarchical Bayesian models, etc.) abound.
The statistician David A. Freedman long argued that these methods are unreliable, and he repeatedly showed that it is better to rely on subject-matter expertise, exploit natural variation to mitigate confounding, and rule out competing explanations.
This position provokes a good deal of skepticism: it is hard to believe that a probabilist and mathematical statistician would favor “low-tech” approaches. But the tide is turning. An increasing number of social scientists agree that statistical technique cannot replace good research design and subject-matter knowledge. This view is particularly common among those who understand the mathematics and have field experience.
The rabbit in the hat
Historically, “shoe-leather epidemiology” is epitomized by intensive, door-to-door fieldwork that wears out researchers’ shoes. In contrast, proponents of statistical models sometimes claim that their methods can salvage poor research design or low-quality data.
Some suggest that their algorithms are general-purpose inference engines: put in data, turn the crank, and out come quantitative causal relationships, with no knowledge of the subject required. This is tantamount to pulling a rabbit from a hat. Freedman’s conservation of rabbits principle says that:
“To pull a rabbit from a hat, a rabbit must first be placed in the hat.” In statistical modeling, assumptions put the rabbit in the hat.
Modeling assumptions are made primarily for mathematical convenience, not plausibility. The assumptions can be true or false, and are usually false. When the assumptions are true, the theorems about the methods hold. When the assumptions are false, the theorems do not apply.
In that case, how well do the methods behave? When the assumptions are “a little wrong,” are the results “a little wrong”? Can the assumptions be tested empirically? Do they violate common sense? Freedman asked and answered these questions, over and over. He demonstrated that scientific problems cannot be solved with “one size fits all” methods.
Rather, they require shoe leather: careful empirical work tailored to the topic and the research question, informed by both knowledge of the topic and statistical principles.
There are, however, no mechanical rules for this activity. Since Hume, that has been almost a truism. Instead, causal inference seems to require a huge investment of skill, intelligence, and hard work. Many converging lines of evidence must be developed. Natural variation needs to be identified and exploited, data must be collected, confounders must be considered, and alternative explanations must be thoroughly tested.
What is the correct question?
First of all, the correct question needs to be framed. Naturally, there is a desire to substitute intellectual capital for labor, which is why researchers try to base causal inference on statistical models. The technology is relatively easy to use and promises to open up a wide variety of questions to research. However, the models themselves require critical scrutiny.
Mathematical equations are used to adjust for confounding and other sources of bias. These equations may seem formidably precise, but they typically derive from many somewhat arbitrary choices.
What variables to introduce in the regression? What functional form to use? What assumptions to make about the parameters and the error terms? These choices are seldom dictated by data or prior scientific knowledge, which is why judgment is so critical, the opportunity for error so great, and the number of successful applications so limited.
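To make the point about specification choices concrete, here is a minimal sketch on simulated data. It assumes a hypothetical setting in which the true effect of x on y is 1.0 and a third variable c is caused by both x and y. The variable names, coefficients, and the closed-form two-regressor formula are illustrative choices, not a method described in the text.

```python
import random

def cross(a, b):
    """Sum of centered cross-products of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b))

def coef_x_alone(x, y):
    """OLS coefficient on x when y is regressed on x (with intercept)."""
    return cross(x, y) / cross(x, x)

def coef_x_with_control(x, c, y):
    """OLS coefficient on x when y is regressed on x and c (with intercept)."""
    sxx, scc, sxc = cross(x, x), cross(c, c), cross(x, c)
    sxy, scy = cross(x, y), cross(c, y)
    return (sxy * scc - sxc * scy) / (sxx * scc - sxc ** 2)

rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(20000)]
y = [xi + rng.gauss(0, 1) for xi in x]                     # true effect of x on y is 1.0
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]    # c is a consequence of x and y

spec_a = coef_x_alone(x, y)            # specification A: y ~ x
spec_b = coef_x_with_control(x, c, y)  # specification B: y ~ x + c
print(f"specification A (y ~ x):     {spec_a:.2f}")
print(f"specification B (y ~ x + c): {spec_b:.2f}")
```

Both specifications are fitted to the same data with the same estimator, yet one recovers the effect and the other makes it nearly vanish. Nothing in the data alone says that c should be left out; that judgment depends on knowing how c arises.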
From observation to experiments
Causal inference from randomized controlled experiments using the intention-to-treat principle is not controversial, as long as the inference is based on the probability model inherent in the randomization. But some scientists ignore the design and instead use regression to analyze data from randomized experiments.
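Inference grounded in the randomization itself can be sketched in a few lines. The example below uses hypothetical outcomes for eight units, grouped by assigned arm (intention to treat), and assumes complete randomization with four units per arm. It enumerates every assignment the design could have produced and asks how often a difference at least as large as the observed one would arise if treatment changed nobody's outcome.

```python
from itertools import combinations

# Hypothetical outcomes, grouped by ASSIGNED arm (intention to treat),
# regardless of whether each unit actually complied with its assignment.
assigned_treatment = [7, 9, 8, 10]
assigned_control = [5, 6, 4, 7]

def mean(values):
    return sum(values) / len(values)

def itt_difference(outcomes, treated_idx):
    """Difference in mean outcome between assigned-treatment and assigned-control units."""
    treated = [outcomes[i] for i in treated_idx]
    control = [outcomes[i] for i in range(len(outcomes)) if i not in treated_idx]
    return mean(treated) - mean(control)

outcomes = assigned_treatment + assigned_control
n_treated = len(assigned_treatment)
observed = itt_difference(outcomes, set(range(n_treated)))

# Under the sharp null (treatment changes no one's outcome), every way of
# assigning 4 of the 8 units to treatment was equally likely by design.
diffs = [itt_difference(outcomes, set(idx))
         for idx in combinations(range(len(outcomes)), n_treated)]
p_value = sum(abs(d) >= abs(observed) - 1e-12 for d in diffs) / len(diffs)

print(f"ITT estimate: {observed:.2f}")
print(f"two-sided randomization p-value: {p_value:.3f}")
```

The only probability model used here is the one the experimenter created by randomizing; no assumptions about error terms or functional form are needed. With larger experiments, exhaustive enumeration is usually replaced by sampling random re-assignments.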
Assessing how close an observational study comes to an experiment requires a lot of work and knowledge of the subject. Even without a real or natural experiment, a scientist with enough experience in the field can combine case studies and other observational data to rule out potential confounders and make strong inferences.
The number of robust causal inferences from observational data in epidemiology and the social sciences is limited by the difficulty of eliminating confounding.
Everything must be supported by tests
Only shoe leather and wisdom can distinguish good assumptions from bad ones or rule out confounders without deliberate intervention. These resources are scarce, so researchers who rely on observational data need both qualitative and quantitative evidence. They should also be mindful of statistical principles and alert to anomalies, which may suggest sharp research questions.
No single tool is best; you need to find the right mix of models, subject-matter experts, and rigorous data scientists.
The objective is to move beyond the era in which we only dared to talk about correlation, and to work hand in hand with our clients to give a causal explanation of the performance of their campaigns, creative pieces, media, and client segments.
In our next installment we will talk about the different types of models that we apply at Grupodot to reduce noise and arrive at genuine causal conclusions, guided by experts.