Beyers Louw

Postdoctoral Researcher, University of Groningen


On the Nuisance of Control Variables in Causal Regression Analysis [Open Access]

Control variables are included in regression analyses to estimate the causal effect of a treatment on an outcome. In this paper, we argue that the estimated effect sizes of the controls themselves are nevertheless unlikely to have a causal interpretation. This is because even valid controls may be endogenous and may represent a combination of several different causal mechanisms operating jointly on the outcome, which is hard to interpret theoretically. Therefore, we recommend refraining from interpreting marginal effects of controls and focusing on the main variables of interest, for which a plausible identification argument can be established. To prevent erroneous managerial or policy implications, coefficients of control variables should be clearly marked as not having a causal interpretation or omitted from regression tables altogether. Moreover, we advise against using control variable estimates for subsequent theory building and meta-analyses.

Citation: Hünermund, P. & Louw, B. (2023). On the Nuisance of Control Variables in Causal Regression Analysis. Organizational Research Methods.
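
The argument of this paper can be illustrated with a small simulation. The sketch below is not taken from the paper; the data-generating process, variable names, and coefficient values are assumptions chosen for illustration. A control z blocks all backdoor paths between the treatment x and the outcome y, so the coefficient on x recovers the true effect of 1.0. The control is itself driven by an unobserved variable u that also affects y, so its coefficient mixes the direct effect of z (0.5) with the effect of u and cannot be read causally.

```python
import numpy as np


def simulate_control_coefficients(n=200_000, seed=0):
    """Regress y on treatment x and control z; return both slope estimates.

    Assumed (hypothetical) data-generating process:
        u : unobserved, affects both z and y
        z : valid control for x (x depends only on z and noise)
        y = 1.0 * x + 0.5 * z + 2.0 * u + noise
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                      # unobserved common cause
    z = u + rng.normal(size=n)                  # control, endogenous via u
    x = z + rng.normal(size=n)                  # treatment, confounded only by z
    y = 1.0 * x + 0.5 * z + 2.0 * u + rng.normal(size=n)

    # Ordinary least squares with an intercept, treatment, and control.
    design = np.column_stack([np.ones(n), x, z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1], coef[2]


beta_x, beta_z = simulate_control_coefficients()
print(f"treatment coefficient: {beta_x:.2f} (true causal effect 1.0)")
print(f"control coefficient:   {beta_z:.2f} (true direct effect 0.5)")
```

Under these assumptions the treatment estimate lands close to 1.0, while the control estimate lands close to 1.5 rather than 0.5, because z also proxies for the omitted u.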

Double Machine Learning and Automated Confounder Selection: A Cautionary Tale [Open Access]

Double machine learning (DML) has become an increasingly popular tool for automated variable selection in high-dimensional settings. Even though the ability to deal with a large number of potential covariates can render selection-on-observables assumptions more plausible, there is at the same time a growing risk that endogenous variables are included, which would lead to the violation of conditional independence. This paper demonstrates that DML is very sensitive to the inclusion of only a few “bad controls” in the covariate space. The resulting bias varies with the nature of the theoretical causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way.

Citation: Hünermund, P., Louw, B., & Caspi, I. (2023). Double machine learning and automated confounder selection: A cautionary tale. Journal of Causal Inference, 11(1), 20220078.
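
The "bad control" problem described in this abstract can be sketched in a few lines. The example below does not implement DML; it uses plain linear regression as a simplified stand-in, and the data-generating process and names are assumptions made for illustration. The covariate c is a collider (caused by both the treatment x and the outcome y), so adding it to the regression opens a non-causal path and biases the treatment estimate.

```python
import numpy as np


def ols_slope_on_first(y, *regressors):
    """Return the OLS coefficient on the first regressor (intercept included)."""
    design = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


# Assumed (hypothetical) data-generating process with a true effect of 1.0.
rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)                  # treatment, unconfounded here
y = 1.0 * x + rng.normal(size=n)        # outcome
c = x + y + rng.normal(size=n)          # collider: caused by x and by y

effect_without_c = ols_slope_on_first(y, x)
effect_with_c = ols_slope_on_first(y, x, c)

print(f"estimate without the bad control: {effect_without_c:.2f}")
print(f"estimate with the bad control:    {effect_with_c:.2f}")
```

In this setup, the estimate without c is close to the true effect of 1.0, while conditioning on c drives the estimate toward zero. A data-driven selector that sees only correlations has no way to tell that c should be excluded, since c is strongly predictive of both x and y.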

Working Papers

On the Choice of Control Variables in Empirical Management Research [AOM Best Paper Abridged Version]

Control variables play a central role when empirical data are used to support causal claims in management research. The current literature generally agrees that control variables should be chosen based on theory and that these choices should be reported transparently. However, the literature provides little guidance on how specifically potential controls can be identified, how many control variables should be selected, and whether a potential control variable should be included. Causal diagrams provide a transparent framework for answering those questions. This article delineates how causal graphs can help researchers in strategy and management find the correct set of control variables, as well as possible solutions when causal identification is impeded by unobserved variables.

Citation: Hünermund, P., Louw, B., & Rönkkö, M. (2022). The Choice of Control Variables: How Causal Graphs Can Inform the Decision. In Academy of Management Proceedings (Vol. 2022, No. 1, p. 15534).

Is Knowledge Diffusion Really Slowing Down? Exploring Determinants of Highly Diffused Innovation

Researcher Influence in Algorithmic Choice

Group Decision-making in the Face of Grand Challenges