Barren proxies - Biased and Inefficient

In causal inference it is often the case that you can’t obtain a confounding variable directly, you can only measure something that it affects. Judea Pearl correctly points out the danger of conditioning on a ‘barren proxy’ for a confounder, in situations like this one:

A confounds the effect of B on C. D is affected by A but does not directly affect either B or C, so it is a ‘barren proxy’ for A.

It’s easy to see that conditioning on D will not, in general, remove confounding by A. The problem, as so often with causal graphs, is where to draw the boundaries. Examined sufficiently closely, almost every variable in statistics is a barren proxy.

Suppose A is average particulate air pollution dose for people in a city and D is measured particulate air pollution concentration. The standard measurement technique is to force air through a filter and trap the pollution particles, which are then weighed. The particles that end up on the filter cannot have any effect on health; measured air pollution is a barren proxy for exposure.

Suppose A is blood glucose concentration. A blood drop is removed and fed into a testing device. The glucose in that drop of blood doesn’t participate in future chemical reactions in the body; measured glucose is a barren proxy.

Suppose A is whether or not you have had a heart attack, and D is the conclusion from expert examination of your medical records. A subsequent examination of the medical records can’t possibly have an impact on whether your heart muscle cells actually died during the event; diagnosis of heart attack is a barren proxy.

As a simple matter of fact, essentially all measured variables in medicine are barren proxies. That’s not the important distinction, though. What actually matters is whether there are good causal reasons that the relationship between the true confounder and the measured variable is close, not just observationally but under intervention. That is, we should care whether D is a reliable measurement of A, not whether it is a barren proxy for A. Unfortunately, the criteria for being a reliable measurement are not simple and qualitative.