There is (for obvious reasons) a lot of talk at the moment about facts, and democracy, and how this is going badly.
This is technically good for me: this is my field! But I also feel uneasy, because I’m starting to suspect some of those familiar arguments have big holes in them.
I wrote my MSc dissertation on the regulation of referendum campaigns (you can read a tidied-up version on methods of regulating content here). As part of the original dissertation I wrote a section on psychology research about the difficulty of correcting facts. A core concept here is the idea of a “backfire effect”: attempting to correct false information might in fact reinforce belief in that false information.
If this is true it should have policy implications. After all, if people are bad at dealing with basic factual information, some of our assumptions about people’s capacity to engage in democratic activity are overly optimistic, and new approaches might be justified. A few years later I rewrote that section into a Medium post here and to my delight it was well read! What a success!
The problem is that when I re-read that piece recently, I found myself more and more sceptical. This is for two reasons: I’ve read a convincing study that failed to find much evidence of a general ‘backfire effect’, and against the backdrop of the wider replication crisis in psychology I am worried about how much faith I placed in citing study after study showing “interesting” reactions people had in these experiments. The second problem is bigger, so let’s talk about that first.
The replication crisis is, in a nutshell, the failure of attempts to re-run previous high-profile psychology experiments and achieve the same results.
The strong suspicion is that the source of this problem isn’t a flaw in the original experiments themselves, but the structure of science around them. Publishing incentives lead researchers to publish positive studies that may have been flukes, while the more boring findings of ‘no effect’ stay stuck in the drawer. This is called publication bias, and its aggregate effect is that the field forms a very different idea of what is ‘true’ because all the negative information is missing. Like the iceberg beneath the surface, this absence is hard to detect until you try to replicate studies — and find you can’t.
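To make the mechanism concrete, here is a toy simulation (my own sketch, not taken from any of the studies discussed here): assume a small true effect, let lots of hypothetical labs run the same small experiment, and ‘publish’ only the results that come out statistically significant in the predicted direction.

```python
import random
import statistics

random.seed(0)

TRUE_EFFECT = 0.2   # assumed small true difference between groups (in SD units)
N_PER_GROUP = 30    # participants per condition in each study
N_STUDIES = 5000    # hypothetical labs quietly running the same experiment

def run_study():
    """Simulate one two-group study; return (effect estimate, significant?)."""
    treatment = [random.gauss(TRUE_EFFECT, 1) for _ in range(N_PER_GROUP)]
    control = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    diff = statistics.mean(treatment) - statistics.mean(control)
    se = (statistics.stdev(treatment) ** 2 / N_PER_GROUP
          + statistics.stdev(control) ** 2 / N_PER_GROUP) ** 0.5
    return diff, abs(diff / se) > 2  # |t| > 2 is roughly p < 0.05 here

results = [run_study() for _ in range(N_STUDIES)]
# The file drawer: only significant results in the predicted direction get written up.
published = [d for d, sig in results if sig and d > 0]

print(f"True effect:                 {TRUE_EFFECT}")
print(f"Mean estimate, all studies:  {statistics.mean(d for d, _ in results):.2f}")
print(f"Mean estimate, published:    {statistics.mean(published):.2f}")
print(f"Studies that got published:  {len(published) / N_STUDIES:.0%}")
```

In this toy world every individual paper is perfectly honest, yet anyone reviewing only the published studies would come away believing in an effect two or three times its real size.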
You can also just ask people whether in their work they use methods more likely to lead to ‘interesting’ results — the conclusion of one such survey of psychologists was that “questionable practices may constitute the prevailing research norm”. This is far short of outright fraud, but the effect of all these small distortions is to warp our understanding of what is really going on.
In a high-profile example, the theory of “ego depletion” (roughly, that willpower is a resource that is depleted through use) has fallen from long-standing mainstream theory to deeply questionable, based on failures to replicate. The concerning part is that this wasn’t a new idea, and had previously been found across hundreds of other experiments. Failure to replicate now suggests something was deeply wrong with the structure of science that produced the positive findings.
Publication bias and problems with replication aren’t unique to psychology — although there are reasons to believe psychology has more problems than other fields. A study of replication in economics was only able to replicate around half of the papers attempted, with the authors concluding that “economics research is usually not replicable”. One study found ‘p-hacking’ (methods to inflate the probability of producing statistically significant results) was widespread across a number of fields — although, encouragingly, it also found that meta-analysis (a way of combining multiple studies to determine the aggregate opinion of published research) seemed relatively unaffected by this.
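For a rough sense of why p-hacking works, here is another toy sketch of my own, with the simplifying assumption that each extra analysis gives an independent p-value when there is no real effect: the more outcomes, subgroups, or stopping points a researcher allows themselves to try, the more likely at least one of them crosses the significance threshold by chance.

```python
import random

random.seed(1)

ALPHA = 0.05            # conventional significance threshold
N_SIMULATIONS = 100_000

def found_something(n_analyses):
    """With no real effect, each test's p-value is uniform on [0, 1].
    'p-hacking' here means trying n_analyses tests (extra outcomes,
    subgroups, optional stopping) and reporting the best one."""
    return min(random.random() for _ in range(n_analyses)) < ALPHA

for n_analyses in (1, 5, 10, 20):
    rate = sum(found_something(n_analyses) for _ in range(N_SIMULATIONS)) / N_SIMULATIONS
    print(f"{n_analyses:>2} analyses tried -> {rate:.0%} chance of a 'significant' finding")
```

None of the individual steps feels like cheating, which is part of why survey respondents can report these practices as normal while the field’s false-positive rate quietly climbs.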
This is a recoverable problem, but a serious one. And for the moment we as consumers of social science need to be careful about how we import academic papers into arguments in the outside world (and be suspicious when we see the authority of psychology being used as important support for an argument).
Returning specifically to the backfire effect: last year Wood and Porter published a study replicating a specific instance of the backfire effect from a previous study — but not finding it in a number of additional instances that should have produced one. Their results instead indicated that “[b]y and large, citizens heed factual information, even when such information challenges their partisan and ideological commitments.” More detail:
Four of the misstatements came from Democrats; four came from Republicans. On no issue did we observe backfire. Regardless of their party and ideology, the average subject who saw a correction brought her views closer in line with the facts. For the second study, we identified issues about which politicians from both ends of the ideological spectrum had made misstatements that could be corrected by reference to neutral data. For example, political leaders from both parties have made erroneous claims about the abortion rate and immigration enforcement. Yet once again, regardless of party or ideology, the average subject exposed to corrective facts was made more in agreement with these facts, even when doing so required her to reject a co-partisan leader’s misstatements. […]
Subjects in this study were also exposed to a replication of the original Nyhan and Reifler news article about WMD in Iraq. When presented with the original survey item that Nyhan and Reifler used to measure backfire, subjects did indeed backfire, rejecting the facts presented to them. Yet subjects shown a more succinct survey item did not display backfire. In all other cases in Study 3, regardless of party or ideology, the average subject exposed to a correction expressed greater agreement with the facts than those to whom no correction had been vended.
This leads to a more nuanced picture of what is happening when you correct someone’s incorrect belief, and one that is more optimistic about the ability of citizens to correct prior beliefs.
In isolation I’d look at this paper failing to replicate the backfire effect with a ‘hmm, I’ve read several examples of it working in other papers’. In the context of a known problem with false positives, I feel I need to be more sceptical of the idea of backfire than one contrary paper alone would suggest.
My current inclination is that if A publishes X and B publishes not-X, the risk of publication bias should lead us to weight B higher, because it is more likely that some C also found not-X but hasn’t published. This might be unfair (B might be a genuine fluke) but from the outside we need heuristics to guide us through an environment where things claimed to be true are likely not to be.
I find it difficult at the moment to read any article rattling off psychology findings as part of an argument without “replication crisis” and “ego depletion isn’t real” running through my mind. Given we know there are factors in psychology pushing towards “interesting” results, findings that escape psychology into the wider consciousness need to be viewed carefully.
Once a study has been published, at each successive remove from the original research (which might be locked behind a paywall away from the general reader) ideas lose nuance and qualifications that may be important. While sensationalist journalists are often blamed for misleading news stories, the role of press releases in creating a more interesting version of the information is important. Sumner and colleagues found that 40% of press releases were more explicit than the published paper, and that news stories exaggerating beyond what the paper claimed were usually making claims that could be found in the press release. They also found that in the absence of exaggeration in the release, exaggeration by journalists was rare (10–18% of cases).
These problems together could result in a kind of money-laundering for ideas: many psychologists are interested in a concept because it is latent in the culture that year; they perform experiments to test for it; some achieve positive results and publish (others do not, and do not); and good press officers inject a salient take on the findings back into the general debate. Suddenly we’ve got a nice, clean, scientific version of an idea that before was just a collection of assertions. But nothing has really changed.
That’s a hypothetical pipeline, but it worries me because we have evidence that most stages of it happen.
Another common issue in psychology is that many experiments are based on studies of undergraduate students (who are conveniently available). This is fine to an extent — but problematic when you start to generalise beyond students. This meta-analysis shows that college undergraduates are not representative of the wider population, and that effects found in student samples can differ from the wider population in both direction and magnitude. Importantly, the differences aren’t systematic, so you can’t just apply a correction to results learned from undergraduates. Results need to be replicated with more general groups before we can really say we’ve learned anything.
So when an article opens its first paragraph gravely intoning about a problem revealed in democratic society by studying undergraduates — my guard is up. Maybe it’s all true! But given systematic problems in the field, why should I believe an argument built on research with even one red flag? Ego depletion might not be real, but goodwill depletion is.
The problems of psychology are of importance to non-psychologists because of the role the field has in our culture. Psychology matters because how we as a society think people “work” affects how we treat each other. If ego depletion is real, it should guide how we view approaches to health matters like obesity. If it is not, then policy acting as though it is will be ineffectual (or possibly even harmful). The work psychologists publish is recruited as expert evidence for problems in our culture. It’s not just that we have lying politicians (a social problem); it’s that we are psychologically ill-equipped to handle them (a difficult, perhaps unfixable, problem — if true).
To use myself as an example, my argument that there is a role for more active truth-policing in elections is supported by the idea that counter-information is less effective if lies are continuously told. If this is not true, then (while I think there are other good reasons) the case is weaker. Constructing political institutions requires an understanding of human nature — we need to be suspicious of the idea that our ‘understanding’ is more scientific than that of previous ages just because we have psychology papers to cite.
While science in the long run is self-correcting, this is only reassuring in the sense that our grandchildren are likely to be more correct than we are. Max Planck argued that “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather its opponents eventually die, and a new generation grows up that is familiar with it”. It turns out you can actually track this — there is a burst of innovation in a field after the death of a key figure.
Science is a human endeavour, with human problems. Sometimes scientific progress needs to wait for better microscopes to be developed, but when progress is obviously being slowed by entirely human forces we can make this long run shorter. Publication bias is an addressable problem, and one that a field with aspirations as high as psychology’s needs to get a handle on before its claims deserve default deference.
I’m left perfectly willing to be convinced of the reality of a backfire effect (it would make work I’ve done more relevant!) — but very cautious about proselytising about it. I think how people accept and process information is going to be quite complicated — and is an ideal area for scientific exploration. I just don’t have enough confidence in the field to judge how much we actually know here.
As the Abbot said in A Canticle for Leibowitz: “Men must fumble awhile with error to separate it from truth, I think — as long as they don’t seize the error hungrily because it has a pleasanter taste.” I think at the moment we are enjoying the taste of certainty a bit too much.