Interviewing your laptop

There’s increasing interest in (or, at least, coverage of) the idea that surveying large language models could substitute for surveying people. This has obvious advantages: to repurpose a Heinlein quote, “it’s cheap, clean, convenient, and free of any possibility of wrongdoing–and you don’t have to go home in the cold”. The problem is that I can’t see how it could possibly work.

There is a science-fiction book, Interface, by Stephen Bury1, about how election campaigns are too polling-driven. A campaign consultant, Cy Ogle, breaks down the US voting population into 400 micro-demographics: a few of these are irrelevant mouth breather, 400-pound Tab drinker, stone-faced urban homeboy, burger-flipping history major, squirrelly winnebago jockey, bible-slinging porch monkey, economic roadkill, pent-up corporate lickspittle, high-metabolism world dominator, midamerican knickknack queen. Yes, they’re vile labels. He’s a vile person.

He recruits one member of each micro-demographic and, using sufficiently advanced technology2, hooks them up to a computer so he can read out the country’s opinion in real time. There are two sorts of problem with this. First, the people might not be as representative as you wanted. In the book, the ‘high-metabolism world dominator’ loses his wristwatch in a car crash and it’s picked up by someone completely different. Also, the ‘economic roadkill’ representative gets interested in politics, makes up this weird conspiracy theory about the candidate being controlled by a chip in his head, and decides to rescue the country from him3.

The second problem is weighting: how many high-metabolism world dominators are there in the current US voting population versus burger-flipping history majors? If you don’t know that – and if it can change – the micro-demographics aren’t enough.
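
To make the weighting point concrete, here’s a minimal sketch with invented numbers: the same two group-level answers give opposite population summaries depending on what you assume about how big each group is.

```python
# Toy numbers, invented for illustration: fixed group-level support,
# two different assumptions about group shares, opposite conclusions.
support = {"world dominator": 0.30, "history major": 0.70}  # P(vote for candidate)

def population_support(weights):
    """Weighted average of group-level support; weights sum to 1."""
    return sum(weights[group] * support[group] for group in support)

print(population_support({"world dominator": 0.8, "history major": 0.2}))  # ≈ 0.38
print(population_support({"world dominator": 0.2, "history major": 0.8}))  # ≈ 0.62
```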

Interviewing LLMs has both problems. You can ask Claude or ChatGPT to pretend to be a white grocery-store owner in Indiana, or a Mormon potato farmer in Idaho, or whatever, give it the news, and ask it who it would vote for. It will probably do this better than I could, but it’s not magic. If the political opinion experts don’t know4 how people would react to the news, it’s hard to be confident that AI will. In particular, it’s hard for demographics who are not Very Online, and these are exactly the people who are hard to survey. Getting good answers for all the viewpoints is the hard part of opinion polling.
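
Mechanically, the persona survey is just a role-playing prompt. A sketch of what it might look like with the Anthropic Python SDK, where the model alias, the persona, and the prompt wording are all placeholder assumptions rather than anything from a real study:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder persona and news text; a real exercise would vary both systematically.
response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=300,
    system="Answer as a Mormon potato farmer in Idaho would, in the first person.",
    messages=[{
        "role": "user",
        "content": "Given this news: <news summary>. Who would you vote for, and why?",
    }],
)
print(response.content[0].text)
```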

Even if Claude or ChatGPT is correct about the response of individual imaginary people to the news, that doesn’t solve the weighting problem. How do you assemble individual survey responses, even if they are reliable, into a population (or state or district) summary? This is the other hard part of opinion polling, using a combination of random sampling and auxiliary population information, and it’s where LLMs are not going to help. Even if you stipulate that an LLM could successfully pretend to be any sort of potential voter, there’s no way that it can know the distributions. That’s not in the training data, which are a large convenience sample from the internet, and not referenced to the US voting population in any way.
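
The estimator itself is the easy bit; here’s a minimal post-stratification sketch with invented numbers. The catch is the cell counts: they have to come from auxiliary data such as the census, outside both the sample and the model.

```python
# Post-stratification sketch; all numbers invented for illustration.
# Mean support within each demographic cell, from the (real or simulated) survey:
sample_means = {"cell_a": 0.55, "cell_b": 0.40, "cell_c": 0.70}
# Known cell sizes from auxiliary data (e.g. the census): the part an LLM can't supply.
population_counts = {"cell_a": 40_000, "cell_b": 35_000, "cell_c": 25_000}

total = sum(population_counts.values())
estimate = sum(population_counts[c] / total * sample_means[c] for c in sample_means)
print(f"post-stratified estimate: {estimate:.3f}")  # 0.535
```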


  1. a pen name of Neal Stephenson and J. Frederick George↩︎

  2. a wrist TV with pulse and skin-resistance monitor↩︎

  3. Yes, the candidate is actually being controlled by a chip in his head, but a belief can be true without being warranted↩︎

  4. ex hypothesi, if we’re doing surveys↩︎