Special Report

Iowa Election 2023

Political Polling: Always Criticized, Rarely Analyzed

Image courtesy of Nasjonal Digital Læringsarena (NDLA)

The Iowa caucuses were first in the nation for forty years for a reason. A plethora of candidates, on both local and national levels, have launched successful political careers out of the midwestern state. Barack Obama recalled the night he won the Iowa caucus as “a more powerful night than the night [he] was elected president.” Coming in with few endorsements and going up against power players such as Hillary Clinton, Obama came out on top by the end of the caucus. The political world may have been shocked, but the polling world wasn’t. Many polling companies predicted this result, some earlier and more accurately than others, a difference that can be credited to the range of polling methods they use.


Political polling has historically been criticized. Whether the goal is to discredit unfavorable numbers or to attack the media outlets reporting them, attempts to instill distrust in the American people toward the polls are commonplace. This worked well for Trump in 2016, as his mix of attacks on the media and the polls they reported increased cynicism from the public. In response to Hillary Clinton leading in national polls, he posted on X, formerly Twitter: “Any negative polls are fake news, just like the CNN, ABC, NBC polls in the election.” He quickly followed up, writing: “I call my own shots, largely based on an accumulation of data, and everyone knows it. Some FAKE NEWS media, in order to marginalize, lies!”


What Mr. Trump may not have realized at the time is that these media organizations don’t produce these polls themselves. Rather, a host of independent organizations conduct the polling and sell their results to news outlets. These companies are chosen for their track record of accuracy, which in turn depends on the methodology behind their polls.


So where do these numbers come from? And why, more often than not, are they correct? David Peterson, the Lucken Professor of Political Science at Iowa State University, knows it’s all in the numbers. “It’s a lot of stats. I can’t do justice to the degree of difficulty that goes into it, it is a very hard problem, and one that statisticians and survey researchers have been working on for a long time,” he says. His department helps produce the Civiqs Iowa polls, among the more trusted polls for Iowa’s caucuses. In the past, they’ve employed two separate polling methods, garnering them a respectable “B” grade from FiveThirtyEight.com.


In the realm of political polling, two predominant methodologies have emerged, each with its own set of advantages and pitfalls: the traditional random digit dialing (RDD) phone surveys, and the newer approach of online polling using pre-established respondent pools. RDD is “the old, gold-standard of political polling,” according to Professor Peterson, “the problem is, nobody answers the phones anymore.”


The classic phone-based method, championed by pollsters like J. Ann Selzer of Selzer & Co., involves sourcing phone numbers from the Secretary of State’s office and targeting registered voters. Individuals are randomly selected from this list and reached by phone.


A decline in the number of household landlines over the past ten years has undercut the efficacy of RDD. This has led to the rise of firms like Civiqs, YouGov, and VeriCite, which employ a different strategy. Instead of relying on random sampling, these firms build pools of willing respondents who agree to participate in surveys in exchange for rewards such as points, gift cards, or cash. This approach aims to overcome the challenge of low response rates by engaging individuals who have voluntarily opted into the survey process.


However, both methodologies introduce biases of their own. RDD inherently skews towards those who still answer their phones, potentially excluding certain demographics. On the other hand, respondent pool-based polling caters to a self-selected group of individuals who have willingly joined the survey ecosystem, creating a sample that may not be fully representative of the broader population.


A critical concept in the polling landscape is achieving a “representative sample.” Despite neither RDD nor respondent pool-based polling being truly random, the goal is to create a sample that mirrors the demographic composition of the target population. This is accomplished through a meticulous process of weighting the data. Statisticians adjust the values of each respondent based on factors such as age, gender, education, race, and district to align the sample with the known population demographics.


For instance, if the survey overrepresents individuals with a college degree, statisticians assign a lower weight to these respondents to balance the data. Conversely, underrepresented groups are given higher weights. This statistical fine-tuning ensures that the final results reflect the true characteristics of the population under scrutiny.
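The weighting idea described above can be sketched in a few lines of code. The numbers below are invented for illustration, not drawn from any actual poll: a small sample overrepresents college graduates relative to an assumed population share, so each respondent's weight is the ratio of the population share to the sample share for their group.

```python
from collections import Counter

# Assumed population shares (in practice, drawn from Census data).
population_share = {"college": 0.35, "no_college": 0.65}

# Hypothetical respondents: education level and whether they back Candidate A.
respondents = [
    {"education": "college", "supports_a": True},
    {"education": "college", "supports_a": True},
    {"education": "college", "supports_a": True},
    {"education": "college", "supports_a": True},
    {"education": "college", "supports_a": False},
    {"education": "college", "supports_a": False},
    {"education": "no_college", "supports_a": True},
    {"education": "no_college", "supports_a": False},
    {"education": "no_college", "supports_a": False},
    {"education": "no_college", "supports_a": False},
]

# Weight = population share / sample share for each group, so the
# overrepresented college group is weighted down and vice versa.
n = len(respondents)
counts = Counter(r["education"] for r in respondents)
weights = {g: population_share[g] / (counts[g] / n) for g in counts}

for r in respondents:
    r["weight"] = weights[r["education"]]

raw_support = sum(r["supports_a"] for r in respondents) / n
weighted_support = (
    sum(r["weight"] for r in respondents if r["supports_a"])
    / sum(r["weight"] for r in respondents)
)

print(f"raw: {raw_support:.1%}, weighted: {weighted_support:.1%}")
```

Here the raw sample shows 50% support, but because Candidate A's support is concentrated in the overrepresented college group, the weighted estimate drops to roughly 40%. Real pollsters weight on many variables at once (often via iterative raking rather than this single-variable ratio), but the principle is the same.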


The major problem lies in the data itself, though, or rather the lack thereof. Because the data describing the population comes from the US Census, some key factors can’t be taken into account. The census doesn’t ask, for example, for an individual’s religious affiliation. More importantly, it also doesn’t ask for one’s political affiliation. Statisticians and pollsters have been forced to work around this gap, turning to other factors that correlate with political association.


As technology evolves and communication patterns shift, the polling landscape will continue to adapt, seeking innovative ways to overcome inherent biases and provide a more accurate reflection of public opinion.


“It’s an impossible problem to solve, right?” Professor Peterson says, “Especially when you don’t know what the target population is.”