STATS ARTICLES 2008
Political polls: When up doesn’t always mean up
Rebecca Goldin, Ph.D., October 15, 2008
There is only one truly accurate presidential poll: the one voters take on election day
Polls abound in October of a presidential election year, giving some folk hope and others grief. In theory, the majority of people are given hope, as their candidate is the one shown to be ahead in the polls – in other words, those who favor the candidate down in the polls are in the minority. But in practice, polling is a complicated process, and can be rife with errors.
How else is it possible that Obama and McCain have wildly different levels of support, depending on whom you ask? According to RealClearPolitics, Rasmussen says that Obama is ahead, with 50 percent of the vote compared to just 45 percent for McCain, with 5 percent undecided or voting for others. In contrast, Gallup’s Traditional (Expanded) poll claims that Obama is ahead by ten percentage points, with 53 percent support for Obama and just 43 percent for McCain. The polls were taken in the same time period (10/11-10/13), so what’s behind the gap?
One might point to sampling error (also called “margin of sampling error” or simply “margin of error”), which is a kind of error that we can expect from any poll: it comes from the fact that we polled only some voters and not all of them. The more people who are polled, the smaller the sampling error. The correct interpretation of the margin of sampling error is that it gives the range of values that we can be 95 percent confident contains the true percentage for the whole population. As Rasmussen says of its polls,
“Daily tracking results are collected via telephone surveys of 1,000 likely voters per night and reported on a three-day rolling average basis. The margin of sampling error—for the full sample of 3,000 Likely Voters--is +/- 2 percentage points with a 95% level of confidence.”
This means that if the poll were conducted one hundred times (in the same circumstances, on the same day, etc.), each time with a possibly different set of 3,000 likely voters answering the phone over a three-day period, we would expect about 95 of those polls to land within two percentage points of the true level of support, and the other five or so to miss by more. If Rasmussen’s numbers are on target, that puts Obama’s support at 48-52 percent and McCain’s at 43-47 percent. Even in an ideal world, then, the poll might get it wrong, and no outcome is technically impossible three weeks from now.
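To make the arithmetic concrete, here is a rough sketch in Python. It is not Rasmussen’s actual method. It assumes a simple random sample, uses the textbook worst-case formula for a proportion (support near 50 percent), and invents a "true" level of support of 50 percent purely for illustration. The function names are hypothetical. With 3,000 respondents the formula gives about 1.8 percentage points, in line with the "+/- 2" that Rasmussen reports.

```python
import math
import random


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the 95% interval for a proportion.

    Assumes a simple random sample of n people; p=0.5 is the
    worst case (largest margin), z=1.96 corresponds to 95%.
    """
    return z * math.sqrt(p * (1 - p) / n)


def simulate_polls(true_share, n, trials, seed=0):
    """Simulate `trials` polls of n voters each.

    Every simulated voter supports the candidate with probability
    `true_share`. Returns the observed share in each poll.
    """
    rng = random.Random(seed)
    shares = []
    for _ in range(trials):
        supporters = sum(1 for _ in range(n) if rng.random() < true_share)
        shares.append(supporters / n)
    return shares


n = 3000            # Rasmussen's three-day sample size
true_share = 0.50   # hypothetical true support, for illustration only

moe = margin_of_error(n)
print(f"Margin of error for n={n}: +/- {100 * moe:.1f} points")

shares = simulate_polls(true_share, n, trials=500)
within = sum(1 for s in shares if abs(s - true_share) <= moe) / len(shares)
print(f"Share of simulated polls within the margin: {100 * within:.0f}%")
```

In a run like this, roughly 95 percent of the simulated polls should fall inside the margin and the rest outside it, which is the sense in which a well-run poll can still miss. Note that the simulation captures sampling error only and none of the other sources of error.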
But there are other kinds of error in a poll that can be implicit or explicit in the process of polling. One of the big issues is who is sampled. Gallup is currently listing two polls: a “traditional” poll of likely voters, which consists of those respondents who have previously voted, and an “expanded” likely voter pool, consisting of respondents who are likely to vote but have not necessarily voted in a prior election. One could argue that those who didn’t vote in the last election are unlikely to be voters this time around, but there have been record numbers of new voters registering, and substantial evidence that young people are enthusiastic about voting in this election. Perhaps their voices should count in the polls. (In both cases, Gallup reports an Obama lead, but the size of the lead differs significantly.)
Another kind of error implicit in the process comes from conducting the poll by landline telephone. Gallup claims that cell phone users who do not have a landline are included in its polls, but many polls, such as those by Rasmussen, do not include them. Cell phone users tend to be younger and more transient, and are more likely to be minorities. Other errors in the process include the bias toward those who answer their phones, those who are home in the evening, and those who like to respond to polls.
And who is a likely voter anyway? It wouldn’t be popular to support the notion that, independent of voter registration or previous voting behavior, some people are less likely to vote than others. One of the reasons could be long lines: people in districts with long voting lines tend to be poorer and more likely to be black or Hispanic, or in college. In 2004, there were stories of lines up to nine hours long.
Finally, there is a question of whether poll respondents are honestly reporting their voting intentions. Much attention has been given to the question of the Bradley effect: will likely voters merely say they plan to vote for Obama, but once at the polls actually cast their vote for McCain? While this behavior has been ascribed to a racial preference because of the most famous case it describes, the same gap between stated intention and actual vote could arise for a variety of reasons. For example, if there is evidence of a new security threat against the United States, a voter favoring Obama because he likes his message about the economy may turn to McCain because he feels safer with McCain when the country is under threat.
This last point gets at the heart of the excitement for the weeks to come. A variety of uncertainties could tip the vote in another direction; just as the economic crisis seems to have hurt McCain, a scandal or a different sort of crisis could end up hurting Obama in the next few weeks. While polls are saying that Obama is in the lead, it’s not over until, well, someone is singing.