Journalists Reporting on Polls and Forecasts Should Be Required to Pass Statistics 101

August 23, 2012 at 1:54 pm  •  Posted in Predictive Analytics

Political science professors from the University of Colorado recently published a study describing a model they constructed to forecast the result of the upcoming presidential election.  They used various economic indicators, e.g., national and state unemployment rates, along with a handful of other variables to account for factors such as home state advantage and whether or not a candidate is an incumbent.  I read the paper and, while I’m no political scientist, I think it’s a reasonably good piece of research.  In the paper, the authors state that “the model successfully predicts every Electoral College victor”.  This is a fair statement.  However, in the Denver Post article the authors are quoted as saying “it correctly forecast every winner of the electoral since 1980”, and in the Huffington Post article the quote reads “has correctly forecast every winner of the electoral race since 1980”.  I realize that the differences are subtle, but the wording used by the Denver Post and the Huffington Post can lead to a complete misinterpretation.

In the research article the authors make it clear that the model was calibrated so that it successfully predicts the outcome of each election going back to 1980.  But the news articles from the Denver Post and the Huffington Post report the model’s accuracy as if it had been in use since 1980 and had never been wrong in previous efforts to forecast election results.  This is dangerously misleading.

If you’ve built a predictive model before, you know that, within reason, you can use a few tricks here and there to make the model work just the way you want it to work.  Now, I’m not saying that the authors from the University of Colorado wanted a model that would forecast a Romney victory.  No.  What I’m saying is that the authors wanted a model that accurately forecast the winner of every previous election in their study.  In other words, the historical accuracy of the model wasn’t an accident (“Hey, it turns out that our model, produced in a vacuum, has worked for 30 years!”); rather, the model was developed so that its forecasts of past elections would come out right.  The authors of the research never claim otherwise, but the journalists who reported the story fail to clarify how the study should be interpreted.
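To make the distinction between fitting history and forecasting concrete, here is a minimal simulation sketch.  It is not the Colorado model and says nothing about its quality; every name and number in it (`N_CANDIDATES`, `simulate`, the coin-flip “elections”) is my own assumption for illustration.  Each candidate “model” is pure noise, and we simply keep whichever one matches the most past outcomes, then check it against one outcome it has never seen.

```python
import random

N_PAST = 8          # e.g. eight past elections, 1980 through 2008
N_CANDIDATES = 300  # hypothetical number of model variants tried
N_TRIALS = 1000     # how many times the whole experiment is repeated


def simulate(seed=0):
    """Return (mean accuracy on past elections, accuracy on the unseen one)."""
    rng = random.Random(seed)
    past_mask = (1 << N_PAST) - 1
    past_hits = 0
    future_hits = 0
    for _ in range(N_TRIALS):
        # Bits 0..7 are the past outcomes, bit 8 is the upcoming election.
        # All of them are coin flips, so nothing can truly predict them.
        truth = rng.getrandbits(N_PAST + 1)
        best_matches, best_model = -1, 0
        for _ in range(N_CANDIDATES):
            # A candidate "model" is just another set of random calls.
            model = rng.getrandbits(N_PAST + 1)
            mismatches = bin((model ^ truth) & past_mask).count("1")
            matches = N_PAST - mismatches
            # Keep the candidate that best reproduces history.
            if matches > best_matches:
                best_matches, best_model = matches, model
        past_hits += best_matches
        # Score the selected model on the election it was not tuned on.
        future_hits += (((best_model ^ truth) >> N_PAST) & 1) == 0
    return past_hits / (N_TRIALS * N_PAST), future_hits / N_TRIALS


if __name__ == "__main__":
    past_acc, future_acc = simulate()
    print(f"accuracy on past elections:   {past_acc:.2f}")
    print(f"accuracy on unseen election:  {future_acc:.2f}")
```

In this toy setup the selected model reproduces almost all of the past outcomes, yet it does no better than a coin flip on the unseen one.  A real model built on economic indicators may well have genuine predictive power; the sketch only shows why “it gets every election since 1980 right” is weak evidence when the model was tuned on those same elections.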

So, what’s the point of this blog post?  Well, this type of reporting is potentially dangerous, just as early reporting of exit polling results can be dangerous.  This information may influence voters: a supporter of either candidate may think his candidate has the election in the bag and not bother voting, or may believe his candidate doesn’t stand a chance and stay home.  In any case, journalists who report on predictive models, and polling results for that matter, need to have a better understanding of statistics so that they are able to report these studies with greater precision and with appropriate context.  They do their readers a disservice when they report forecasts too loosely.

In addition, I think the researchers have a responsibility, especially in their capacity as public employees, to clarify how the model was constructed and what they mean, exactly, by stating that it has been correct going back 30 years.  They were clever to publish a study that would garner plenty of media attention, and I commend their effort to do salient applied work rather than something more theoretical and obscure.  That said, they should make certain that their research is properly understood by the public they are paid to serve.

 
