This is the first part in a two-part look at how FHQ's weighted average stacked up in examining the 2012 electoral college. We'll first take a global look at FHQ in the context of the other models out there. Part two will take a micro view of the FHQ model in relation to the electoral college results.
Now that we are nearly a week removed from the re-election of President Obama, FHQ thought it would circle back around and look at how we did in examining the state of play within the electoral college. The answer: not too bad. What was 49 out of 51 correct state-level projections based on our simple weighted average in 2008 morphed into a perfect 51 out of 51 score in 2012.
FHQ was not alone. Drew Linzer (Emory) at Votamatic, Simon Jackman (Stanford) blogging for the Huffington Post, and Sam Wang (Princeton) at the Princeton Election Consortium were all either right on or, in Wang's case, cautiously calling a tie in Florida. [And truth be told, Florida was a tie, but one that consistently -- around FHQ anyway -- ever so slightly favored the president. Again, we're talking about a decreasing fraction of a point as election day approached.] Oh, and Nate Silver of FiveThirtyEight fame pegged it at 332-206, too. This was a great thing for the so-called "quants".
Despite that, there are a couple of notes that are floating around out there and are worth mentioning.
1) FHQ won't take any victory laps because of this.1 Don't get me wrong. It is nice to be a dart and not, say, the board itself, but this actually has very little to do with what FHQ was doing under the hood -- or what any of the above folks were doing, for that matter. If we were all making sausage, then FHQ and the others were merely turning the crank on our various sausage-making apparatuses. The filling -- the polls -- was what really nailed the election projection on the state level.2 Drew first published his model in June. FHQ followed in July. The polls, even through our different lenses, told the story then. 332-206. Over the course of the summer and into the fall, that changed very little. For FHQ, Florida got as close as 0.04 points in favor of the president, but then took a turn back toward Obama. That was it. The Sunshine State was always the only state that ever truly threatened to jump what FHQ calls the partisan line into the Romney group of states. The polls were not only right on the money, but they were, overall, pretty consistent. Jim Campbell's argument/observation that the September polls are a better predictor of November election outcomes came to pass. What we got in October was just noise before state-level polling reverted or began reverting to those post-convention, pre-October numbers.
As FHQ asked throughout October, were we witnessing a movement toward Romney in the polls or the typical sort of narrowing (Campbell 2008) that tends to mark the late campaign polls? The latter may not have been the true answer, but it was closer than simply talking about Romney's momentum. Tom Holbrook's (1996) equilibrium theory of candidate support through the polls seems to have been the correct lens through which to view the dynamics of the race as election day drew nearer.
Score this one for the polls, then.
2) But where does that leave the models? After all, the sausage maker has some utility, too. Well, FHQ's natural inclination is to piggyback on the above point and state the obvious. The polls were right on and you didn't really need a statistical model -- complex or otherwise -- to accurately project the electoral college. In true self-deprecating fashion (Bear with me. I'll get there.) -- something FHQ is good at -- our little ol' weighted average was accurate enough to get all but two states right in 2008 and every last one in 2012. Again, it was the polls. In fact, if you removed the weighting and took the raw average of all the publicly available polls released on and before election day in all of 2012 you would come up with the same thing: 332-206. As I told Drew over the summer in a brief Twitter exchange, my hope was at that point just after the conventions that the race would tighten up so that we could, in fact, get a true measure of the utility of the more complex statistical models projecting the electoral college. As it stood then -- and how it ended up even with some narrowing -- there was a lot of overlap between the Bayesian models and the more pedestrian averages.
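To make the "same call either way" point concrete, here is a minimal sketch in Python of the comparison between a raw average and a weighted average. A few loud caveats: the recency weighting (an exponential decay with a 30-day half-life) is an assumption made for illustration and is not a description of how the FHQ average is actually weighted; the poll margins are invented placeholders, not real 2012 polls; and the names `Poll`, `raw_average`, `weighted_average`, `project` and `electoral_votes_for_leader` are hypothetical. Only the electoral vote counts for the three states are real.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    state: str
    days_before_election: int
    margin: float  # Obama minus Romney, in points


# Real 2012 electoral vote counts for three states.
ELECTORAL_VOTES = {"FL": 29, "OH": 18, "NC": 15}

# Placeholder margins, invented purely for illustration (NOT real polls).
POLLS = [
    Poll("FL", 60, 2.0), Poll("FL", 20, -1.0), Poll("FL", 3, 0.5),
    Poll("OH", 60, 4.0), Poll("OH", 20, 2.0), Poll("OH", 3, 3.0),
    Poll("NC", 60, -1.0), Poll("NC", 20, -4.0), Poll("NC", 3, -2.0),
]


def raw_average(polls):
    """Unweighted mean of the poll margins."""
    return sum(p.margin for p in polls) / len(polls)


def weighted_average(polls, half_life_days=30.0):
    """Recency-weighted mean: a poll's weight halves every half_life_days.

    This decay rule is an assumed stand-in, not the FHQ weighting.
    """
    weights = [0.5 ** (p.days_before_election / half_life_days) for p in polls]
    return sum(w * p.margin for w, p in zip(weights, polls)) / sum(weights)


def project(polls, averager):
    """Return {state: average margin} using the given averaging function."""
    by_state = {}
    for p in polls:
        by_state.setdefault(p.state, []).append(p)
    return {state: averager(ps) for state, ps in by_state.items()}


def electoral_votes_for_leader(margins):
    """Sum the electoral votes of states where the average margin is positive."""
    return sum(ELECTORAL_VOTES[s] for s, m in margins.items() if m > 0)


if __name__ == "__main__":
    raw = project(POLLS, raw_average)
    weighted = project(POLLS, weighted_average)
    # The margins differ between the two methods, but the state calls need not.
    print(raw, electoral_votes_for_leader(raw))
    print(weighted, electoral_votes_for_leader(weighted))
```

With these placeholder numbers the two methods disagree on the size of the Florida margin but not on which side of the partisan line it falls, which is the sense in which the weighting did not matter to the projection.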
Mind you, I'm not saying that there is no place for these models. Boy, is there. I'm with John Sides on this one: The more models we have, the better off we all are on this sort of thing. Rather, my point is to suggest that the simple averages are a decent baseline. As November 6 approached and the FHQ numbers did not budge in the face of changing information following the Denver debate, I began to think of the FHQ weighted average like the Gary Jacobson measure of congressional candidate quality. Now sure, there have been herculean efforts littering the political science literature to construct multi-point indices of candidate quality, but they don't often perform all that much better than Jacobson's simple test: "Has challenger/candidate X held elective office?" That simple, binary variable explains most of the variation in the levels of success that various candidates -- whether challenging an incumbent or vying for an open seat -- have enjoyed across a great number of elections. The multi-point indices only slightly improve the explanatory power.
Now, lord knows, I'm not trying to draw definitive comparisons between the work here at FHQ and Jacobson's oft-cited body of work. Are there parallels? Yes, and I'll leave it at that. Sometimes the best models are the simplest ones. Parsimony counts and to some extent that is what FHQ provides with these electoral college analyses. And again, the reason I was hoping that the polls would tighten as we got closer to election day was to demonstrate just exactly how much better the more complex models were. My expectation was that there would be a noticeable difference between the two. But there wasn't; not in terms of projecting which states would go to which candidates. By other measures, the more complex models wiped the floor with FHQ (as, admittedly, they should have).3
The tie that binds all of these models -- if you really want to call the pre-algebra that FHQ does a model -- is a reliance on polling. And that raises a different question as we shift from reviewing 2012 to looking at 2016 and beyond. The quants "won" this one. But it was not without a wide-ranging -- and fruitful, I think -- discussion about the accuracy of polling. The one question that will continue to be worth asking is whether the seemingly perpetual decline in response rates to public opinion polls will continue and what impact that will have. If it does, then there would almost certainly have to be a tipping point where phone-based polls begin to more consistently miss the mark. The good news moving forward is that the online polls -- whether YouGov, Google Consumer Surveys or Angus Reid -- performed quite well in 2012, offering a ray of hope for something beyond phone polls in a time when cell phones are hard to reach and landlines are disappearing.
Still, we are now at a point where pollsters are talking about the "art of polling" as a means of differentiating themselves from other pollsters instead of the overarching science of polling. That has implications. If all pollsters guess wrong about the underlying demographics of the electorate, all the polls are wrong. Of course, the incentive structure is such that pollsters want to find something of a niche that not only separates them from the competition to some extent but helps them crack the code of the true demographic breakdown of the electorate. [Then they can all herd at the end.]
The bottom line remains: these projections are only as good as the polling that serves as the sausage filling. If garbage goes in, then garbage is more likely to come out. On the other hand, if the polling is accurate, then so too are the projections.
1 I won't take any victory laps, but I will extend to all of those who have been both loyal and happenstance readers alike a very sincere thank you for spending some or all of election season with us. And yeah, that stretches back to late 2010. Thank you.
2 This is something Harry Enten of the Guardian mentioned via Twitter on Saturday and AAPOR more or less confirmed today.
3 One factor that should be noted here that may separate FHQ from the more involved models is polling variability. 2008 was witness to a great deal of polling variability. The margins in that open seat presidential election jumped around quite a bit more than in 2012 when an incumbent was involved. 2016, in some respects, is shaping up as a repeat of 2008. That is even more true if both Hillary Clinton and Joe Biden pass on runs for the Democratic nomination. Both races would be -- at least from our vantage point here three years out -- wide open, and that would influence the polling that is conducted across firms and across states. Yet, even with that unique situation in 2008, FHQ lagged just one correctly predicted state -- North Carolina -- behind FiveThirtyEight.