Thursday, March 06, 2008

Presidential Election 2008 - McCain vs. Obama - Rev 0

This is the first in a series of posts that will present results from a statistical simulation model to predict the outcome of the 2008 presidential election. Although the Democratic party's nominee has not yet been chosen, I am using Senator Obama as a placeholder since he is the current leader in delegate count. The data for state-by-state voter preferences in a race between Senator McCain and Senator Obama are taken from the Survey USA poll of March 06, 2008.

As in my previous work during the 2004 election cycle (see earlier posts in this blog), I use a statistical simulation model to account for the uncertainty inherent in polls with small sample sizes (roughly 400-500 respondents in most cases). Because every poll carries a statistical margin of error, the reported voter preferences represent only one of many plausible scenarios. Using Monte Carlo simulation, I can generate many such scenarios and tally the results to predict the range of likely outcomes as well as the probability of each outcome.
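
To give a feel for how large that uncertainty is, here is a minimal sketch of the textbook 95% margin of error for a simple random sample. The function name and the choice of p = 0.5 (the worst case) are assumptions made for illustration; this is not SUSA's own methodology, and it is not how the polling error in the model was chosen.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample of size n.

    p is the assumed true proportion (0.5 is the worst case) and z is the
    normal quantile for a 95% confidence level.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# Sample sizes in the range mentioned above.
for n in (400, 500):
    print(n, round(100 * margin_of_error(n), 1))
```

For samples of 400 and 500 respondents this works out to roughly 4.9 and 4.4 percentage points, so a lead of a few points in a single state poll is well within the noise.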

Here are the key features / assumptions in my model.

  1. On a state-by-state basis, undecided voters are allocated equally between McCain and Obama.
  2. For each simulation, the likely percentage of Obama votes is calculated by assuming it follows a normal distribution with (a) mean based on the poll results plus undecided allocation and (b) standard deviation based on the polling error. Because SUSA does not provide this information, I am assuming a polling error of 4% based on data from the 2004 election.
  3. The winner of each state is allocated all of that state's electoral votes (with the exception of Nebraska, where the winner of the popular vote gets 3 electoral votes and the loser gets 2).
  4. The results are aggregated for 5000 simulations to calculate: (a) average number of electoral votes for Obama, and (b) probability of Obama winning more than 270 electoral votes.
  5. Clarification - The mean of the distribution of Obama electoral votes is different from the "best guess" electoral vote total obtained by adding the state-by-state totals calculated from the mean estimate of voter preferences + undecided allocation. The "best guess" number is also the result reported by SUSA, and does not take into account the impact of sampling error in the polls.
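
The steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the actual model code: the four "states" and their poll numbers are made up, all function and variable names are hypothetical, and the winning threshold is passed in as a parameter (it would be 270 in the full 50-state model).

```python
import random

# HYPOTHETICAL inputs for illustration only; these are NOT the SUSA numbers.
# Each entry: (electoral votes, Obama %, McCain %, split 3-2 as in step 3?)
POLLS = {
    "State A": (55, 51.0, 40.0, False),
    "State B": (34, 45.0, 46.0, False),
    "State C": (27, 47.0, 47.0, False),
    "State D": (5, 42.0, 52.0, True),
}
POLLING_ERROR = 4.0  # standard deviation in percentage points (step 2)
N_SIMS = 5000        # number of simulated elections (step 4)


def obama_mean_share(obama_pct, mccain_pct):
    """Step 1: split the undecided voters equally between the candidates."""
    undecided = 100.0 - obama_pct - mccain_pct
    return obama_pct + undecided / 2.0


def simulate_once(polls, rng):
    """Draw one scenario and return Obama's electoral votes in it."""
    total = 0
    for ev, obama_pct, mccain_pct, split in polls.values():
        # Step 2: sample Obama's share from a normal distribution.
        mean = obama_mean_share(obama_pct, mccain_pct)
        obama_wins = rng.gauss(mean, POLLING_ERROR) > 50.0
        # Step 3: winner takes all, except the 3-2 split state.
        if split:
            total += 3 if obama_wins else 2
        else:
            total += ev if obama_wins else 0
    return total


def run(polls, threshold, n_sims=N_SIMS, seed=2008):
    """Step 4: aggregate the simulations into a mean and a win probability."""
    rng = random.Random(seed)
    totals = [simulate_once(polls, rng) for _ in range(n_sims)]
    mean_ev = sum(totals) / n_sims
    p_win = sum(t > threshold for t in totals) / n_sims
    return mean_ev, p_win


# In this toy example, "winning" means more than half of the 121 votes.
mean_ev, p_win = run(POLLS, threshold=60.5)
print(mean_ev, p_win)
```

Because each state is resampled independently in every scenario, close states flip back and forth across simulations, which is why the mean of the simulated totals can differ from the single "best guess" total described in item 5.
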

RESULTS - If the election were held today, there would be an 89% chance of Senator Obama being elected. The mean (average value) of the distribution of electoral college votes for Obama is 302, and the mode (most likely value) of this distribution is 295.

Note that these are different from the SUSA "best guess" number of 280, primarily because the difference between voter preferences for Obama and McCain in such key states as Texas and Florida is smaller than the assumed sampling error of 4%. In other words, there is a non-negligible probability that these states could go for Obama rather than McCain, thus increasing the mean and most likely electoral vote count for Obama.
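
To make the "non-negligible probability" concrete, here is a small sketch of the chance that a single state goes to Obama under the normal-distribution assumption with a 4% polling error. The function name and the example shares are hypothetical and chosen only for illustration; they are not the actual SUSA figures for Texas or Florida.

```python
import math


def prob_obama_wins_state(obama_share, polling_error=4.0):
    """P(simulated share > 50) when share ~ Normal(obama_share, polling_error).

    obama_share is Obama's percentage after the undecided allocation.
    """
    z = (obama_share - 50.0) / polling_error
    # Standard normal CDF written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical adjusted shares; not actual poll numbers.
for share in (48.0, 49.0, 50.0):
    print(share, round(prob_obama_wins_state(share), 2))
```

Under these assumptions, a state where Obama's adjusted share is 48% still goes to him in roughly 31% of simulations, and at 49% in about 40%. A "best guess" tally gives him none of that state's electoral votes, while the simulation average credits him with a share of them.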


Scroll down for a graph showing the projected distribution of Obama's electoral votes, with the blue bars denoting the outcomes corresponding to an Obama victory.



Clearly, this is very early in the election cycle, and much will change between now and November. The usual caveat applies here as well: pre-election polls should be used for predictive purposes only with a healthy dose of skepticism. I will keep updating these results as more state-by-state voter preference polls become available. Hopefully, the results will become more stable as election day approaches.

Until then, 10-4.

Back from Hibernation

Hello y'all!

My last post was some 3 1/2 years ago, the day before the 2004 presidential election. Needless to say, I got burnt by my assumption that undecided voters would go for Kerry by a 2:1 margin. Before going to press, I had run a simulation where the undecided vote was split equally, which had Bush winning by a 279-261 margin. Hindsight being 20/20, I wish I had picked this scenario as my "most likely prediction". C'est la vie.

Here we are in 2008, with yet another presidential election upon us. As we wait for the Democratic party's nomination process to sort itself out, Survey USA has come out with a 50-state poll pitting McCain against Obama as well as McCain against Clinton. In the coming days and months, I will be updating my model for the 2008 elections, beginning with the SUSA polling data.

Until then, 10-4.