Thursday, March 06, 2008

Presidential Election 2008 - McCain v/s Obama - Rev 0

This is the first in a series of posts presenting results from a statistical simulation model that predicts the outcome of the 2008 presidential election. Although the Democratic party's nominee has not yet been chosen, I am using Senator Obama as a placeholder since he is the current leader in delegate count. The data for state-by-state voter preferences in a race between Senator McCain and Senator Obama are taken from the SurveyUSA (SUSA) poll of March 06, 2008.

As in my previous work during the 2004 election cycle (see earlier posts in this blog), I use a statistical simulation model to account for the uncertainty inherent in polls with small sample sizes (roughly 400-500 respondents in most cases). Because every poll carries a statistical margin of error, the reported voter preferences are only one of many plausible scenarios. Using Monte Carlo simulation, I can generate many such scenarios and tally the results to predict the range of likely outcomes as well as the probability of each outcome.

Here are the key features and assumptions of my model.

  1. On a state-by-state basis, undecided voters are allocated equally between McCain and Obama.
  2. For each simulation, the percentage of Obama votes in a state is drawn from a normal distribution with (a) mean equal to the poll result plus the undecided allocation and (b) standard deviation equal to the polling error. Because SUSA does not report a polling error, I assume it to be 4% based on data from the 2004 election.
  3. The winner of each state is allocated all of the state's electoral votes (with the exception of Nebraska, where the winner of the popular vote gets 3 electoral votes and the loser gets 2).
  4. The results are aggregated over 5000 simulations to calculate: (a) the average number of electoral votes for Obama, and (b) the probability of Obama winning at least 270 electoral votes, the number needed to win.
  5. Clarification - The mean of the distribution of Obama electoral votes is different from the "best guess" electoral vote total obtained by adding the state-by-state totals calculated from the mean estimate of voter preferences + undecided allocation. The "best guess" number is also the result reported by SUSA, and does not take into account the impact of sampling error in the polls.
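
For readers who want to see the mechanics, here is a minimal Python sketch of the steps listed above. It is not my actual model code: the function name `simulate_election`, the input format (a list of tuples), and the example poll numbers are made up for illustration. It treats the 4% polling error as a standard deviation in percentage points, counts a win as reaching 270 electoral votes, and leaves out the Nebraska split for brevity.

```python
import random

def simulate_election(states, n_sims=5000, sigma=4.0, votes_to_win=270, seed=0):
    """Monte Carlo sketch of the model described above.

    states: list of (obama_pct, mccain_pct, electoral_votes) tuples
            (hypothetical input format, percentages on a 0-100 scale).
    Returns (mean Obama electoral votes, fraction of simulations won).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    totals = []
    for _ in range(n_sims):
        obama_ev = 0
        for obama_pct, mccain_pct, votes in states:
            # Step 1: split undecided voters equally between the candidates.
            undecided = 100.0 - obama_pct - mccain_pct
            mean_share = obama_pct + undecided / 2.0
            # Step 2: draw Obama's share from a normal distribution.
            share = rng.gauss(mean_share, sigma)
            # Step 3: winner takes all (Nebraska split omitted in this sketch).
            if share > 50.0:
                obama_ev += votes
        totals.append(obama_ev)
    # Step 4: aggregate across simulations.
    mean_ev = sum(totals) / n_sims
    win_prob = sum(t >= votes_to_win for t in totals) / n_sims
    return mean_ev, win_prob

if __name__ == "__main__":
    # Three made-up states; the threshold is lowered to match the toy total of 30.
    toy_states = [(52.0, 40.0, 10), (46.0, 47.0, 15), (41.0, 50.0, 5)]
    print(simulate_election(toy_states, votes_to_win=16))
```

Collecting the per-simulation totals in a list, rather than just keeping a running sum, is what makes it possible to plot the full distribution of electoral votes and read off its mode.
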

RESULTS - If the election were held today, there is an 89% chance that Senator Obama would be elected. The mean (average value) of the distribution of electoral college votes for Obama is 302, and the mode (most likely value) of this distribution is 295.

Note that these are different from the SUSA "best guess" number of 280, primarily because the difference between voter preferences for Obama and McCain in such key states as Texas and Florida is smaller than the assumed sampling error of 4%. In other words, there is a non-negligible probability that these states could go for Obama rather than McCain, thus increasing the mean and most likely electoral vote count for Obama.
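
To put rough numbers on this effect, here is a small Python sketch. The state is hypothetical (30 electoral votes, Obama at 48% after the undecided allocation), and the helper name `win_probability` is made up for illustration. It assumes, as in the model, a normally distributed vote share with a standard deviation of 4 percentage points.

```python
import math

def win_probability(mean_share, sigma=4.0):
    """P(share > 50) when share ~ Normal(mean_share, sigma), in percentage points."""
    z = (50.0 - mean_share) / sigma
    # Upper-tail normal probability, written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical state: Obama trails at 48%, i.e. half a standard deviation below 50%.
p = win_probability(48.0)
expected_ev = p * 30  # average electoral votes this state contributes to Obama
print(round(p, 2), round(expected_ev, 1))  # about 0.31 and 9.3
```

A "best guess" tally would credit Obama with 0 electoral votes from such a state, whereas the simulation credits him with about 9 on average. Summed over every close state, this is how the mean of the simulated distribution can end up well above the "best guess" total.
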


Scroll down for a graph showing the projected distribution of Obama's electoral votes, with the blue bars denoting the outcomes corresponding to an Obama victory.



Clearly, this is very early in the election cycle, and much will change between now and November. The usual caveat applies here as well: predictions based on pre-election polls should be taken with a healthy dose of skepticism. I will keep updating these results as more state-by-state voter preference polls become available. Hopefully, the results will become more stable as election day approaches.

Until then, 10-4.
