My latest Democratic Nomination Predictions

I added a PredictIt variable to the model: the PredictIt closing price for Clinton (read as her percent chance of winning) on the day before the election, converted into standard deviations (which makes it linearly correlated with the predicted percentage). It turns out to be a fairly minor factor, except for caucuses.
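
One plausible reading of "convert it into standard deviations" is a probit transform: treat the price as a probability and pass it through the inverse of the standard normal CDF. The sketch below assumes that reading; the function name `price_to_z` and the clamping bounds are hypothetical and not taken from the model itself.

```python
from statistics import NormalDist

def price_to_z(price_cents: float) -> float:
    """Map a market price in cents (roughly a percent chance) to a z-score."""
    p = price_cents / 100.0
    # Clamp away from 0 and 1, where the inverse CDF is undefined.
    # The 1% / 99% bounds are an assumption made for this sketch.
    p = min(max(p, 0.01), 0.99)
    return NormalDist().inv_cdf(p)

# A 50-cent price is a coin flip, so it maps to zero standard deviations.
print(price_to_z(50))
# Prices far from 50 map to larger z-scores, symmetric around zero.
print(price_to_z(80), price_to_z(20))
```

The transform stretches prices near 0 and 100, so a move from 90 to 95 cents counts for more than a move from 50 to 55.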

I also updated my poll numbers and 7-day search trend data.

Take this with a huge grain of salt. Hopefully my real-time county swing model will do better.

I haven't figured out how to calculate the standard error, as I'm using Weighted Least Squares (with the vote counts from my turnout model as the weights). Judging from past results (a very small sample), it seems to be around 3-5%.
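
For what it's worth, here is a minimal sketch of one standard way to get a residual standard deviation out of a weighted fit. It assumes a single predictor (the real model has many) and the usual WLS assumption that an observation with weight w has error variance sigma^2 / w. It ignores uncertainty in the coefficients, so it understates the true forecast error. The function name `wls_fit` and the toy numbers are hypothetical.

```python
import math

def wls_fit(x, y, w):
    """Weighted least squares for y = a + b*x.

    Returns (a, b, se), where se[i] is the residual standard deviation
    implied for observation i under Var(error_i) = sigma^2 / w[i].
    """
    n = len(x)
    if n <= 2:
        raise ValueError("need more than 2 observations for 2 parameters")
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    # Weighted residual sum of squares, divided by degrees of freedom.
    wrss = sum(wi * (yi - (a + b * xi)) ** 2 for wi, xi, yi in zip(w, x, y))
    sigma2 = wrss / (n - 2)
    se = [math.sqrt(sigma2 / wi) for wi in w]
    return a, b, se

# Toy numbers only, not real election data: larger weights get smaller errors.
a, b, se = wls_fit([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.3, 2.9], [100, 400, 900, 400])
print(a, b, se)
```

Under this assumption the error is larger for low-turnout states, so a single figure like 3-5% would only be an average across states.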

CT: Clinton 54.4 / Sanders 45.6 (Pollster avg is 51.3 / 48.7)
DE: Clinton 59.2 / Sanders 40.8 (Pollster has only one poll: 54.2 / 45.8)
MD: Clinton 70.1 / Sanders 29.9 (Pollster avg is 60.6 / 39.4)
PA: Clinton 54.6 / Sanders 45.4 (Pollster avg is 58.5 / 41.5; my forecast was 54.8 / 45.2 before a poll that came in after my original analysis)
RI: Sanders 51.8 / Clinton 48.2 (Pollster avg is 50.6 / 49.4)

As a reminder, this model includes FB likes, Google Search Trends, race, income, age, sex, population density, past election results, education, cyclists, and the most recently added PredictIt day-before variable.

So I'm somewhat close to the poll average, except for MD.

How the model did

My model beat the Pollster average!

CT: Clinton 54.4 / Actual 52.7 - Error -1.7
DE: Clinton 59.2 / Actual 60.5 - Error +1.3
MD: Clinton 70.1 / Actual 65.5 - Error -4.6
PA: Clinton 54.6 / Actual 56.1 - Error +1.5
RI: Clinton 48.2 / Actual 44.0 - Error -4.2
Avg error 2.66

Overall Sanders did slightly better than the model forecast.
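
The average error and the direction of the miss can both be reproduced from the five results listed above. In this sketch the error is taken as actual minus predicted Clinton share, which matches the signs in the list; the variable names are just for illustration.

```python
# Clinton's predicted and actual share in each state, from the list above.
predicted = {"CT": 54.4, "DE": 59.2, "MD": 70.1, "PA": 54.6, "RI": 48.2}
actual = {"CT": 52.7, "DE": 60.5, "MD": 65.5, "PA": 56.1, "RI": 44.0}

# Error = actual minus predicted, so negative means Clinton underperformed.
errors = {state: actual[state] - predicted[state] for state in predicted}

# Mean absolute error: the "avg error" figure.
mae = sum(abs(e) for e in errors.values()) / len(errors)

# Signed mean error: negative means the model was too favorable to Clinton.
bias = sum(errors.values()) / len(errors)

print(f"avg error {mae:.2f}, signed mean error {bias:+.2f}")
```

The signed mean comes out negative, which is the sense in which Sanders did better than forecast.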

Pollster Avg
CT: 51.3, Actual 52.7, Error +1.4
DE: 54.2, Actual 60.5, Error +6.3
MD: 60.6, Actual 65.5, Error +4.9
PA: 58.5, Actual 56.1, Error -2.4
RI: 49.4, Actual 44.0, Error -5.4
Avg error: 4.08

Tyler's model had an avg error of 1.72!

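I'm not quite sure how best to convert the avg error into a standard deviation. I suspect the standard deviation is larger for under-polled and/or small states like RI, CT, and DE. If you assume a constant standard deviation and normally distributed errors, then 50% of your results fall within 0.67 std deviations, so the median absolute error is about 0.67 std deviations. My 2.66 is a mean absolute error rather than a median, though, and the mean absolute error of a normal distribution is about 0.80 std deviations. So my std deviation could be approximately 2.66 / 0.80 = 3.3%, which fits the 3-5% I guessed from past results. Tyler's would be lower, roughly 1.72 / 0.80 = 2.2%.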
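
A minimal sketch of three ways to turn the five model errors listed earlier into a standard deviation. All three assume normally distributed errors with mean zero and the same standard deviation in every state. With only five data points, these are rough estimates at best.

```python
import math
from statistics import median

# Model errors (actual minus predicted Clinton share) for CT, DE, MD, PA, RI.
errors = [-1.7, 1.3, -4.6, 1.5, -4.2]
abs_errors = [abs(e) for e in errors]

# 1) From the mean absolute error: E|X| = sigma * sqrt(2 / pi) for a normal.
mae = sum(abs_errors) / len(abs_errors)
sigma_from_mae = mae * math.sqrt(math.pi / 2)

# 2) From the median absolute error: half of normal draws fall within
#    about 0.6745 sigma, so sigma is the median divided by 0.6745.
sigma_from_median = median(abs_errors) / 0.6745

# 3) Root mean squared error, which estimates sigma directly.
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"from mean abs error:   {sigma_from_mae:.2f}")
print(f"from median abs error: {sigma_from_median:.2f}")
print(f"root mean squared:     {rmse:.2f}")
```

The three estimates disagree by close to a point, which is itself a sign of how little five results can pin down.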

Tyler Pedigo's predictions

Tyler Pedigo's predictions are very close to mine

Benchmark Politics has
CT: Clinton 55 / Sanders 45
DE: Clinton 60 / Sanders 40
MD: Clinton 66 / Sanders 33
PA: Clinton 57 / Sanders 43
RI: Clinton 53 / Sanders 47

They have a county level model that is 1/4 polls and 3/4 demographics.

Benchmark Politics

They updated their predictions, notably in Rhode Island:

Connecticut: Clinton 55% - Sanders 45%
Delaware: Clinton 60% - Sanders 40%
Maryland: Clinton 66% - Sanders 34%
Pennsylvania: Clinton 57% - Sanders 43%
Rhode Island: Clinton 50.25% - Sanders 49.75%

Another set of predictions
Of Chaos and Order