Introduction:
The goal of this modeling project was to predict the outcome of the 2024 election using public health, demographic, and historical data. The approach is unusual in that it relies on predictors that serve as proxies for public support for the Democratic Party within a population. In the U.S., voters face an effectively binary choice, Democrat or Republican, and the election is decided by electoral votes from each state. The response variable is therefore simply the margin of victory within each state. Because of the Electoral College, predicting the election is essentially a matter of predicting a handful of states: most states have a reliable history of wide margins for one party or the other, while a few do not. The data and model are only as good as their predictions for those few states. Because recent national elections provide only a small sample and the model leans heavily on recent data points, it cannot produce highly precise predictions for states with slim margins of victory. The success of this model therefore hinges on its ability to detect which swing states might have more support for Democrats (or Republicans) than the polls currently show.
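To make the link between state margins and the national outcome concrete, here is a minimal sketch of how predicted margins could be turned into an electoral vote tally. It is not the model's actual code. The function name, the sign convention (positive margin means a Democratic win), the winner-take-all rule for every state, and the placeholder states and numbers are all assumptions made for illustration.

```python
def tally_electoral_votes(predicted_margin, electoral_votes):
    """Sum electoral votes by party from predicted state margins.

    Assumptions (not from the original analysis):
    - margin = Democratic share minus Republican share, in percentage points
    - every state awards all of its electoral votes to the state winner
    """
    totals = {"D": 0, "R": 0}
    for state, margin in predicted_margin.items():
        if margin == 0:
            raise ValueError(f"Predicted margin for {state} is exactly zero")
        winner = "D" if margin > 0 else "R"
        totals[winner] += electoral_votes[state]
    return totals


# Placeholder inputs only: these are not real states, votes, or predictions.
predicted_margin = {"State A": 12.0, "State B": -8.5, "State C": 0.7}
electoral_votes = {"State A": 10, "State B": 6, "State C": 4}

totals = tally_electoral_votes(predicted_margin, electoral_votes)
print(totals)  # {'D': 14, 'R': 6}
```

Note how "State C" decides four electoral votes on a margin of less than one point. That is the situation described above: a small error in a slim-margin state flips its entire allocation.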
Background and Assumptions:
Over the last two presidential election cycles, we have seen public polling fail in major ways. In 2016, almost every major polling and media outlet failed to detect the degree of support for Trump among Democrats and Independents that led to his victory in key swing states and the Rust Belt. In 2020, polling outlets again underestimated Trump's support in key states. Since then, trust in the media's ability to investigate and get at the truth has further eroded.
This analysis seeks predictors that reflect the public's political preferences more accurately and are not subject to the polling industry's biases. Because the Covid-19 pandemic was hyper-polarized and support for the Covid-19 shot fell along clear partisan lines, public uptake of each year's "new" version of the shot is highly correlated with support for the Democratic Party. Because there is a new Covid-19 shot every year, continued uptake is assumed to indicate Democratic vote allegiance. Other indicators, such as domestic migration rate and mail-in ballot requests, have been strongly correlated with Democratic support over the last four years. In addition, population data from public health sources, including mortality rate, birth rate, and mental health, have been used as control or predictor variables. Some demographic and population dynamics are associated with Republican-leaning states and others with Democratic-leaning states, and these relationships have held over recent history. Other measures, like net migration rate, also show strong associations, but these are more recent and were affected by the Covid-19 pandemic, during which many locked-down blue states saw a net loss and open red states saw a net gain. The popularity of the now-annual Covid-19 shot is waning year over year, so the data has been adjusted to measure relative popularity: states with higher-than-average uptake are taken to reflect higher Democratic Party support.
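As a rough illustration of the relative-popularity adjustment, the sketch below centers each state's uptake on that year's average, so a state can be compared across years even as overall uptake declines. The exact adjustment used in this analysis is not specified here, so the choice of a simple difference from the unweighted state mean is an assumption, as are the function name and the placeholder figures.

```python
def relative_uptake(uptake_by_year):
    """Center each state's uptake rate on the same-year average.

    Assumption (not from the original analysis): the adjustment is
    state rate minus the unweighted mean across states for that year.
    Positive values mean higher-than-average uptake.
    """
    adjusted = {}
    for year, rates in uptake_by_year.items():
        mean_rate = sum(rates.values()) / len(rates)
        adjusted[year] = {state: rate - mean_rate for state, rate in rates.items()}
    return adjusted


# Placeholder uptake rates in percent: not real data.
uptake_by_year = {
    2022: {"State A": 20.0, "State B": 10.0, "State C": 15.0},
    2023: {"State A": 12.0, "State B": 4.0, "State C": 8.0},
}

adjusted = relative_uptake(uptake_by_year)
print(adjusted[2022])  # {'State A': 5.0, 'State B': -5.0, 'State C': 0.0}
print(adjusted[2023])  # {'State A': 4.0, 'State B': -4.0, 'State C': 0.0}
```

In this toy example, raw uptake in "State A" drops from 20 to 12, yet the state stays well above average in both years. That stable ranking is what the adjusted measure is meant to capture.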
Overall, this analysis seeks to combine longer-term and more recent trends to estimate the current level of support for the Democratic Party. Because the model must be trained on data that only become available in the months (Covid-19 shot uptake) and weeks (absentee ballot requests) leading up to the election, it cannot detect any 11th-hour shifts.
As George Box said, "All models are wrong, but some are useful." My hope is that this analysis will be useful for detecting signals that traditional election polling might miss. In addition to the prediction (which is mostly for fun), I have included some swing state analysis that I think sheds light on key shifts over the last four years.