State similarity regression is used to find the states most similar to a given state, so the model can pick up the national mood of the electorate and not just the mood in that one state. I used three indexes to compare each state against the others: partisan lean, region, and demographics. A score of 100 means two states are perfectly identical, while a lower score means they’re less similar. I based my approach on Nate Silver’s CANTOR model, but I changed mine up. For example, I did not use a k-nearest neighbors algorithm; I used a homemade method instead.

Another large difference is how I placed Arizona, Alaska, Hawaii, and New Mexico. I put Arizona in the South even though most people put it in the West, because I thought it would suit the regression results better. New Mexico stayed in the West for the same reason. I put Alaska in the Midwest because that is the region it correlates with the most, and Hawaii in the West, which is not a big surprise. These scores will help regress the model toward a more national mood.

[Figure: state similarity.png]
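
To make the idea of a 0 to 100 similarity score concrete, here is a minimal sketch in Python. It is not the homemade method used for this model, which isn't spelled out here. The equal weighting of the three indexes, the `max_lean_gap` cap, the demographic fields, and the two placeholder states with made-up numbers are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class State:
    name: str
    lean: float                     # partisan lean in points (made-up scale)
    region: str                     # e.g. "South", "West", "Midwest"
    demographics: dict[str, float]  # population shares between 0 and 1


def similarity(a: State, b: State, max_lean_gap: float = 50.0) -> float:
    """Return a 0-100 score; 100 means identical on all three indexes.

    Assumption: each index contributes one third of the score.
    """
    # Partisan lean: 1.0 when equal, falling to 0.0 at max_lean_gap points apart.
    lean_score = max(0.0, 1.0 - abs(a.lean - b.lean) / max_lean_gap)

    # Region: all-or-nothing match.
    region_score = 1.0 if a.region == b.region else 0.0

    # Demographics: 1 minus the mean absolute gap over the shared fields.
    keys = sorted(a.demographics.keys() & b.demographics.keys())
    if not keys:
        raise ValueError("states share no demographic fields")
    demo_gap = sum(abs(a.demographics[k] - b.demographics[k]) for k in keys) / len(keys)
    demo_score = 1.0 - demo_gap

    return 100.0 * (lean_score + region_score + demo_score) / 3.0


# Placeholder states with invented values, for illustration only.
state_a = State("State A", 5.0, "South", {"college": 0.30, "urban": 0.80})
state_b = State("State B", -5.0, "West", {"college": 0.40, "urban": 0.60})

print(similarity(state_a, state_a))
print(similarity(state_a, state_b))
```

A state compared with itself scores 100, and every mismatch in lean, region, or demographics pulls the score down. Changing a state's region (for example, moving Arizona from West to South) only affects the region term in this sketch.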

The lower the number, the more representative a state is of the country as a whole, and the closer it is to the median of the electorate.

[Figure: Screen Shot 2018-12-22 at 9.40.48 PM.png]
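
One way to picture a "lower is more representative" number is as a distance from the national median. The sketch below is only an assumption about how such a number could be built, not the calculation behind the figure above. It uses partisan lean alone, and the state names and lean values are invented placeholders.

```python
from statistics import median


def representativeness(leans: dict[str, float]) -> dict[str, float]:
    """Map each state to its distance from the median lean (lower = more typical)."""
    mid = median(leans.values())
    return {state: abs(lean - mid) for state, lean in leans.items()}


# Invented partisan leans for placeholder states, for illustration only.
leans = {
    "State A": 8.0,
    "State B": -1.0,
    "State C": 2.0,
    "State D": -12.0,
    "State E": 3.0,
}

scores = representativeness(leans)
ranked = sorted(scores, key=scores.get)  # most representative first

for state in ranked:
    print(state, scores[state])
```

In this toy data the state sitting exactly at the median gets a score of 0 and ranks first, while the states with the most lopsided lean land at the bottom of the list.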