Misunderstandings that I heard in response to these tweets have encouraged me to try to explain more clearly the logic of why most eligible voters should abstain from voting.
Think of each vote cast between two candidates as being either +1 or -1, so that the positive candidate wins if the sum of all votes cast is positive; the negative candidate wins otherwise. Abstaining is then a vote of 0. (If the vote sum is zero, the election is a tie.)
Assume that there is one binary quality variable that expresses which of the two candidates is “better for the world”, that these two options are equally likely, that each voter gets one binary clue correlated with that quality, and that voters vote simultaneously. What we should want is to increase the chance that the better candidate wins.
While each voter may, all else equal, prefer a higher quality candidate, they need not be otherwise indifferent. So if, based on other considerations, they have a strong enough preference for one of the candidates, such “partisan” voters will pick that candidate regardless of their clue. Thus their vote will not embody any info about candidate quality. They are so focused on other considerations that they won’t help make for a more informed election, at least not via their vote. The other “informed” voters care enough about quality that their vote will depend on their quality clue.
Thus the total vote will be the sum of the partisan votes plus the informed votes. So the sum of the partisan votes will set a threshold that the informed votes must overcome to tip the election. For example, if the partisan sum is -10, then the informed votes must sum to more than 10 to tip the election toward the positive candidate (a sum of exactly 10 gives a tie). For our purposes here it won’t matter if there is uncertainty over this sum of partisan votes or not; all that matters is that the partisan sum sets the threshold that informed votes must overcome.
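To make the vote arithmetic concrete, here is a minimal Python sketch of the tally described above. The function name `election_outcome` and the list encoding of votes are illustrative assumptions, not anything from the original analysis.

```python
def election_outcome(partisan_votes, informed_votes):
    """Return +1 or -1 for the winning candidate, or 0 for a tie.

    Each vote is coded +1, -1, or 0 (abstain), as in the model above.
    """
    total = sum(partisan_votes) + sum(informed_votes)
    return (total > 0) - (total < 0)


# Ten partisans for the negative candidate: a partisan sum of -10.
partisan = [-1] * 10

print(election_outcome(partisan, [+1] * 10))        # 0: exactly cancels, a tie
print(election_outcome(partisan, [+1] * 11))        # 1: positive candidate wins
print(election_outcome(partisan, [+1] * 11 + [0]))  # 1: an abstention changes nothing
```

The partisan sum only shifts the bar; everything that carries information about quality is in the second list.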
Now in general we expect competing candidates to position themselves in political and policy spaces so that on average the partisan threshold is not too far from zero. After all, it is quite unusual for everyone to be very confident that one side will win. So I will from here on assume a zero threshold, though my analysis will be robust to modest deviations from that.
Assume for now that the clues of the informed voters are statistically independent of each other, given candidate quality. Then with many informed voters the sum of informed votes will approach a normal distribution, and the chance that the positive candidate wins is near the integral of this normal distribution above the partisan threshold.
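As a rough check on the normal approximation, the sketch below compares it with an exact binomial calculation for identical voters. It assumes that a voter whose clue has correlation c with quality votes for the better candidate with probability (1 + c) / 2, so that, given the positive candidate is better, each vote has mean c and variance 1 - c^2. The function names and the example numbers (201 voters, correlation 0.1) are illustrative assumptions.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def win_probability_normal(correlations, threshold=0.0):
    """Approximate chance that the informed vote sum exceeds `threshold`,
    given that the positive candidate is the better one."""
    mean = sum(correlations)
    variance = sum(1.0 - c * c for c in correlations)
    return 1.0 - normal_cdf((threshold - mean) / math.sqrt(variance))


def win_probability_exact(n_voters, c):
    """Exact chance that a majority of n identical voters is right (n odd)."""
    p = (1.0 + c) / 2.0
    return sum(
        math.comb(n_voters, k) * p**k * (1.0 - p) ** (n_voters - k)
        for k in range(n_voters // 2 + 1, n_voters + 1)
    )


approx = win_probability_normal([0.1] * 201)
exact = win_probability_exact(201, 0.1)
print(f"normal approximation: {approx:.4f}, exact: {exact:.4f}")

# A partisan deficit of 10 votes raises the bar and lowers the chance.
print(f"with threshold 10: {win_probability_normal([0.1] * 201, 10):.4f}")
```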
Thus all that matters from each individual voter is the mean and variance of their vote. Any small correlation between a voter’s clue and quality will create a small positive correlation between quality and their mean vote. Thus their vote will move the mean of the informed votes in the right direction. Because of this, many say that the more voters the better, no matter how poorly informed is each one.
However, each informed voter adds to both the mean and the variance of the total vote, as shown in this diagram:
What matters is the “z-score” of the informed vote, i.e., the mean divided by its standard deviation. The chance that the better candidate wins is increasing in this z-score. So if a voter adds proportionally more to the standard deviation than they add to the mean, they make the final vote less likely to pick the better candidate, even if their individual contribution to the mean is positive.
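A small numerical illustration of this point, under the same assumed vote model (mean c and variance 1 - c^2 per voter, conditional on which candidate is better). The pool of 100 voters with correlation 0.3 and the extra voter with correlation 0.01 are made-up numbers chosen only to show the effect.

```python
import math


def z_score(correlations):
    """Mean of the informed vote sum divided by its standard deviation."""
    mean = sum(correlations)
    std_dev = math.sqrt(sum(1.0 - c * c for c in correlations))
    return mean / std_dev


core = [0.3] * 100  # hypothetical pool of fairly well informed voters
z_core = z_score(core)

# A weakly informed extra voter: positive contribution to the mean,
# but nearly a full unit of extra variance.
z_with_weak = z_score(core + [0.01])

# An extra voter as informed as the rest of the pool.
z_with_peer = z_score(core + [0.3])

print(f"core only:        {z_core:.4f}")
print(f"plus weak voter:  {z_with_weak:.4f}")
print(f"plus peer voter:  {z_with_peer:.4f}")
```

In this example the weak voter lowers the z-score even though their vote leans the right way, while the peer raises it.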
This is why poorly informed voters who vote can hurt elections, and it is why the relevant standard is your information compared to that of the other voters who don’t abstain. If you are an informed voter who wants to increase the chance that the better candidate wins, then you should abstain if you are not sufficiently well informed compared to the others who will vote.
In a previous post I considered the optimal choice of when to abstain in two extreme cases: when all other informed voters also abstain optimally, and when no one else abstains but this one voter. Realistic cases should be somewhere between these extremes.
To model inequality in how informed are various voters, I chose a power law dependence of clue correlation relative to voter rank. If the power is high, then info levels fall very quickly as you move down in voter rank from the most informed voter. If the power is low, then info levels fall more slowly, and voters far down in rank may still have a lot of info.
I found that for a power less than 1/2, and ten thousand informed voters, everyone should vote in both extreme cases. That is, when info is distributed equally enough, it really does help to average everyone’s clues via their votes. But for a power of 3/4, more than half should abstain even if no one else abstains, and only 6 of them should vote if all informed voters abstain optimally. For a power of 1, 80% should abstain even if no one else does, and only 2 of them should vote if all abstain optimally. For higher powers, it gets worse.
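The two extreme cases can be explored with a short NumPy sketch. This is not the code behind the numbers above; it assumes the same z-score criterion, a clue correlation of `top_correlation * rank ** -power`, and a top correlation of 0.1, which is a guess rather than a parameter taken from the earlier post. All function names are hypothetical.

```python
import numpy as np


def power_law_correlations(n_voters, power, top_correlation=0.1):
    """Clue correlation by voter rank, falling as a power law (assumed form)."""
    ranks = np.arange(1, n_voters + 1, dtype=float)
    return top_correlation * ranks ** (-power)


def best_cutoff(correlations):
    """How many top-ranked voters should vote to maximize the z-score,
    when everyone ranked below the cutoff abstains."""
    means = np.cumsum(correlations)
    variances = np.cumsum(1.0 - correlations**2)
    return int(np.argmax(means / np.sqrt(variances))) + 1


def unilateral_abstainers(correlations):
    """How many voters would each raise the z-score by abstaining alone,
    when everyone else votes."""
    mean = correlations.sum()
    variance = (1.0 - correlations**2).sum()
    z_all = mean / np.sqrt(variance)
    z_without = (mean - correlations) / np.sqrt(
        variance - (1.0 - correlations**2)
    )
    return int((z_without > z_all).sum())


n = 10_000
for power in (0.25, 0.75, 1.0):
    c = power_law_correlations(n, power)
    print(
        f"power {power}: optimal voters = {best_cutoff(c)}, "
        f"unilateral abstainers = {unilateral_abstainers(c)}"
    )
```

With these assumed values, a power of 1/4 has everyone voting in both cases, while a power of 1 leaves only a handful of voters under optimal abstention and a large majority who would do better to abstain even alone.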
My best guess is a power of one, as this is a very common power and also near the middle of the distribution of observed powers. Thus even if everyone else votes, for the purpose of making the total vote have a better chance of picking the better candidate, you should abstain unless you are especially well informed, relative to the others who actually vote. And the more unequal you estimate the distribution of who is how informed, the more reluctant you should be to vote.
Many have claimed that it hurts to tell people about this analysis, as low informed voters will ignore it, and only better informed voters might follow it. But this analysis gives advice to each and every voter, advice that doesn’t depend on who else adopts it; every added person who follows this advice is a net win. Yes, people can be uncertain about how unequal is the info distribution, and about where they rank in this distribution. But that’s no excuse for not trying to make best estimates and act accordingly.
Note that the above analysis ignored the cost of getting informed and voting, and that people in general seem to be overconfident when they estimate their informedness rank. Both of these considerations should make you more willing to abstain.
In the above I assumed voter clues are independent, but what if they are correlated? For the same means, positive clue correlation increases the variance of the sum of individual votes. So, all else equal, voters with correlated clues should be more willing to abstain, compared to other voters.
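The variance effect can be sketched for the simplest correlated case: identical voters whose votes all share one pairwise correlation, conditional on candidate quality. That equicorrelation structure, the parameter name `pairwise_rho`, and the example numbers are assumptions made for illustration. The sum of n such votes has variance n * v * (1 + (n - 1) * rho), where v is the variance of a single vote.

```python
import math


def z_score_equicorrelated(n_voters, clue_correlation, pairwise_rho):
    """z-score of the vote sum for identical voters whose votes have a
    common pairwise correlation, given candidate quality (assumed model)."""
    mean = n_voters * clue_correlation
    single_variance = 1.0 - clue_correlation**2
    variance = n_voters * single_variance * (1.0 + (n_voters - 1) * pairwise_rho)
    return mean / math.sqrt(variance)


for rho in (0.0, 0.001, 0.01):
    z_small = z_score_equicorrelated(1_000, 0.05, rho)
    z_large = z_score_equicorrelated(100_000, 0.05, rho)
    print(f"rho={rho}: z with 1,000 voters = {z_small:.3f}, "
          f"with 100,000 voters = {z_large:.3f}")
```

With independent votes the z-score keeps growing with the square root of the number of voters. With any positive shared correlation it levels off, so extra voters of the same kind add little. The normal approximation itself is less safe with correlated votes, so this is only a statement about mean and variance.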
Yes, I’ve used binary clues throughout, and you might claim that all this analysis completely changes for non-binary clues. Possible, but that would surprise me.
Added 7a: Re the fact that it is possible and desirable to tell if you are poorly informed, I love this saying:
If you’re playing a poker game and you look around the table and can’t tell who the sucker is, it’s you.
I really like this model challenging the naive assumption that everyone should vote, never thought of it like that!
If I understand correctly, it assumes that the partisan votes even out on average. Given that I make the decision not to vote because of relative ignorance under the model, how can I be sure that I am not part of the partisan vote (and therefore would move the average by not voting)? Is there a reason why this effect would average out if many people make this decision (it would average out if both sides of the partisan vote were equally likely to follow the model, but is that realistic)?
If people were really casting their votes to do what is "better for the world", the substantial correlations between political party and gender, race, wealth, educational achievement, geographical factors and the city-rural divide would make little sense.
In practice, voters have different values. They are clearly not all trying to do what is "better for the world". If voters suspect that other voters might have different values from them, then abstaining to give others more of a chance is likely to seem less attractive.