What seems?

Non-uniform sampling relative to some metric is not bias!

So now that you've been accused of bias, you've decided that sampling bias is OK after all? Coming from a blog called "Overcoming Bias", this is almost funny.

Apparently the Hanson way to overcome bias is to just stop caring about it.

Do you have a theory? If not, how can you be confident that science doesn't benefit from its own rules of evidence (which it more or less has)?

[Added.] A couple more thoughts:

1. Can you imagine that a fair trial could occur without rules of evidence? It seems to me that the division of labor between judge and jury is along far/near lines. The jury system is effective for determining facts, but only because jurors are largely restricted to near-mode thinking framed by the court's far-mode analysis.

2. You mentioned that you studied the philosophy of science to discover what allowed science to progress, but you were disappointed in not finding the answer. I wonder if a great part of the "secret of science" is not its very high standards for evidence.

That's actually a big puzzle in law: why are there "rules of evidence"? The answer isn't simple.

Consider that the law has long known that bad evidence presented to a jury isn't "just data." Why do you suppose it is different for science?

Yes, academia and most journalism tend to enforce progressive norms, while you see enforcement to the right in institutions like the executive branch or Fox News. And norm enforcement biases the published record in favor of the norms being enforced (although I'd say the level of rigor should generally be raised regardless of topic, in light of the low reproducibility of social science).

But you are a university professor, and the pushback you get is mostly from academia and journalism. Given that the norms exist, being more rigorous and careful on the topics that especially concern your community can help you and the discourse.

I have to agree with Carl Shulman. The twitter complaints are not about the scientific merits of the twitter polls.

I think the complaints have to be understood in context: this is a poll the petitioner doesn't like, so the petitioner will try to identify all possible arguments against the polls and then pick one that sounds right. Arguing against every possible argument the petitioner can come up with just gives the petitioner leverage to keep shifting the goalposts. Rather than responding to the exact details of the argument, it would be better to respond to the framing surrounding the argument, with an understanding of why someone might reach that conclusion. Why are certain gender issues so difficult to discuss publicly, and is it really the best tactic to try to win out by confronting the topic head on?

Before you condemn all this, I will point out that your argument uses some of the same type of reasoning as the petitioner's. Rather than stating your goals in the twitter polls, and outlining an argument in favor of those goals, you are outlining all possible arguments in favor of your twitter polls. At one point the post points out the benefits of longitudinal analysis within a convenience sample, but at the end of the post you argue that it is possible for a convenience sample to be representative with respect to certain questions. But these two different arguments point to different strategies with respect to what sorts of twitter polls to conduct. In the first situation, you should be repeatedly asking the same types of question. In the second situation, you should be asking questions which can be compared with polls from a more representative sample to better understand the limitations of your particular convenience sample. Certainly both strategies can be utilized, but I suspect your real goal is primarily the latter case (the cross-sectional comparison) rather than the former case (the longitudinal comparison) even though you spent more time defending the former case.

So yes, you're right that there's nothing wrong with convenience sampling. But that's not really the thing that needs to be addressed.

Do you think that a source of polls for a group that leaned left instead of right, such as journalists or university professors, would get equal pushback, criticism, and pressure to stop? If not, then if people who lean right silence their polls, the net effect is to promote left views.

On the biased data side:

In politics there is demand for accurate polls to guide decision-making by politicians, and for polls biased in one's favor, since the latter tend to embolden one's supporters to vote. More broadly (as you have written about extensively), in contests over norms, information about current norm endorsement has a causal effect on people's readiness to act on those norms.

If your Twitter followers are less supportive of contested norms, or especially below majority support, then that will be perceived as threatening coordination to oppose the norms.

Also, as you frequently note, distant observers are an important audience in signalling, observers who will have less context, and some of whom will engage in biased amplification of data. A retweeted image of a poll result can thus have effects similar to p-hacked studies, which on net tend to distort meta-analysis, media coverage, and policy, even though an ideal Bayesian with a perfect model of p-hacking and knowledge of exactly which studies were p-hacked could benefit from them.

On it being offensive to ask a question or an audience:

It would likely be less offensive if you had responses broken down by gender and political views, since it wouldn't be at risk of the above confusion between your biased sample and a less biased sample, or an approved subsample.

You could also avoid offending those who objected to asking the question on Twitter (in such a way that the poll results automatically get social media coverage even in a preliminary stage), but would have less objection to a more careful and developed publication.

But it's true that using an MTurk sample (and posting the question and results) would not eliminate all criticism, e.g. about the signals about your views indicated by your asking the question with your wording.

This graph says the number is more like 100,000: https://www.reddit.com/r/da...

Edit: actually that graph has a max rank of 200,000, suggesting it is drawn from a sample of that size. As Twitter has 336M total, this suggests my actual rank is about 1000 times larger, at around 100 million. But that seems wrong too.
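
As a rough check of the scaling arithmetic above, here is a minimal sketch. It assumes (as the comment does, and this may well be wrong) that the graph's 200,000 accounts are a uniform random sample of all 336M, so that a rank within the sample scales by the ratio of population to sample size. The variable names are illustrative only.

```python
# Figures taken from the comment above; the uniform-sample assumption is
# an assumption made for this sketch, not something the graph confirms.
total_accounts = 336_000_000   # Twitter total cited in the comment
sample_size = 200_000          # max rank shown on the graph
rank_in_sample = 100_000       # rank read off the graph

# If the sample is uniform, each sampled account stands in for this many
# accounts in the full population.
scale_factor = total_accounts // sample_size

# Extrapolated rank among all accounts under that assumption.
estimated_rank = rank_in_sample * scale_factor

print(f"scale factor: {scale_factor}")
print(f"estimated overall rank: {estimated_rank:,}")
```

Under that assumption the factor comes out nearer 1,700 than 1,000, which puts the estimate closer to 170 million. That is the same order of magnitude as the figure above, and just as doubtful if the graph's sample is not uniform.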


Few respond in that way because there are fewer than 7000 with that number of twitter followers, and they probably don't follow you.

For sure, data is just data. There is nothing wrong with more data. The "unscientific" complaint arises when you attempt to draw more general conclusions from the data that you acquired. The complaint is that, if you haven't taken careful steps to get a representative or random sample, it is far too easy for your data to be biased, and thus for your attempted general conclusions about "all or most humans" to be wrong. (Especially when they are surprising, or contradict intuition developed through other experiences.)

This is what "science" is about. It's a long history of smart, well-intentioned people coming to the wrong conclusions about the world. It's just far, far too easy for a motivated brain to fool itself into justifying the conclusions that it wants for emotional reasons -- without even being aware of that bias!

So, for the "gold standard" of science, we insist on: millions or billions of experimental results (physics). Or "blind" studies with control groups (medicine), because even a well-intentioned researcher can't avoid infecting the results with his bias. Or better yet "double-blind" studies (medicine again), because even mere lab techs can't avoid slightly different (but significant) behavior, if they know which group the patient is part of.

It isn't that data without a double-blind study is somehow not "scientific evidence". Instead, it is: if you've gathered data without double-blind controls, then we know many examples in the past where the initial data looked to suggest a compelling conclusion, but that conclusion actually turned out to be wrong.

It is very, very difficult to properly account for the weight of evidence that is gathered via a more informal process. It is the "so what?" part that people are complaining about. Worse: this all assumes that you actually ARE upstanding and have good intentions for seeking truth. If instead you are a biased manipulator and persuader, then avoiding the strictest requirements of science is exactly the tool you use in order to construct a false but persuasive argument.

In this context, how is a reader to interpret the Hanson Twitter polls? What reliable conclusions, more likely to be true than not, can possibly be derived from that data? That's not an easy question to answer.

Part of it is status outrage. Science is impressive. Twitter polls are unimpressive. Hence what you did can't be science. How dare you claim to have obtained knowledge in an unimpressive way?

Part of it is taboo violation. Twitter is seen as something of a social space. Asking your acquaintances subtly different and leading questions in order to scrutinise and expose their beliefs, hypocrisies and confusions is not the act of a trustworthy ally. Particularly when it comes to subjects where we are especially likely to want to be hypocritical - such as sex, power, and fashionable political movements.

Wearing a white coat, behaving in a stylised manner, etc, are distancing rituals necessary to demarcate zones of knowledge-seeking and expertise from everyday life, where knowledge-seeking and expertise are somewhat unwelcome. You are crossing the streams.

If my followers tend to have a gender or political slant, then the results of polling them will also have that slant, we can agree. But once we all know that fact, the results of such polls are still valid data, and only "controversial" or "offensive" if it is intrinsically controversial or offensive to hear the opinions of a group with that gender/political slant. If it is instead the questions I ask that are intrinsically controversial or offensive, then it won't matter if I ask them of a different group.


You get criticism for politically controversial polls, especially ones that suggest opposition to initiatives such as #MeToo. As your own writing suggests, choice of polling question can be a probabilistic signal about your political beliefs (eliciting criticism if the signal suggests troubling ones), and biased poll results can have political effects.

People reasonably think your poll respondents are more male and ideologically similar to you than a more representative sample, and you pattern-match to right-of-center in this area. So the non-representativeness is very relevant to these polls, and in a way that amplifies the political connotation that people are complaining about.

That others don't respond with alternative polls (especially those who object to the poll in the first place) doesn't mean you shouldn't use MTurk or a similar service on such topics where your Twitter follower convenience sample is biased in a fashion known to be relevant. They are usually more expensive than Twitter, but more informative and probably less expensive overall considering the costs of offending people. Those platforms also provide more control over viral spread of results of the biased sample polls. And if your initial polls were less provocative, others might be more willing to do more exploration of the results.

Even just using a more representative non-Twitter sample to supplement when people are concerned about a Twitter poll would seem to be better than your current practice. You can take on that task yourself rather than pushing it on others who object to tweeting such polls for the same reasons they find it troubling when you post them.

