Bayesian decision theory is often a useful approximation as a theory of decisions, evidence, and learning. And according to it, everything you experience or see or get as an input can be used as data. Some of it may be more informative or useful, but it’s all data; just update via Bayes rule and off you go.
So what then is “scientific” data? Well “science” treated as a social phenomena is broken into many different disciplines and sub-fields, and each field tends to have its own standards for what kinds of data they will publish. These standards vary across fields, and have varied across time, and I can think of no universals that apply to all fields at all times.
For example, at some times in some fields one might be allowed to report on the content of one’s dreams, while in other fields at times that isn’t okay but it is okay to give statistics summarizing the contents of all the dreams of some set of patients at a hospital, while in other fields at other times they just don’t want to hear anything subjective about dreams.
Most field’s restrictions probably make a fair bit of sense for them. Journal space is limited, so even if all data can tell you something, they may judge that certain kinds of data rarely say enough, compared to other available kinds. Which is fine. But the not-published kinds of data are not “unscientific”, though they may temporarily be “un-X” for field X. And you should remember that as most academic fields put a higher priority on being impressive than informative, they may thus neglect unimpressive data sources.
For example, chemists may insist that chemistry experiments know what are the chemicals being tested. But geology papers can give data on tests made on samples obtained from particular locations, without knowing the exact chemical composition of those samples. And they don’t need these samples to be uniformly sampled from the volume of the Earth or the universe; it is often enough to specify where samples came from.
Consider agricultural science field experiments, where they grow different types of crops in different kinds of soil and climate. They usually don’t insist on knowing the exact chemical composition of the soil, or the exact DNA of the crops. But they can at least tell you where they got the crops, where exactly is the test field, how they were watered, weeded, and fertilized, and some simple stats on the soils. It would be silly to insist that such experiments use a “representative” sample of crops, fields, or growing conditions. Should it be uniformly sampled from actual farming conditions used today, from all possible land on Earth’s surface, or from random mass or volume in the universe across its history?
Lab experiments in the human and social sciences today typically use convenience samples of subjects. They post invitations to their local school or community and then accept most everyone who signs up or shows up. They collect a few stats on subjects, but do not even attempt to create “representative” samples of subjects. Nationally, globally-now, or over-all-history representative samples of lab subjects would just be vastly more expensive. Medical experiments are done similarly. They may shoot for balance along a few particular measured dimensions, but on other parameters they take whoever they can get.
I mention all this because over the last few months I’ve had some fun doing Twitter polls. And I’ve consistently had critics tell me I shouldn’t do this, because Twitter polls are “meaningless” or “worthless” or “unscientific”. They tell me I should only collect the sort of data I could publish in a social science journal today, and if I show people any other kind of data I’m an intellectual fraud. As if some kinds of data were “unscientific”.
Today I have ~24,700 followers, and I can typically get roughly a thousand people to answer each poll question. And as my book Elephant in the Brain suggests, I have many basic questions about human behavior that aren’t very specific to particular groups of people; we have many things to learn that apply to most people everywhere at all times. Whenever a question occurs to me, I can take a minute to post it, and within a few hours get some thought-provoking answers.
Yes, the subset of my Twitter followers who actually respond to my polls are not a representative sample of my nation, world, profession, university, or even of Twitter users. But why exactly is it so important to have a representative sample from such a group?
Well there is a big advantage to having many representative polls from the same group, no matter what that group. Then when comparing such polls you have to wonder less whether sample differences are driving results. But the more questions I ask of my Twitter followers, the more I can usefully compare those different polls. For example, if I ask them at different times, I can see how their attitudes change over time. Or if I make slight changes in wording, I can see what difference wording changes make.
Of course if I were collecting data to help a political candidate, I’d want data representative of potential voters in that candidate’s district. But if I’m just trying to understand the basics of human behavior, its not clear why I need any particular distribution over people polled. Yes, answers to each thing I ask might vary greatly over people, and my sample might have few of the groups who act the most differently. But this can happen for any distribution over the people sampled.
Even though the people who do lab experiments on humans usually use convenience samples that are not representative of a larger world, what they do is still science. We just have to keep in mind that differing results might be explained by different sources of subjects. Similarly, the data I get from my Twitter polls can still be useful to a careful intellectual, even if isn’t representative of some larger world.
If one suspects that some specific Twitter poll results of mine differ from other results due to my differing sample, or due to my differing wordings, the obvious checks are to ask the same questions of different samples, or using different wordings. Such as having other people on Twitter post a similar poll to their different pool of followers. Alas, people seem to be willing to spend lots of time complaining about my polls, but are almost never willing to take a few seconds to help check on them in this way.
What seems? .
Non-uniform sampling relative to some metric is not bias!