Tag Archives: Data

Disagreement on Disagreement

I’m seriously considering returning to the topic of disagreement in one of my next two books. So I’ve been reviewing literatures, and I just tried some polls. For example:

These results surprised me. Experience I can understand, but why are IQ and credentials so low, especially relative to conversation style? And why is this so different from the cues that media, academia, and government use to decide who to believe?

To dig further, I expanded my search. I collected 16 indicators, and asked people to pick their top 4 out of these, and also for each to say “if it tends to make you look better than rivals when you disagree.” I had intended this last question to be about whether you personally tend to look better by that criterion, but I think most people just read it as asking whether that indicator is especially potent in setting your perceived status in the context of a disagreement.

Here are the 16 indicators, sorted by the 2nd column, which gives % who say that indicator is in their top 4. (The average of this top 4 % is almost exactly 5/16, so these are actually stats on the top 5 indicators.)
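The arithmetic behind that parenthetical can be sketched as follows. If the share of respondents picking each indicator averages 5/16 across the 16 indicators, then the shares sum to 5, which is the average number of indicators picked per respondent. This is only a sketch: the function name and the uniform shares below are made up for illustration, and it assumes the same kind of respondents answered about every indicator.

```python
from fractions import Fraction

def mean_picks_per_respondent(pick_shares):
    """Sum of per-indicator pick shares.

    Each share is the fraction of respondents who picked that indicator,
    so the sum is the average number of indicators picked per respondent.
    """
    return sum(pick_shares)

# Hypothetical shares, NOT the poll data: 16 indicators, each picked by 5/16.
shares = [Fraction(5, 16)] * 16
mean_picks = mean_picks_per_respondent(shares)
print(mean_picks)  # 5
```

So although people were asked for a top 4, the stats behave as if each respondent had picked about 5.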

The top 5 items on this list are all chosen by 55-62% of subjects, a pretty narrow % range, and the next 2 are each chosen by 48%. We thus see quite a wide range of opinion on which indicators are best for judging who is right in a disagreement. The top 7 of the 16 indicators tried are similarly popular, and for each one 37-52% of subjects did not put it in their personal top 5 indicators. This suggests trying future polls with even larger sets of candidate indicators, where we may see even wider preference variation.

The most popular indicators here seem quite different from what media, academia, and government use to decide who to believe in the context of disagreements. And if these poll participants were representative and honest about what actually persuades them, then these results suggest that speakers should adopt quite different strategies if their priority is to persuade audiences. Instead of collecting formal credentials, adopting middle-of-the-road positions, impugning rival motives, and offering long complex arguments, advocates should offer bets, adopt rational talking styles, and take many tests, such as on IQ, related facts, and rival arguments.

More likely, not only do these poll respondents differ from the general population, they probably aren’t being honest about, or just don’t know, what actually persuades them. We might explore these issues via new wider polls that present vignettes of disagreements, and then ask people to pick sides. (Let me know if you’d like to work on that with me.)

The other 3 columns in the table above show the % who say an indicator gives status, the correlation across subjects between status and top 4 choices, and the number of respondents for each indicator. The overall correlation across indicators between the top 5 and status columns is 0.90. The obvious interpretation of these results is that status is closely related to persuasiveness. Whatever indicators people say persuade them, they also say give status.
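For readers who want to see what a correlation across indicators means here, this is a minimal sketch of a Pearson correlation between two columns, with one entry per indicator. I assume the 0.90 figure is a Pearson correlation; the post does not say. The function name and the five pairs of numbers are invented for illustration and are not the poll results.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)

# Invented numbers, NOT the poll data: one entry per indicator.
top_pick_share = [0.60, 0.50, 0.40, 0.25, 0.10]  # share putting it in their top picks
status_share = [0.55, 0.50, 0.35, 0.30, 0.15]    # share saying it gives status

r = pearson_r(top_pick_share, status_share)
print(round(r, 2))
```

A value near 1 says that indicators more often picked as persuasive are also more often said to give status.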


It’s All Data

Bayesian decision theory is often a useful approximation as a theory of decisions, evidence, and learning. And according to it, everything you experience or see or get as an input can be used as data. Some of it may be more informative or useful, but it’s all data; just update via Bayes rule and off you go.
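To make “just update via Bayes rule” concrete, here is a minimal sketch of one update over a discrete set of hypotheses. The function name, the hypothesis labels, and all the probabilities are assumptions made up for illustration. It shows that a weakly informative observation still moves the posterior, just by less than a strongly informative one.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over hypotheses after one observation.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(observation | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}

# Made-up numbers: two hypotheses, equal prior weight.
prior = {"A": 0.5, "B": 0.5}

# A strongly informative observation: much likelier under A than under B.
strong = bayes_update(prior, {"A": 0.8, "B": 0.2})

# A weakly informative observation: only slightly likelier under A.
weak = bayes_update(prior, {"A": 0.55, "B": 0.45})

print(strong["A"], weak["A"])
```

Both observations count as data in this sense. They differ only in how far they shift belief toward A.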

So what then is “scientific” data? Well, “science” treated as a social phenomenon is broken into many different disciplines and sub-fields, and each field tends to have its own standards for what kinds of data it will publish. These standards vary across fields, and have varied across time, and I can think of no universals that apply to all fields at all times.

For example, at some times in some fields one might be allowed to report on the content of one’s dreams, while in other fields at times that isn’t okay but it is okay to give statistics summarizing the contents of all the dreams of some set of patients at a hospital, while in other fields at other times they just don’t want to hear anything subjective about dreams.

Most fields’ restrictions probably make a fair bit of sense for them. Journal space is limited, so even if all data can tell you something, they may judge that certain kinds of data rarely say enough, compared to other available kinds. Which is fine. But the not-published kinds of data are not “unscientific”, though they may temporarily be “un-X” for field X. And you should remember that as most academic fields put a higher priority on being impressive than informative, they may thus neglect unimpressive data sources.

For example, chemists may insist that chemistry experiments specify exactly which chemicals are being tested. But geology papers can give data on tests made on samples obtained from particular locations, without knowing the exact chemical composition of those samples. And they don’t need these samples to be uniformly sampled from the volume of the Earth or the universe; it is often enough to specify where samples came from.

Consider agricultural science field experiments, where they grow different types of crops in different kinds of soil and climate. They usually don’t insist on knowing the exact chemical composition of the soil, or the exact DNA of the crops. But they can at least tell you where they got the crops, exactly where the test field is, how the crops were watered, weeded, and fertilized, and some simple stats on the soils. It would be silly to insist that such experiments use a “representative” sample of crops, fields, or growing conditions. Should it be uniformly sampled from actual farming conditions used today, from all possible land on Earth’s surface, or from random mass or volume in the universe across its history?

Lab experiments in the human and social sciences today typically use convenience samples of subjects. They post invitations to their local school or community and then accept most everyone who signs up or shows up. They collect a few stats on subjects, but do not even attempt to create “representative” samples of subjects. Nationally, globally-now, or over-all-history representative samples of lab subjects would just be vastly more expensive. Medical experiments are done similarly. They may shoot for balance along a few particular measured dimensions, but on other parameters they take whoever they can get.

I mention all this because over the last few months I’ve had some fun doing Twitter polls. And I’ve consistently had critics tell me I shouldn’t do this, because Twitter polls are “meaningless” or “worthless” or “unscientific”. They tell me I should only collect the sort of data I could publish in a social science journal today, and if I show people any other kind of data I’m an intellectual fraud. As if some kinds of data were “unscientific”.

Today I have ~24,700 followers, and I can typically get roughly a thousand people to answer each poll question. And as my book Elephant in the Brain suggests, I have many basic questions about human behavior that aren’t very specific to particular groups of people; we have many things to learn that apply to most people everywhere at all times. Whenever a question occurs to me, I can take a minute to post it, and within a few hours get some thought-provoking answers.

Yes, the subset of my Twitter followers who actually respond to my polls is not a representative sample of my nation, world, profession, university, or even of Twitter users. But why exactly is it so important to have a representative sample from such a group?

Well there is a big advantage to having many representative polls from the same group, no matter what that group. Then when comparing such polls you have to wonder less whether sample differences are driving results. But the more questions I ask of my Twitter followers, the more I can usefully compare those different polls. For example, if I ask them at different times, I can see how their attitudes change over time. Or if I make slight changes in wording, I can see what difference wording changes make.

Of course if I were collecting data to help a political candidate, I’d want data representative of potential voters in that candidate’s district. But if I’m just trying to understand the basics of human behavior, it’s not clear why I need any particular distribution over people polled. Yes, answers to each thing I ask might vary greatly over people, and my sample might have few of the groups who act the most differently. But this can happen for any distribution over the people sampled.

Even though the people who do lab experiments on humans usually use convenience samples that are not representative of a larger world, what they do is still science. We just have to keep in mind that differing results might be explained by different sources of subjects. Similarly, the data I get from my Twitter polls can still be useful to a careful intellectual, even if it isn’t representative of some larger world.

If one suspects that some specific Twitter poll results of mine differ from other results due to my differing sample, or due to my differing wordings, the obvious checks are to ask the same questions of different samples, or using different wordings, such as by having other people on Twitter post a similar poll to their own pool of followers. Alas, people seem to be willing to spend lots of time complaining about my polls, but are almost never willing to take a few seconds to help check on them in this way.
