Arresting irrational information cascades

Usually people don’t agree with one another as much as they should. Aumann’s Agreement Theorem (AAT) finds:

two people acting rationally (in a certain precise sense) and with common knowledge of each other’s beliefs cannot agree to disagree. More specifically, if two people are genuine Bayesian rationalists with common priors, and if they each have common knowledge of their individual posteriors, then their posteriors must be equal.[1]
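For readers who want the precise statement, this is the standard formalization of Aumann's result (the notation below is mine, for illustration; it is not taken from the post or the quote):

    % Aumann (1976), standard statement; notation introduced here for illustration.
    Let $(\Omega, P)$ be a finite probability space with common prior $P$, and let
    $\Pi_1, \Pi_2$ be the information partitions of agents 1 and 2. Fix an event $A$
    and a state $\omega$, and write the posteriors as
    \[
      q_i = P\bigl(A \mid \Pi_i(\omega)\bigr), \qquad i = 1, 2.
    \]
    If $q_1$ and $q_2$ are common knowledge at $\omega$ (i.e.\ each $q_i$ is constant
    on the member of the meet $\Pi_1 \wedge \Pi_2$ containing $\omega$), then $q_1 = q_2$.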

The surprising part of the theorem isn’t that people should agree once they have heard the rationale for each of their positions and deliberated on who is right. The amazing thing is that their positions should converge, even if they don’t know how the other person reached their conclusion. Robin has reached a similar result using even weaker assumptions.

If we sincerely applied this finding in real life, it would go a long way towards correcting the confirmation bias that makes us unwilling to adjust our positions in response to new information. But having a whole community take this theorem out of the lab and into real life is problematic, because using it in an imperfect and human way will leave its members vulnerable to ‘information cascades’ (HT to Geoff Anders for the observation):

An information (or informational) cascade occurs when people observe the actions of others and then make the same choice that the others have made, independently of their own private information signals. A cascade develops, then, when people “abandon their own information in favor of inferences based on earlier people’s actions”.[1] Information cascades provide an explanation for how such situations can occur, how likely they are to cascade incorrect information or actions, how such behavior may arise and desist rapidly, and how effective attempts to originate a cascade tend to be under different conditions.

There are four key conditions in an information cascade model:

  1. Agents make decisions sequentially
  2. Agents make decisions rationally based on the information they have
  3. Agents do not have access to the private information of others
  4. A limited action space exists (e.g. an adopt/reject decision).[3]
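To see how these four conditions produce a cascade, here is a minimal simulation sketch of the standard binary-signal, adopt/reject version of the model. The 75% signal accuracy, the tie-breaking rule (follow your own signal) and the function name are illustrative assumptions rather than part of the model above:

    import random

    def simulate_cascade(n_agents=20, signal_accuracy=0.75, seed=None):
        """Minimal sketch of a sequential binary-choice cascade.

        The true state is 'good' or 'bad'. Each agent privately sees a signal that
        is correct with probability signal_accuracy, observes all earlier actions,
        and chooses adopt (True) or reject (False). Ties are broken by following
        one's own signal. Once the signals revealed by earlier actions favour one
        side by two or more, every later agent copies the crowd and their own
        signal no longer affects their choice.
        """
        rng = random.Random(seed)
        true_state_good = rng.random() < 0.5
        count = 0          # net number of revealed 'good' signals minus 'bad' ones
        actions = []
        for _ in range(n_agents):
            signal_good = (rng.random() < signal_accuracy) == true_state_good
            if count >= 2:       # up-cascade: adopt regardless of own signal
                adopt = True
            elif count <= -2:    # down-cascade: reject regardless of own signal
                adopt = False
            else:                # no cascade yet: the action reveals the signal
                adopt = signal_good
                count += 1 if signal_good else -1
            actions.append(adopt)
        return true_state_good, actions

    true_state, actions = simulate_cascade(seed=1)
    print("true state good:", true_state)
    print("actions:", actions)  # typically settles into a run of identical choices

In some runs the crowd settles on the action that contradicts the true state after only a handful of agents, which is the ‘incorrect information’ case the quote above mentions.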

This is a fancy term for something we are all familiar with – ideas can build their own momentum as they move through a social group. If you observe your friends’ decisions or opinions and think they can help inform yours, but you aren’t motivated to double-check their evidence, then you might simply free-ride by copying them. Unfortunately, if everyone copies in this way, we can all end up doing something foolish, so long as the first few people can be convinced to trigger the cascade. A silly or mischievous tail can end up wagging an entire dog. As a result, producing useful original research for social groups, for instance about which movies or restaurants are best, is a ‘public good’ which we reward with social status.

Now, nobody lives by agreement theorems in real life. We are biased towards thinking that when other people disagree with us, they are probably wrong – in part because copying others makes us seem submissive or less informed. Despite this, information cascades still seem to occur all over the place.

How much worse will this be for a group that seriously thinks that every rational person should automatically copy every other rational person, and makes a virtue of not wasting the effort to confirm the data and reasoning that ultimately underlie their views? This is a bastardisation of any real agreement theorem, in which both sides should adjust their view a bit, which I expect would prevent a cascade from occurring. But mutual updating is hard and unnatural. Simply ‘copying’ the higher status members of the group is how humans are likely to end up agreeing in practice.

Imagine: Person A – a significant member of the community – comes to the group and expresses a casual opinion based on only a little bit of information. Person B listens to this, has no information of their own, and so automatically adopts A’s belief, without probing their evidence or confidence level. Person C hears that both A and B believe something, respects them both as rational Bayesians, and so adopts their position by default. A hears that C has expressed the same opinion, thinks this represents an independent confirmation of the view, and as a result of this ‘pseudo-replication’, becomes more confident. And so the cycle grows until everyone holds a baseless view.

I can think of a few ways to try to arrest such a cascade.

Firstly, you can try to apply agreement theorems more faithfully, by ensuring that when two people discuss their views they both update, up and down, rather than one simply copying the other. I am skeptical that this will happen.

Secondly, you could stop the first few people from forming incorrect opinions, or from sharing conclusions without being quite confident they are correct. That is difficult, prevents you from aggregating tentative evidence, and also increases the credibility of any remaining views that are expressed.

Thirdly, you could take theorems with a grain of salt, and make sure at least some people in a group refuse to update their beliefs without looking into the weight of evidence that backs them up, and sounding the alarm for everyone else if it is weak. In a community where most folks are automatically copying one another’s beliefs, doing this work – just in case someone has made a mistake or is not the ‘rational Bayesian’ you thought they were – has big positive externalities for everyone else.

Fourthly, if you are part of a group of people trying to follow AAT, you could all unlearn the natural habit of being more confident about an idea just because many people express it. In such a situation, it’s entirely possible that they are all relying on the same evidence, which could be little more than hearsay.

  • Daniel Yokomizo

    Person A and B should converge, so A’s beliefs need to be updated by B’s beliefs too. In this scenario A needs to consider, at least, the probability that B wouldn’t know about the information beforehand (if it’s highly probable that, were the information correct, B would have known about it, then the probability it’s correct is lowered). Both also need to consider how likely they believe the other to be a good assessor of that information: e.g. if it’s within A’s expertise and B assigns a high probability to A being an expert in that area, then B should update upwards, otherwise the update is less significant; and if B believes he himself is an expert in that area, he should consider the probability that a non-expert like A would have heard of it first; and so on. If you’re blindly copying what others say you’re not being Bayesian; you need to condition the copying.
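    A minimal sketch of what ‘conditioning the copying’ can look like: how far B moves on hearing A assert X depends on how diagnostic A’s assertion is, which is where the expertise and who-would-have-heard-it-first considerations come in. The specific numbers and likelihood ratios below are illustrative assumptions, not anything from the comment:

      def update(prior, likelihood_ratio):
          """Posterior after evidence with likelihood ratio P(evidence | X) / P(evidence | not X)."""
          odds = prior / (1 - prior) * likelihood_ratio
          return odds / (1 + odds)

      # Hearing A assert X is strong evidence if A is an expert in the relevant area...
      print(update(0.5, 10))   # ~0.91
      # ...and much weaker if A is a non-expert who plausibly picked X up as hearsay.
      print(update(0.5, 1.5))  # ~0.60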

    • Robert Wiblin

      That’s right – the cascade requires us to be updating imperfectly. That seems to be likely in practice.

      I haven’t fully thought through what updating behaviour would lead to such cascades, and would love to hear about anyone who has looked into the issue properly.

    • dmytryl

      Even more importantly, the evidence that A believes in X may well not be statistically independent of the reasons for which B believes (or doesn’t believe) in X. Worse still, the reasoning itself may have been very much non-Bayesian, simply due to not knowing how to do ‘updates’ correctly when there are cycles and loops.

      TBH, I think a lot of ‘disagreement’ can be understood by treating the expression of beliefs as something people often craft for their own self-interest. You would need to look at the actual trade-offs that people make to deduce their actionable beliefs, whenever you can. E.g. someone who claims to believe in the very high importance of X but internally does not believe in X would likely resolve some of the trade-offs between X and rather unimportant things in favour of the unimportant things.

      For a simple example: when someone claims there is a million-dollar diamond in an enclosed box, but you see that person accept $10 to put the box at a 10% risk of destruction, you can deduce that they don’t quite believe there’s a million-dollar diamond in that box. (Assuming some rationality.)
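      The implicit arithmetic, under simplifying assumptions not stated in the comment (risk neutrality, and nothing else at stake):

        # Revealed credence in the diamond example (risk-neutral simplification).
        diamond_value = 1_000_000   # claimed value of the contents, in dollars
        destruction_risk = 0.10     # risk the owner accepted
        payment = 10                # what they accepted in exchange

        # Accepting is only rational if the expected loss is at most the payment:
        #   destruction_risk * p * diamond_value <= payment
        max_credence = payment / (destruction_risk * diamond_value)
        print(max_credence)         # 0.0001, i.e. a revealed belief of at most ~1 in 10,000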

  • http://overcomingbias.com RobinHanson

    Yes, social pressure to appear to agree among people who irrationally disagree gives different outcomes from rational agreement. Yes, in situations where people sequentially make choices from small choice sets, rational agreement may not let late choosers infer the signals of early choosers. The simplest solution is to expand the visible action space. Have people make visible complementary actions, talk about their confidence in their choices, or make bets on later related outcomes.

  • Ben Klemens

    I think assumption 3 (can’t inspect private information) is the most commonly broken among the ones you list. The stylized story is usually about people looking through restaurant windows to see who is sitting at two restaurants. The people deciding where to eat don’t know which diners are sitting at a restaurant because they love it, and which are there because they just saw other people eating there.

    For many situations, especially those involving people in our social circle, we can actually ask them why they have a belief, and real humans really do say things like ‘Krugman said so, and I haven’t thought about it beyond that’. Once I know that all the agents I’m interviewing have weak private beliefs derived from one person’s data, I can’t fall victim to the cascade, but will instead just update my beliefs with that one data point.

    [Self-promotional footnote for those into herding models: I have a paper on variants of this model, though it’s more about a variant utility function than variants of the information setup. See http://ben.klemens.org/klemens-kurtosis.pdf ]

    • http://entitledtoanopinion.wordpress.com TGGP

      For those into herding models, Noah Smith is using them in support of the claim that math is good for economics in the comments here.

    • Robert Wiblin

      Yes if people check the original source, which I think they should do more often, then it will limit the potential for cascades. But it often seems not to happen.

  • MichaelJohnson

    Back in 1983, Robert Bordley wrote a paper on the group polarization effect–the tendency of groups to take more extreme positions after discussion than the members held individually before discussion. People’s positions do tend to converge, but not to the middle; they usually converge to one extreme or the other. And this occurs without any of the restrictive conditions in the information cascade model. Bordley used a Bayesian model to show why groups become more polarized.

    Bordley, R. F. (1983). A Bayesian model of group polarization. Organizational Behavior & Human Performance, 32(2), 262-274.

  • Siddharth

    Jostling for status might help reduce the rapidity of the information cascade. If B is a challenger to A’s high status, it is to his/her benefit to question A’s statement rather than take it at face value.

  • http://juridicalcoherence.blogspot.com/ Stephen R. Diamond

    Properly applied, Aumann’s Theorem helps guard against rather than promotes information cascades. The theorem formalizes the insight that in making judgments, we each function as a measuring instrument. This means we must take account of other “measuring instruments” as well as our own “measurements”, and that we should value these opinions to the extent the “instruments” and their “measurements” are high quality. The express concern it affords for the epistemic position of opinion holders can guard against blind adoption of a view based on the number of its proponents.

    Information cascades lower the credibility of the human “instruments” to the extent that given persons are subject to them. The risk in using information cascades to limit the force of Aumann’s Theorem is that of ignoring that an information cascade is nothing more or less than a bias, and it should not be employed one-sidedly against opponents’ positions – that is, without considering the biases you are subject to. I deal with this in more detail in Is Epistemic Equality a Fiction? http://tinyurl.com/6kamrjs

  • ryancarey

    Good article Rob. A few liberties seem to have been taken with the interpretation of AAT, which I will seek to clarify for readers seeking greater rigor –

    1. There is no such thing as a perfect Bayesian.
    2. AAT applies to perfect Bayesians. AAT is a mathematical theorem that cannot be ‘accepted’ or ‘rejected’ or ‘taken with a grain of salt’. What we’re actually appraising is an approximation of AAT – when two people affiliate as rationalists, they should update toward each other’s views. This proposition, which we could call ‘AAT-a’, is not proven, is controversial, and should be taken with a large dose of salt, considering that affiliating as a rationalist far from makes you a perfectly rational Bayesian.
    3. AAT doesn’t say that people should agree with each other more. AAT-a says that, and even then, it only says it about rationalists.
    4. This oft-repeated aspect of AAT is often misinterpreted: “The amazing thing is that their positions should converge, even without knowing how the other reached their conclusion.” Well, that’s only very roughly true. More precisely, it states that their positions should converge even without knowing how the other reached their conclusion, EXCEPT for knowing that it was done by Bayesian reasoning. You do have to know that they reached their conclusion using perfectly Bayesian reasoning.

    • Robert Wiblin

      Thanks Ryan.

  • dmytryl

    On the ‘Bayesian’ in general, I would like to point out that belief propagation in general (graphs with loops and cycles and bidirectional updates) is NP-complete, and the updating algorithms are very data-heavy – you can’t represent probability with a single real number, as you have to avoid cyclic updates. This puts a lot of limitations on the applicability of such maxims.

    There’s a bit of a Dunning-Kruger effect around Bayesianism, IMO – if someone describes their general belief updating – which deals with graphs that have loops and cycles – as Bayesian, that person does not know the relevant mathematics sufficiently well and has a very invalid mental model of belief updating – something akin to a graph through which real numbers propagate – and this only works correctly for a tree; for anything with loops or cycles it gets much, much hairier. I would recommend any self-proclaimed Bayesian write a belief propagation program that can handle arbitrary graphs, and work on it until it produces correct results, to get an appreciation for the importance of the subtleties, and for the extent to which getting the subtleties even slightly wrong leads to completely wrong results. It seems that some people expect that an algorithm that seems more superficially correct would give results that are less wrong, but that’s not how incorrect algorithms work.
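    A scaled-down version of that exercise: exact inference on a tiny network with an (undirected) loop, done by brute-force enumeration rather than message passing. The structure and numbers are arbitrary illustrative choices; the point is that B and C share the common cause A, so treating them as independent pieces of evidence double-counts A’s influence – the failure mode a single-number-per-edge scheme runs into:

      from itertools import product

      # Network: A -> B, A -> C, B -> D, C -> D (a loop once directions are dropped).
      # Conditional probabilities are made-up illustration values.
      p_a = 0.3
      p_b_given_a = {True: 0.9, False: 0.2}
      p_c_given_a = {True: 0.8, False: 0.1}
      p_d_given_bc = {(True, True): 0.95, (True, False): 0.6,
                      (False, True): 0.6, (False, False): 0.05}

      def joint(a, b, c, d):
          pa = p_a if a else 1 - p_a
          pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
          pc = p_c_given_a[a] if c else 1 - p_c_given_a[a]
          pd = p_d_given_bc[(b, c)] if d else 1 - p_d_given_bc[(b, c)]
          return pa * pb * pc * pd

      # Exact P(A = true | D = true): sum the joint over everything unobserved.
      num = sum(joint(True, b, c, True) for b, c in product([True, False], repeat=2))
      den = sum(joint(a, b, c, True) for a, b, c in product([True, False], repeat=3))
      print(num / den)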

    • http://www.mccaughan.org.uk/g/ gjm

      Someone who describes their belief updating as Bayesian may mean only that they try to represent their degrees of credence numerically and update them in a manner consistent with Bayes’ theorem. That doesn’t commit them to any particular algorithm for trying to achieve this. In particular, it doesn’t commit them to doing sum-product message passing and pretending every graph is a tree.

      Using more sophisticated algorithms doesn’t mean not representing probability as a real number, even if those algorithms attach a bunch of other numbers to each proposition.

      I like your proposed exercise, though.

      • dmytryl

        > may mean only that they try to represent their degrees of credence numerically and update them in a manner consistent with Bayes’ theorem. That doesn’t commit them to any particular algorithm for trying to achieve this.

        The NP completeness is not a property of some particular algorithm; you can convert any NP-complete problem into a belief graph, and the correct solution to that graph (one everywhere consistent with Bayes’ theorem and the axioms) would give the solution to that other problem.

        Another issue is that complexity grows fast enough in practice to make attaining accuracy (or doing anything useful at all) a contest between heuristics that can often be about as remote from Bayes’ theorem as computer vision is from Maxwell’s equations.

        An even worse issue is that the graphs are partial, meaning that where there is a valid inference raising the probability of a proposition and a valid inference lowering it (or even a subtle relation such that the two perfectly balance out), it may be that only one is present in the graph – a big issue when selfish agents inject nodes into your graph.

        The (self-labelled) Bayesians say “we can measure epistemic rationality by comparing the rules of logic and probability theory to the way that a person actually updates their beliefs”, while in actuality the “rules” are a set of relational constraints that are very non-trivial to meet accurately (especially when you only have part of the graph), rather than a way of updating values. You can’t even check how accurately the constraints are conformed to, because you only have part of the graph and you want to approximate the values of the hypothetical whole graph of all valid inferences. What you can actually do is try to infer various testable beliefs about the world and test them.

      • http://www.mccaughan.org.uk/g/ gjm

         > The NP completeness is not a property of some particular algorithm

        Of course it isn’t. (It couldn’t be; that’s not what NP-completeness means.) I’m not sure what I said that gave the impression that I think otherwise. (For the avoidance of doubt: I do not think that declaring oneself to be a “Bayesian” means claiming to have an efficient algorithm for doing perfectly accurate probability updates in difficult cases.)

        I hope it isn’t news to anyone who calls themselves Bayesian that it’s possible to be misled by having only part of the evidence. Isn’t that absolutely commonplace and obvious? Maybe not; I tend to overestimate how much is commonplace and obvious. But it certainly isn’t part of what I mean by “Bayesian”, or part of what anyone else seems to mean by it, that one always has all the relevant evidence.

        What I mean by calling someone a “Bayesian” is roughly this: (1) They find the language and techniques of probability theory appropriate for talking about beliefs and inferences. (2) They hold that, ideally, beliefs should be updated consistently with Bayes’ theorem. (3) In cases where it’s clear roughly what that actually means in practice, they attempt to adjust their beliefs accordingly. (For instance, there really are plenty of situations where the naive arithmetic is pretty much exactly what you need.)

        All of that is consistent with being terribly naive and thinking that you’re guaranteed to be reasoning well if you do a bit of arithmetic, or with being very sophisticated and knowing a whole lot about machine learning and graphical models and being extremely cautious about applying any simple belief-updating algorithm. Even the first of those — though of course it has serious pathologies — seems to me to be preferable to simply having no idea that good reasoning has anything to do with mathematics. I would guess (though I have no statistics and getting useful ones would be very hard) that people who describe themselves as “Bayesians” are in general better reasoners and hold more accurate beliefs than those who don’t. It might be that the very most expert avoid that label for fear of being thought to endorse an over-naive version; again I have no statistics nor really any anecdotal evidence; how about you?

      • dmytryl

        > For the avoidance of doubt: I do not think that declaring oneself to be a “Bayesian” means claiming to have an efficient algorithm for doing perfectly accurate probability updates in difficult cases.

        No, of course not. The issue is that there are silly expectations, such as assuming the Agreement Theorem would be applicable, or roughly applicable, as well as other cases which would require one to have a more efficient and more accurate algorithm than is at all plausible (of which the person expecting this is likely simply unaware).

        > I hope it isn’t news to anyone who calls themselves Bayesian that it’s possible to be misled by having only part of the evidence.

        Well, people tend to maintain some sort of equilibrium: if they expect to be more correct by being Bayesian, they relax their rules about not trusting partial evidence or partial inference, or even deem such practical rules not Bayesian. It seems to me that their general idea is that you should ‘update’ more, including in precisely the cases where you probably ought to ‘update’ less.

        Regardless of whether self-described Bayesians are better than the population as a whole (or than IQ-matched controls), the topic is people who expect that the Agreement Theorem should hold better, for which I propose naivete as the explanation.

        Furthermore, relatively high optimism with regard to the capabilities of an AI, and low estimates of the computational power required for intelligence, seem to corroborate the naivete hypothesis.

      • http://www.mccaughan.org.uk/g/ gjm

        Yeah, I agree that interpreting AAT as anything like “rational people should always agree with one another” is indicative of serious naivete, and the other things you mention are certainly failure modes for people who call themselves Bayesians. I remain unconvinced that using the term is actually a sign of ignorance or foolishness, but we probably aren’t going to be able to resolve that one.

  • Michael Wengler

    How much real disagreement is there among rational people? Most of the disagreement I am aware of is in politics. But there, disagreement is not in violation of AAT. People have different values, and even where values are similar, weightings are different. So all may value mutual help with support or education, and all may value economic autonomy, but some will value one more than the other and hence support either cutting or raising government support of education or welfare.

    How much real disagreement is there among physicists or chemists or historians or even economists? Drastically less than there is agreement. Sure, on the margin it looks like economists differ on something like whether Keynesian stimulus does what Keynesians think, but most of that disagreement is values: the level of proof or evidence required differs depending on how you weigh the need for help against the need for economic autonomy. I’d also submit it is hard to remove value judgements from discussions of debt in fiscal policy.

    So how much disagreement is there really when differing value assumptions are removed from the discussion?  

    I am presuming that even if you are a moral realist, you recognize that conclusions about which moral values are the real ones are probably not covered by AAT.

  • Michael Blume

    Nitpick: They don’t just have to both be rational, they have to have common knowledge of both being rational, which seems like a *much* higher bar.

    • Tim Tyler

       …and instrumental rationality isn’t enough.  They need to be truth-seekers – and truth spreaders.  That’s a very weird and unbiological category of folk.

  • http://thomblake.mp thom blake

    I don’t understand the relevance of the youtube link.

    • http://www.mccaughan.org.uk/g/ gjm

      I think Robert is suggesting that the popularity of “Gangnam Style” is best understood as the result of an information cascade rather than as lots of people separately responding to the merits of the work.

  • Epiphany

    I think the solution to this is to create an argument map that is wiki-like in that anyone can edit it, meant for every topic under the sun. It could do things like presenting the conclusion most likely to be true and an outline of the arguments involved so people can get a feel of the complexity or choose to drill down.  Then, instead of just agreeing or disagreeing, and instead of JUST exchanging information with each other, we could go view the argument map.  If we disagree with it, we can update the map – everyone will then be able to update themselves (which is so many thousands of times more efficient than convincing one another to update one-on-one) and we will all have the benefit of getting information from the whole community.

    I am a web developer who would be interested in assisting with such a project (either selecting, editing or creating an open source argument map for this), assuming that other people want to get involved.  If you want to do this, PM me – Epiphany on LessWrong.

  • http://www.yboris.com Boris

    What about expressing not just your degree of belief in a claim but also your source(s)? If you say “I didn’t know what to think, but authority X said he was pretty sure”, it will then be passed down as “I didn’t know, but my friend said he heard authority X say he was pretty sure” … soon enough you have a chain of “a friend of a friend of a friend said that he heard …”, thereby undermining the extent to which the person who hears it updates their belief.