Thresholds Hurt Oddballs

To estimate the quality of something using a bunch of noisy clues, people often set minimum “deal-breaker” thresholds for each clue, and then reject candidates who fall below any threshold. For example, in dating:

My 20-year-old daughter informed me that she recently dumped a guy because when she asked him the meaning of a word, he said, “Are you serious?”  “That was it. It’s like a huge test for me. … It told me he felt intellectually superior to me,” explained Jenna. … Whoa, give the guy a break, I thought. …

[In] “Love and the Litmus Test,” an article that appeared here 28 years ago [I] essentially justified the kind of subjective, quick and seemingly irrational judgment that Jenna had made. … An “insignificant gesture, an offhand comment” or a plaid sports coat can alter destiny. … For my daughters, ineptitude in the kitchen is almost a deal-breaker. … “The check shouldn’t even hit the table if you’re out to dinner — he should grab it out of the waitress’s hand. … If a guy ever picks up a phone during a meal, I would never talk to him again. …  Irrelevant [facebook] wall posts tells me the guy has too much time on his hands. (more)

Now consider the following two-dimensional space, where clues are linear correlates of a linear quality.

[Figure: thresholdbias]

Red lines A, B, C show three different clue cutoffs, and the blue region shows the points that satisfy all three cutoffs. If we consider directions perpendicular to the better vs. worse quality axis, we can see that even though being “odd”, i.e., away from the central quality axis, does not hurt quality, the deal-breaker approach to selecting candidates is biased against odd candidates. Plain, i.e., not odd, candidates are acceptable even when relatively low in quality.

In general, instead of letting each noisy clue be a potential deal-breaker, it is usually better to weigh your clues together (e.g., via a weighted average).
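To make the contrast concrete, here is a minimal Python sketch of the two selection rules. Everything in it is an illustrative assumption rather than something taken from the figure: there are only two clues, they are noiseless, quality is taken to be their simple mean, and the names (`Candidate`, `passes_deal_breakers`, `passes_weighted_average`) and the cutoff of 3.0 are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate described by two clue scores (hypothetical 0-10 scale)."""
    name: str
    clue_x: float
    clue_y: float


def quality(c: Candidate) -> float:
    # Assumption: quality is the simple mean of the two clues.
    return (c.clue_x + c.clue_y) / 2


def passes_deal_breakers(c: Candidate, cutoff: float = 3.0) -> bool:
    # Deal-breaker rule: every clue must clear its own cutoff.
    return c.clue_x >= cutoff and c.clue_y >= cutoff


def passes_weighted_average(c: Candidate, weights=(0.5, 0.5), cutoff: float = 3.0) -> bool:
    # Weighing rule: combine the clues first, then threshold the combined score.
    score = weights[0] * c.clue_x + weights[1] * c.clue_y
    return score >= cutoff


# A plain candidate sits on the central quality axis; an odd one sits far off it.
plain = Candidate("plain", clue_x=3.0, clue_y=3.0)
odd = Candidate("odd", clue_x=8.0, clue_y=2.0)

for c in (plain, odd):
    print(c.name, quality(c), passes_deal_breakers(c), passes_weighted_average(c))
```

Under these assumptions the odd candidate has the higher quality, yet only the deal-breaker rule rejects it, because a single clue falls below its cutoff. The weighing rule accepts both candidates.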

So why do some women claim that they combine clues via deal-breaker thresholds instead of via weighing clues? My guess: such women are bragging about their selectivity in a way that is relatively verifiable. It would be harder to verify a claim that they set a high threshold on a complex weighing of many factors.

The same bias applies to regulations, which typically consist of a set of minimal requirements along many different dimensions. Such regulations are easier to express, monitor, and prosecute, but as with dating they are also biased against odd people and ventures. Count this as another way to see that regulation discourages innovation.

Thanks to Alex for talking this through with me.

  • http://manwhoisthursday.blogspot.com Thursday

    Yes, if you are an oddball, you will have a hard time finding a mate, even if you have some very attractive traits. Which is why learning how to play the game by the rules that normal people set is so crucial. There are smart times to break the rules and dumb times to break the rules and it is good to at least know where the lines are, so you can make an informed decision as to when you are going to break the rules.

  • Hrm

    This isn’t a problem with regulation per se, but with inflexible, not-easily-changed regulation. All regulation should follow Holmes: “The Law should be stable but never stand still.”

    As far as the dating thing goes, this could easily indicate something else: lack of selectivity in other areas. For some people, those irrelevant things help filter out candidates where the person doing the rejecting doesn’t have (or refuses to use) meaningful criteria. For people (like myself, for example) who have non-trivial, very specific requirements, we can’t use those trivial things or we’d take our pool from dozens or single digits down to zero. My recommendation to those using trivial filters: set more rigid, non-trivial standards and have the guts to enforce them.

  • cccc

    Type 1 vs. Type 2 error. Robin doesn’t really discuss the benefits of high or low thresholds. When Type 1 error (false positive) is what you want to limit, have high thresholds.

    If Type 2 errors (false negatives) harm you too much, rely on holistic weighting, low thresholds, etc.

  • OhioStater

    Outstanding article. As an aside, girls are more likely to do this than men.

  • http://www.rationalmechanisms.com/lexicon DWCrmcm

    “There even seem to be cultural differences between the two of them: Kristen remembers a time when not everything could be found on the Internet, Jenna does not.”

    There it is.

    Context.

    Dating is a contextual experience.
    Out of context, out of luck.

    Great post Robin

  • gwern

    …Or it’s a cheap frugal heuristic like take-the-best.

    But no, I’m sure it’s all about verifiable social signaling. Obviously!

  • NIck Walker

    Your visual illustration has two inferences:
    1. distance from plain has greater likelihood of slipping below a threshold

    2. distance from plain forces the oddball to overcompensate on another quality signal to achieve the same outcome as the plain person, such as spending more on clothing, cars, or education.

  • PJ Eby

    I’m not sure why you seem to be implying that this has to do with mating or signaling; it’s the same algorithm we use to do almost any selection process with a lot of candidates. That is, we raise our thresholds as a way of narrowing down the field of choices.

    In other words, this sort of behavior in dating is something we’d expect to see if people simply adjust their thresholds to make choices.

    Sure, there may be signaling benefits to mentioning your high standards, and indeed PUAs recommend guys put out similar signals. However, when you pick what you want to have for dinner at a new restaurant with a bunch of unfamiliar dishes, my guess is that you’re going to start by raising your thresholds to rule out as many items as possible, in order to get to a manageable number of choices.

    Indeed, for this to make sense as signaling, you have to *first* assume that others will understand that more choices equals higher selectivity… otherwise, you’re just signaling that you’re a picky, petty person! Only if people implicitly understand that high thresholds are a natural response to having lots of choices, does the communication have value as a signal.

  • http://hanson.gmu.edu Robin Hanson

    Hrm, the issue here is the form of regulatory rules, not the rate of change.

    gwern, averaging clues is also cheap, fast, and frugal.

    PJ, I claim we actually usually average clues, and only threshold the average.

    • PJ Eby

      It is improbable that brains would have evolved as you described for processing comparisons where there are too many items to compare in short-term memory. It would be necessary at least to keep track of a currently-most-favored choice and systematically compare each other item to that choice, on all relevant perceptual features. This is extremely implausible machinery to be built by natural selection.

      On the other hand, a simple feedback loop suffices to implement “raise individual evaluation thresholds until only a few choices remain”… and this hypothesis would predict that people with lots of choices might overshoot their threshold and fail to pick *any* option at all. (Which, as we know, happens quite a bit.)

      In contrast, a mechanism that was based on “rationally” computing a weighted average would have no mechanism-derived reason to overshoot or reject making a choice at all: such a mechanism could simply scan all the choices and just drop anything that didn’t come out ahead of the running best choice.

      But it takes mental effort to perform such a computation; the default algorithm of simply raising your (individual) standards is much easier, especially since it can be computed in parallel by the parts of your brain performing the individual evaluations. The measurements that bang on a threshold get automatically brought to the forefront of consciousness, and you rule it out.

      In contrast, your hypothesized method requires the evaluations to be centralized (i.e. probably consciously), and is much more difficult to do in such a way that items can be dropped from evaluation early on.

      In short: parallel raising of thresholds is faster, requires fewer costly cognitive resources, and makes better use of the hardware we run on. Plus, it’s vastly easier to reach in evolution’s design space. The odds are therefore very heavily against your hypothesis being correct for an evolved brain, especially given what we know about contrast and anchoring in perception.

    • http://hanson.gmu.edu Robin Hanson

      PJ, it really doesn’t take much more processing or memory to execute (< c (+ (* w1 x1) (* w2 x2) (* w3 x3) ...)) than to execute (and (< c1 x1) (< c2 x2) (< c3 x3) ...).

    • http://www.gwern.net/ gwern

      > gwern, averaging clues is also cheap fast and frugal.

      Recognition is much easier than averaging and multiplication. People are not computers and FLOPS are not how they think, as anyone watching someone ponder prices and taxes at a store would instantly recognize.

      If a weighted linear model is so easy for humans to do, why do we see things like http://lesswrong.com/lw/3gv/statistical_prediction_rules_outperform_expert/ ?

  • http://michaelkenny.blogspot.com Mike Kenny

    This post reminds me of the story about Van Halen demanding only a certain color M and M candy in a bowl, or something, on their rider. If they didn’t have that then I guess they would leave the venue. Possibly telling the story allowed them to brag about selectivity, similar to women with litmus tests.

    IIRC, I read somewhere VH used the M and M thing as a canary in the coal mine, too, since if that was missing then they believed there was a chance other more important stuff was missing from the rider, and perhaps that is what women do with litmus tests too. Weirdness is often bad, isn’t it? A deviation from the robust norm?

    • Unnamed

      Van Halen’s show was extremely technically demanding, requiring the venue to get lots of details right, so they used the M&Ms thing as a simple test of whether the venue followed the detailed specifications in the contract.

  • Psychohistorian

    I think the dating example doesn’t work quite the way you think it does, but it does actually highlight the problem with regulation.

    For the first example, the issue was probably that the guy was arrogant and talked down to her, and he was probably also inconveniently smarter than her. If this trait had not manifested itself at all besides him being surprised when she didn’t know the meaning of the word, and she’d liked the guy otherwise, she’d probably have overlooked it. If she had been really, really into him for other reasons, she’d probably have overlooked it. She uses that particular story as a way of expressing what was wrong with the guy; she doesn’t use it because it was an act so bad that it made an instant dealbreaker. Dealbreakers are often used only if the guy is on the edge or if they seem symptomatic of a larger problem.

    Of course, regulations and their enforcement miss this depth, and rules simply get enforced because they are the rules. On the other hand, it’s a lot easier to control how your business does its recordkeeping than it is to control your own personal quirks on a date, and strict rules are far, far, far easier to apply (and litigate) than vague standards.

  • rapscallion

    Your simple illustration seems to imply that all types are uniformly distributed, but this is probably not the case. High quality oddballs are often far less common than high quality “plains,” so weeding them out based on stricter criteria might make more sense.

  • http://www.angryblog.org Brian Moore

    Liz Lemon is disappointed to hear about this post.

  • arch1

    So why do some women claim that they combine clues via deal-breaker thresholds instead of via weighing clues?

    Because some women really do use deal-breaker thresholds, at least some of the time? (e.g. because those are easier to apply)

    Because they don’t fully understand their real algorithm?

    Because they don’t have the time/inclination/ability/access to drawing tools to describe their real algorithm?

    Because they’re really describing context-specific thresholds which get misinterpreted as context-free thresholds?

    Because in some cases quantization renders weighting irrelevant (if the guy’s a convicted murderer, few women would bother weighing this against other factors)?

    • http://www.rationalmechanisms.com/lexicon DWCrmcm

      Because they’re really describing context.

  • blink

    Are you saying that women are lying about applying thresholds, or that they do apply them, which is suboptimal? For men, what are the behavioral implications if the thresholds are believed?

    If thresholds are common, this could give social norms considerably greater power which may bear on the forager vs. farmer line of reasoning or explain why parents must over-parent in public even when others pay little attention.

    • http://hanson.gmu.edu Robin Hanson

      In order to work as a credible signal, it cannot be too easy to fake using thresholds. So surely those who claim to do so often actually do so.

  • Michael Kirkland

    Consider the possibility that regulation does include a weighted component. Violations are generally punished with a fine, not a forced shutdown of the corporation, thus breaking them marginally decreases the fitness of the corporation, allowing more profitable corporations to break more regulations than marginal ones.
