Decision Theory Remains Neglected

Feb 02, 2020

Back in ’84, when I first started to work at Lockheed Missiles & Space Company, I recall a manager complaining that their US government customer would not accept using decision theory to estimate the optimal thickness of missile walls; they insisted instead on using a crude heuristic expressed in terms of standard deviations of noise. Complex decision theory methods were okay to use for more detailed choices, but not for the biggest ones.

In his excellent 2010 book How to Measure Anything, Douglas W. Hubbard reports that this pattern is common:

Many organizations employ fairly sophisticated risk analysis methods on particular problems; … But those very same organizations do not routinely apply those same sophisticated risk analysis methods to much bigger decisions with more uncertainty and more potential for loss. …
If an organization uses quantitative risk analysis at all, it is usually for routine operational decisions. The largest, most risky decisions get the least amount of proper risk analysis. … Almost all of the most sophisticated risk analysis is applied to less risky operational decisions while the riskiest decisions—mergers, IT portfolios, big research and development initiatives, and the like—receive virtually none.

In fact, while standard decision theory has long been extremely well understood and accepted by academics, most orgs find a wide array of excuses to avoid using it to make key decisions:

For many decision makers, it is simply a habit to default to labeling something as intangible [=unmeasurable] … committees were categorically rejecting any investment where the benefits were “soft.” … In some cases decision makers effectively treat this alleged intangible as a “must have” … I have known managers who simply presume the superiority of their intuition over any quantitative model …
What they seem to take away from these experiences is that to use the methods from statistics one needs a lot of data, that the precise equations don’t deal with messy real-world decisions where we don’t have all of the data, or that one needs a PhD in statistics to use any statistics at all. … I have at times heard that “more advanced” measurements like controlled experiments should be avoided because upper management won’t understand them. … they opt not to engage in a smaller study—even though the costs might be very reasonable—because such a study would have more error than a larger one. …
Measurements can even be perceived as “dehumanizing” an issue. There is often a sense of righteous indignation when someone attempts to measure touchy topics, such as the value of an endangered species or even a human life. … has spent much time refuting objections he encounters—like the alleged “ethical” concerns of “treating a patient like a number” or that statistics aren’t “holistic” enough or the belief that their years of experience are preferable to simple statistical abstractions. … I’ve heard the same objections—sometimes word-for-word—from some managers and policy makers. …
There is a tendency among professionals in every field to perceive their field as unique in terms of the burden of uncertainty. The conversation generally goes something like this: “Unlike other industries, in our industry every problem is unique and unpredictable,” or “Problems in my field have too many factors to allow for quantification,” and so on. …
Resistance to valuing a human life may be part of a fear of numbers in general. Perhaps for these people, a show of righteous indignation is part of a defense mechanism. Perhaps they feel their “innumeracy” doesn’t matter as much if quantification itself is unimportant, or even offensive, especially on issues like these.

Apparently most for-profit firms could make substantially more profits if only they’d use simple decision theory to analyze key decisions. Execs’ usual excuse is that key parameters are unmeasurable, but Hubbard argues convincingly that this is just not true. He suggests that execs seek to excuse poor math abilities, but that seems implausible as an explanation to me.

I say that their motives are more political: execs and their allies gain more by using other, more flexible decision-making frameworks for key decisions, frameworks with more wiggle room to help them justify whatever decision happens to favor them politically. Decision theory, in contrast, threatens to more strongly recommend a particular hard-to-predict decision in each case. As execs gain when the orgs under them are more efficient, they don’t mind decision theory being used down there. But they don’t want it up at their level and above, for decisions that determine whether they and their allies win or lose.

I think I saw the same sort of effect when trying to get firms to consider prediction markets; those were okay for small decisions, but for big ones they preferred estimates made by more flexible methods. This overall view is, I think, also strongly supported by the excellent book Moral Mazes by Robert Jackall, which goes into great detail on the many ways that execs play political games while pretending to promote overall org efficiency.

If I ever did a book on The Elephant At The Office: Hidden Motives At Work, this would be a chapter.

Below the fold are many quotes from How to Measure Anything:

the word “intangible” has also come to mean utterly immeasurable in any way at all, directly or indirectly. It is in this context that I argue that intangibles do not exist—or, at the very least, could have no bearing on practical decisions. …

For many decision makers, it is simply a habit to default to labeling something as intangible …

committees were categorically rejecting any investment where the benefits were “soft.” …

major investments were approved with no plans for measuring their effectiveness after they were implemented. …

In some cases decision makers effectively treat this alleged intangible as a “must have” so that the question of the degree to which the intangible matters is never considered in a rational, quantitative way. …

I have known managers who simply presume the superiority of their intuition over any quantitative model …

Computing and using the economic value of measurements to guide the measurement process is, at a minimum, where a lot of business measurement methods fall short. …

What they seem to take away from these experiences is that to use the methods from statistics one needs a lot of data, that the precise equations don’t deal with messy real-world decisions where we don’t have all of the data, or that one needs a PhD in statistics to use any statistics at all. …

I have at times heard that “more advanced” measurements like controlled experiments should be avoided because upper management won’t understand them. …

they opt not to engage in a smaller study—even though the costs might be very reasonable—because such a study would have more error than a larger one. …

Usually things that seem immeasurable in business reveal themselves to much simpler methods of observation, once we learn to see through the illusion of immeasurability. …

The clarification chain is just a short series of connections that should bring us from thinking of something as an intangible to thinking of it as tangible. First, we recognize that if X is something that we care about, then X, by definition, must be detectable in some way.… if this thing is detectable, then it must be detectable in some amount. If you can observe a thing at all, you can observe more of it or less of it. …

I ask who thinks the sample is “statistically significant.” Those who remember something about that idea seem only to remember that it creates some kind of difficult threshold that makes meager amounts of data useless …

“If you don’t know what to measure, measure anyway. You’ll learn what to measure.”… the objection “A method doesn’t exist to measure this thing” is never valid. …

measurements can even be perceived as “dehumanizing” an issue. There is often a sense of righteous indignation when someone attempts to measure touchy topics, such as the value of an endangered species or even a human life, …

Meehl … has spent much time refuting objections he encounters—like the alleged “ethical” concerns of “treating a patient like a number” or that statistics aren’t “holistic” enough or the belief that their years of experience are preferable to simple statistical abstractions. … I’ve heard the same objections—sometimes word-for-word—from some managers and policy makers. …

Four Useful Measurement Assumptions: It’s been measured before. You have far more data than you think. You need far less data than you think. Useful, new observations are more accessible than you think. …

I’ve noticed that there is a tendency among professionals in every field to perceive their field as unique in terms of the burden of uncertainty. The conversation generally goes something like this: “Unlike other industries, in our industry every problem is unique and unpredictable,” or “Problems in my field have too many factors to allow for quantification,” and so on. I’ve done work in lots of different fields, and some individuals in most of these fields make these same claims. So far, each one of them has turned out to have fairly standard measurement problems not unlike those in other fields. …

When managers think about measuring productivity, performance, quality, risk, or customer satisfaction, it strikes me as surprisingly rare that the first place they start is looking for existing research on the topic. …

When I asked bank managers what decisions these reports supported, they could identify only a few cases where the elective reports had, or ever could, change a decision. Perhaps not surprisingly, the same reports that could not be tied to real management decisions were rarely even read. …

The data on the dashboard was usually not selected with specific decisions in mind based on specific conditions for action. …

So the question is never whether a decision can be modeled or even whether it can be modeled quantitatively. …

Even just pretending to bet money significantly improves a person’s ability to assess odds. In fact, actually betting money turns out to be only slightly better than pretending to bet. …

Why is it that about 5% of people are apparently unable to improve at all in calibration training? Whatever the reason, it often turns out not to be that relevant. Virtually every single person we ever relied on for actual estimates was in the first two groups and almost all were in the first ideally calibrated group. Those who seemed to resist any attempt at calibration were, even before the testing, almost never considered to be the relevant expert or decision maker for a particular problem. …

there is apparently a strong placebo effect in many decision analysis and risk analysis methods. Managers need to start to be able to tell the difference between feeling better about decisions and actually having better track records over time. …

Many organizations employ fairly sophisticated risk analysis methods on particular problems; … But those very same organizations do not routinely apply those same sophisticated risk analysis methods to much bigger decisions with more uncertainty and more potential for loss. …

If an organization uses quantitative risk analysis at all, it is usually for routine operational decisions. The largest, most risky decisions get the least amount of proper risk analysis. …

Almost all of the most sophisticated risk analysis is applied to less risky operational decisions while the riskiest decisions—mergers, IT portfolios, big research and development initiatives, and the like—receive virtually none…

When I ran the macro that computed the value of information for each of these variables, I began to see this pattern: The vast majority of variables in almost all models had an information value of zero. That is, the current level of uncertainty about most variables was acceptable, and no further measurement was justified. The variables that had high information values were routinely those that the client never measured. In fact, the high-value variables often were completely absent from previous business cases. (They excluded chance of project cancellation or the risk of low user adoption.) The variables that clients used to spend the most time measuring were usually those with a very low (even zero) information value (i.e., it was highly unlikely that additional measurements of the variable would have any effect on decisions). …

At the time of this writing, however, I’ve applied this same test to more than 60 additional projects and I found out that this effect is not limited to IT. I noticed the same phenomena arise in projects relating to research and development, military logistics, the environment, venture capital, facilities expansion, and the CGIAR sustainable farming model. …

First people measure what they know how to measure or what they believe is easy to measure.… the things you measured the most in the past have less uncertainty, and therefore less information value, when you need to estimate them for future decisions.…

Managers might tend to measure things that are more likely to produce good news. After all, why measure the benefits if you have a suspicion there might not be any?…

if you aren’t computing the value of a measurement, you are very likely measuring some things that are of little or no value and ignoring some high-value items.…

The 80 or more major risk/return analyses I’ve done in the past 20 years consisted of a total of over 7,000 individual variables, or an average of almost 90 variables per model. Of those 7,000 variables, a little over 180 (about 2 per model) required further measurement according to the information value calculation. Most of these, about 150, had to be decomposed further to find a more easily measured component of the uncertain variable. Other variables offered more direct and obvious methods of measurement, for example, having to determine the gas mileage of a truck on a gravel road (by just driving a truck with a fuel-flow meter) or estimating the number of bugs in software (by inspecting samples of code). But almost a third of the variables that were decomposed required no further measurement after decomposition. In other words, about 25% of the high-value measurements were addressed with decomposition alone.…
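As a quick sanity check on the figures in this quote, the arithmetic can be worked through directly. The counts are Hubbard's rounded numbers ("80 or more", "over 7,000", "a little over 180", "about 150"), so the results are approximate. Reading "almost a third" as exactly one third is my assumption, which makes the last figure an upper bound.

```python
# Rounded counts taken from the quote above; all are approximate.
models = 80                 # "80 or more major risk/return analyses"
variables = 7000            # "over 7,000 individual variables"
needing_measurement = 180   # "a little over 180" had nonzero information value
decomposed = 150            # "about 150" had to be decomposed further

# Average model size: 87.5, i.e. "almost 90 variables per model".
per_model = variables / models

# High-value variables per model: 2.25, i.e. "about 2 per model".
measured_per_model = needing_measurement / models

# Share of all variables that justified any further measurement (about 2.6%).
share_worth_measuring = needing_measurement / variables

# Assumption: treat "almost a third" as exactly 1/3, giving an upper bound
# of 50 variables that were resolved by decomposition alone.
resolved_by_decomposition = decomposed / 3

# Upper bound of roughly 28%, consistent with the quoted "about 25%".
share = resolved_by_decomposition / needing_measurement

print(per_model, measured_per_model)
print(round(share_worth_measuring, 3), round(share, 3))
```

The striking number is the third one: on these figures, fewer than 3 in 100 modeled variables were worth measuring further at all.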

the EVPI is an upper limit on what you should be willing to spend even theoretically. But the best measurement expenditure is probably far below this maximum. As a ballpark estimate, I shoot for spending approximately 10% of the EVPI on a measurement and, depending on the circumstances, sometimes even as low as 2%. I use this estimate for three reasons: The EVPI is the value of perfect information. …
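To make the EVPI idea concrete, here is a minimal sketch for a toy two-action decision. The payoffs, probabilities, and function names are hypothetical illustrations of mine, not from the book. It uses the standard definition of EVPI, the expected payoff when you can pick the best action after learning the true state, minus the expected payoff of the best action chosen under current uncertainty. It then applies the 10% and 2% rules of thumb from the quote.

```python
from typing import Dict, List


def expected_value(payoffs: List[float], probs: List[float]) -> float:
    """Probability-weighted average payoff of one action across states."""
    return sum(p * v for p, v in zip(probs, payoffs))


def evpi(payoff_table: Dict[str, List[float]], probs: List[float]) -> float:
    """Expected value of perfect information for a discrete decision.

    payoff_table maps each action to its payoff in each state;
    probs holds the current probability of each state.
    """
    # Best we can do now: commit to the single action with the highest EV.
    best_without_info = max(
        expected_value(payoffs, probs) for payoffs in payoff_table.values()
    )
    # With perfect information we would pick the best action state by state.
    best_with_info = sum(
        probs[s] * max(payoffs[s] for payoffs in payoff_table.values())
        for s in range(len(probs))
    )
    return best_with_info - best_without_info


# Hypothetical example: a project that succeeds with probability 0.6.
probs = [0.6, 0.4]  # [success, failure]
payoff_table = {
    "invest": [500_000.0, -400_000.0],
    "decline": [0.0, 0.0],
}

value = evpi(payoff_table, probs)   # about 160,000 in this toy case
typical_budget = 0.10 * value       # the "approximately 10%" rule of thumb
low_budget = 0.02 * value           # "sometimes even as low as 2%"

print(round(value), round(typical_budget), round(low_budget))
```

In this toy case, investing has an expected value of 140,000 under current uncertainty, while a perfectly informed decision maker would expect 300,000, so perfect information is worth about 160,000. The rule of thumb in the quote would then suggest spending roughly 16,000 on measurement, and perhaps as little as 3,200.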

respondents, those of us who measure such things as the value of life and health have to face a misplaced sense of righteous indignation. Some studies have shown that about 25% of people in environmental value surveys refused to answer on the grounds that “the environment has an absolute right to be protected” regardless of cost. …

Resistance to valuing a human life may be part of a fear of numbers in general. Perhaps for these people, a show of righteous indignation is part of a defense mechanism. Perhaps they feel their “innumeracy” doesn’t matter as much if quantification itself is unimportant, or even offensive, especially on issues like these. …

Bootstamp

May 15, 2023

Why do you believe this? I feel like most of my own resistance (in small and large, personal and professional affairs) comes solely from Goodhart.

Very hopeful of being convinced to the contrary.

kurt9

That's true, and that's why I use them more for predictive purposes (e.g. are we going to have war with Iran? Is coronavirus going to turn into an SF style plague? etc.) than all of the talking heads. The talking heads generally have them up their asses.
