Fig Leaf Models

While some argue that economic theorists should act more like biology and ecology theorists, some geologists now argue the reverse: that environmental theorists should act more like economic theorists.  A New York Times book review of "Useless Arithmetic: Why Environmental Scientists Can’t Predict the Future" says

When coastal engineers decide whether to dredge sand and pump it onto an eroded beach, they use mathematical models to predict how much sand they will need, when and where they must apply it … Orrin H. Pilkey, … recommends … just dredge up a lot of sand and dump it on the beach willy-nilly. This "kamikaze engineering" might not last very long, he says, but projects built according to models do not usually last very long either, and at least his approach would not lull anyone into false mathematical certitude.  …

Dr. Pilkey and his daughter Linda … have expanded this view into an overall attack on the use of computer programs to model nature. … Their book … originated in a seminar Dr. Pilkey organized at Duke to look into the performance of mathematical models used in coastal geology.  …  seminar participants … [concluded] that erroneous assumptions, fudge factors and the reluctance to check predictions against unruly natural outcomes produce models with, as the authors put it, "no demonstrable basis in nature." … 


Given the problems with models, should we abandon them altogether? Perhaps, the authors say. Their favored alternative … [has] policymakers … make constant observations in the field, altering their policies as conditions change.

But that approach has drawbacks, among them requirements for assiduous monitoring, … Besides, they acknowledge, people seem to have such a powerful desire to defend policies with formulas (or "fig leaves," as the authors call them), that managers keep applying them, long after their utility has been called into question. 

So the authors offer some suggestions … Modeling should be transparent. That is, any interested person should be able to see and understand how the model works – what factors it weighs heaviest, what coefficients it includes, what phenomena it leaves out, and so on. Also, modelers should say explicitly what assumptions they make. 

And instead of demanding to know exactly how high seas will rise or how many fish will be left … we should seek to discern simply whether seas are rising, fish stocks are falling … Models should be regarded as producing "ballpark figures," they write, not accurate impact forecasts.

Economic theorists have long followed this advice, avoiding making and believing large complex integrated models in favor of sign predictions from models simple enough to be understandable.  And Pilkey’s suggestion that complex models can’t be trusted echoes my similar recent lament about complex economic models. 

  • Stuart Armstrong

    Simple models are more understandable and less prone to hidden biases, but that’s no reason for them to be more accurate. Having them understandable by educated laymen is very dangerous – people can understand them, find them plausible, and then decide they are true. And then defend the model even if it turns out to be false.

    Think of a political idea X, say “more guns reduce crime” (“more guns increase crime” works just as well). Since people understand the reasoning behind this idea, they are far less likely to jettison it if it’s wrong. The rational reaction to the headline “crime goes up as more guns flood the country” is “I should diminish my estimation of X”. The reaction of people favourable to X is generally “X is true, but other factors are in play here”.

    We should beware the attraction of simple models; they are the hardest to root out once they have a hold on people’s minds.

    Compare that with a complex, hard-to-follow, socio-economic-political-moral model with hundreds of variables interacting in non-intuitive ways. If that model predicts a near-certain decrease in crime, and crime goes up, then the public reaction is the correct one: the model is wrong.

    As far as the public, the engineers, and the political deciders are concerned, models should be a black box – the inner workings totally incomprehensible. They put a question to the model, feeding in all the relevant factors, and the model responds (with appropriate error bars).

    Then if the predictions are wrong, the model is wrong. No excuses. The biases will sort themselves out through correcting the model to make it in tune with reality.

  • http://cob.jmu.edu/rosserjb Barkley Rosser

    I would warn that economists have given up on large-scale modeling. Certainly there are, and always have been, many economists who eschew such approaches for a variety of reasons. But the basement of the Fed is crawling with people toiling over the latest variation of the DSGE model that is reportedly being used to generate reports and forecasts for the FOMC to contemplate.

    The latest fad in “environmental” models being used in economics is earthquake models from geophysics, applied to model crashes in financial asset markets. See work by Sornette.

  • http://profile.typekey.com/halfinney/ Hal Finney

    The current debate about global warming and climate change is a good example of the heavy use of economic models in order to guide present day policies. Since most harm from climate change will be far in the future, we need to make many assumptions about that far off period in order to judge whether proposed remediations are a net benefit. Given the uncertainty involved, different analyses have come up with drastically different tradeoffs. The recent Stern report is a good example.

    But what is the alternative? We can’t just wait until the harm starts occurring (or not) and then change our policies, because of the decades-long lead time between policy change and effective results. So it seems we are forced to do our best with the models, even with all their uncertainties.

  • http://cob.jmu.edu/rosserjb Barkley Rosser

    Obviously I meant to warn that economists had NOT given up on large-scale modeling, at least some of them.

  • eric

    Stuart argues that making models transparent is bad because people would rationalize strong beliefs in spite of new data. Censoring the data and clouding the theory is always a bad idea. Sure, bad things can happen, but the alternative is much worse. While I think there is a case for closed-door hearings on public issues to prevent grandstanding, generally more transparency, more clarity, and more discussion is optimal. If not, we’re screwed as a species anyway.

  • Stuart Armstrong

    I’m not advocating censoring the data; that would indeed screw us.

    Maybe I was a bit extreme there. I just feel that models should be judged by their predictions, not their assumptions. When explained to the public, the words “this is a very simplified picture, the real picture is more complicated” should be on everyone’s minds. And the fact that a model is simple should never be seen as making it more true.

    I also think that critiques from other scientists should not focus on the assumptions, as seems to be very fashionable now, but on the predictions (I know there is a tendency in social sciences to roll assumptions and predictions into one, and I think that’s very unhealthy).

    And conversely one should not claim “this model’s predictions are true, this validates its assumptions”, unless that is really the case (which is virtually never).

    Hopefully, with the focus on the predictions, there will be less bias in the assumptions, because 1) they will be less ideologically important and 2) they need to be accurate if the model is to be.

  • http://profile.typekey.com/halfinney/ Hal Finney

    Stuart, one reason people prefer to judge models by simplicity rather than predictions is because in practice, many models are tested more by retrodictions than predictions. In many cases, it would take a long time before we could verify that a model prediction turned out to be correct. Testing models against past data is more common. In such situations, model simplicity is a guard against the danger of over-fitting.
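Hal’s over-fitting point is easy to illustrate: a model with as many parameters as past data points can “retrodict” that data perfectly and still predict badly. A minimal sketch with synthetic data and illustrative polynomial degrees (nothing here comes from the book or the thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "past observations": a noisy linear trend.
x = np.linspace(0.0, 1.0, 12)
y = 2.0 * x + rng.normal(0.0, 0.1, size=x.size)

# Fit on the first 8 points (the "retrodiction" set),
# score on the last 4 (genuine out-of-sample "predictions").
x_fit, y_fit = x[:8], y[:8]
x_new, y_new = x[8:], y[8:]

def holdout_rmse(deg):
    """RMSE on the held-out points for a degree-`deg` polynomial fit."""
    coeffs = np.polyfit(x_fit, y_fit, deg)
    return float(np.sqrt(np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)))

simple = holdout_rmse(1)    # 2 parameters
complex_ = holdout_rmse(7)  # 8 parameters: interpolates the past data
```

The degree-7 polynomial matches the eight fitted points essentially exactly, yet its held-out error dwarfs the straight line’s – the situation where simplicity acts as a guard against over-fitting.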

  • http://dao.complexitystudies.org/2007/02/22/useless-arithmetic/ complexitystudies

    Useless Arithmetic

    A new book has come out which casts into serious doubt some of the models which have influenced policy-makers so far. This further convinces me that it is time to lay a serious philosophy of science foundation for modelling; especially the semantics of…

  • http://profile.typekey.com/sentience/ Eliezer Yudkowsky

    “And the fact that a model is simple should never be seen as making it more true.” — What happened to Solomonoff induction? Or, less quantitatively, Occam’s Razor? Okay, technically Occam’s Razor doesn’t make a statement more true – just more probable. But still, ??
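Eliezer’s point has a standard quantitative form. Under the Solomonoff prior (the idealization behind Occam’s Razor), a hypothesis $h$ is weighted by

```latex
P(h) \;\propto\; 2^{-K(h)}
```

where $K(h)$ is the Kolmogorov complexity of $h$: the length in bits of the shortest program that outputs it. Simpler hypotheses thus start with higher prior probability, though evidence can of course overwhelm the prior.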

  • ChrisA

    There are cases of very complex models of reality in common use that achieve very close alignment with reality. The models are very large and are based on fundamental theories of chemistry and physics. Chemical engineers, for instance, can design a chemical plant with computer-based modelling tools (such as Hysys), successfully predicting performance to within 1% – an incredible achievement when you consider how complex (millions of components) such plants are. A simple model would definitely be less useful than these complex models. I believe there are similarly accurate models in use in the car industry. So not all complex models are like the coastal engineers’ ones.

    Basically, though, the argument is true that most complex models can’t be trusted. Superficially there is a great similarity between the engineering models mentioned above and, say, the complex models used to predict global warming or complex economic models. The difference, however, is that the engineering problems are known to be solvable and the integrated model predictions have been tested against reality. This solves the big problems in modelling: knowing what not to include in the model (by definition a model must leave some aspects of reality unmodelled) and whether we understand the system well enough to make good predictions about its performance. So everything in a model can be true (and empirically verified on an individual component basis) but the model itself can give bad predictions, because of some unknown factor or just because we can’t model in sufficient detail.

    As a guide, then, to believing a model, what matters is not how complex it is, or whether the individual elements are correct, but i) whether the model’s predictions can be tested, ii) whether or not you believe we fully understand the system, and iii) whether or not we can model the system in enough detail. Note that the accuracy of a model does not depend on whether it agrees with the predictions made by other models, just on whether it agrees with reality (also a common mistake).

  • http://www.aleph.se/andart/ Anders Sandberg

    William B. Levy, an experienced computational neuroscientist, once gave me sage advice: “Make sure you get more predictions out of your model than you have parameters”.

    That ought to be the prime rule for every model. Complex models making complex predictions are all right, as long as you can test the predictions. Simple models making simple predictions are even better, since there is less work involved and they are more easily scrutinized. If they manage to produce many and complex results, then they are really interesting (and easy to reject when wrong). It is the complex models for simple predictions that really should be avoided. They waste effort, allow deliberate or accidental fudging, cannot be understood well and are hard to reject.

    I guess it can be seen as an information principle. If the description of how to get a result is longer than the result, the description is not very good.
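Anders’s information principle is essentially the two-part Minimum Description Length criterion (which Eliezer also mentions below in this thread): prefer the model $M$ minimizing

```latex
L(M) + L(D \mid M)
```

where $L(M)$ is the number of bits needed to describe the model and $L(D \mid M)$ the bits needed to describe the data given the model. A complex model that yields only a short, simple prediction pays a large $L(M)$ while compressing $D$ very little.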

  • Lee

    “And instead of demanding to know exactly how high seas will rise or how many fish will be left … we should seek to discern simply whether seas are rising, fish stocks are falling”

    This strikes me as silly. Why should zero be the arbitrary number from which we judge things? Why don’t scientists simply tell us whether the sea and fish stock are growing at a rate to the left or to the right of seven (in some units)?

    That’s no joke — there is no reason to suppose that things by default have a baseline growth rate of zero. That’s just a prejudice. I am interested in how fast fish stocks are falling.

  • http://profile.typekey.com/robinhanson/ Robin Hanson

    Lee, simple models tend to pick their own natural reference points, from which it is much easier to predict sign than magnitude of effect.

    Stuart, I am reluctant to endorse explicit biases supposedly to correct poorly documented biases.

  • rcriii

    Anders, I am inclined to argue with Mr. Levy’s advice, on the grounds that you usually need at least as many known quantities as the unknowns you are solving for. But I fear that I am assuming that the models in question are linear equations.

    Maybe a better way of putting it would be to say that intermediate results should match observations as well as the ultimate parameter of interest. For example in the beach model, if the model correctly predicts the width of beach above the waterline after 6 months, but is wrong about the underwater slope, then it is not trustworthy.

  • http://profile.typekey.com/hollerith/ Richard Hollerith

    Maybe I should have let Anders reply, but to me “‘Make sure you get more predictions out of your model than you have parameters'” is the same as “you usually need at least as many known quantities as the unknowns you are solving for,” so I fail to see why rcriii is inclined to argue.

  • http://profile.typekey.com/halfinney/ Hal Finney

    I think the problem with complex models, as I said before, is that most “predictions” are actually retrodictions.

    If a model is making truly accurate predictions, and they are informative and useful, then you shouldn’t care how many parameters it has. If someone can consistently predict tomorrow’s stock prices today, you don’t care how many parameters he is using. His model is good, by virtue of its predictive success in a difficult domain.

  • http://profile.typekey.com/hollerith/ Richard Hollerith

    So, having scientists generate scientific models is prone to severe biases. Has anyone tried to remove those biases by having a computer program pick and refine the models according to the algorithm first published by Solomonoff? Scientists working in some field would labor long and hard to put most of the relevant knowledge about the field into a formal model, then spend their time feeding empirical data into the computer program that refines the model and interpreting the model for engineers, policy makers and other “consumers” of scientific knowledge.

    Economics would be a good field to try this with because of the difficulty of acquiring new empirical data (leading economists to rely heavily on retrodictions, e.g., of stock market data going back to the start of stock markets) and the strong tendency of humans to have prejudices about economics and to form factions of economic belief. Then again, economics would be a bad field to try this in because the economic realm is affected by a vast tangle of complex causal chains; e.g., the very complex realm of human psychology strongly affects the economic realm. Thus the cost (in researcher’s time) of creating a formal model of economics that includes most of the relevant knowledge that economists have about economics would be sky high.

    Climate science is another area that necessarily relies heavily on retrodictions and is prone to factionalization, but is not nearly as complex as economics (but might still be too complex to codify as a single formal model).

    I would appreciate people sharing their impressions on the desirability and feasibility of taking humans out of the model-picking and model-refinement loop in selected fields, e.g., climate science.

  • Stuart Armstrong

    I think the problem with complex models, as I said before, is that most “predictions” are actually retrodictions.

    Very good point there. In the absence of easy testing, simple models do have an advantage.

    It is the complex models for simple predictions that really should be avoided.

    That feels generally true. But what about, say, old-fashioned military ballistics? A complicated model, fed with many variables (temperature, pressure, wind speed, etc.), whose sole prediction is where a shell will land on a two-dimensional plane.

    (And in principle (as long as it’s tested against predictions, not retrodictions), there’s nothing wrong with a model that would take the information of the entire world stock exchanges, plus the murder rate in every Brazilian town, to accurately predict the price of fish tomorrow.)

    A slight change of your information model would put the input information as “the number of free parameters in the system” and the output as “the amount of tested predictions”.

    Then ballistics models are fine, as the amount of information output is basically infinite – just keep firing the gun, again and again.

    Complicated economics models would be a problem, though, as you might have to wait a long time to test enough predictions to have more information than your free-parameter input. So complex models should have higher standards of proof than simple ones – requiring either complicated results, or lots and lots of simple results.
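Stuart’s ballistics case can be sketched in a few lines: a point-mass shell with quadratic drag, integrated forward until it lands. All constants below are illustrative stand-ins, not real ballistics-table values; the point is many inputs, one testable landing-point prediction.

```python
import math

def landing_point(v0, elevation_deg, azimuth_deg, air_density=1.225,
                  drag_coeff=0.002, dt=0.01):
    """Integrate a point-mass shell with quadratic drag until it lands.

    Returns the (x, y) landing position on a flat plane.
    Constants are illustrative, not taken from real ballistics tables.
    """
    el = math.radians(elevation_deg)
    az = math.radians(azimuth_deg)
    # Initial velocity components.
    vx = v0 * math.cos(el) * math.cos(az)
    vy = v0 * math.cos(el) * math.sin(az)
    vz = v0 * math.sin(el)
    x = y = z = 0.0
    g = 9.81
    k = drag_coeff * air_density  # lumped drag constant per unit mass
    while True:
        speed = math.sqrt(vx * vx + vy * vy + vz * vz)
        # Quadratic drag opposes the velocity vector; gravity pulls down.
        vx -= k * speed * vx * dt
        vy -= k * speed * vy * dt
        vz -= (g + k * speed * vz) * dt
        x += vx * dt
        y += vy * dt
        z += vz * dt
        if z <= 0.0 and vz < 0.0:
            return x, y

# Many inputs (speed, angles, air density, drag coefficient),
# one repeatedly testable output: where the shell lands.
x, y = landing_point(v0=300.0, elevation_deg=45.0, azimuth_deg=0.0)
```

Every firing of the gun tests the model again, which is why (as Stuart says) such complex-model/simple-prediction setups can still be fine: the prediction stream is effectively unbounded.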

  • Stuart Armstrong

    I would appreciate people sharing their impressions on the desirability and feasibility of taking humans out of the model-picking and model-refinement loop in selected fields, e.g., climate science.

    I’m entirely for it (as long as precautions are taken to test the models on enough results, per Anders’s information point). It would take the ideology and biases out of the assumptions, and let the model live or die only on its results.

    However, once you’ve got a model, humans will pick it apart, try to understand it, and build new models. You need that for science to progress, but new biases will then be introduced – to hopefully be eliminated later.

    As for feasibility… no idea! :-)

  • Stuart Armstrong

    Stuart, I am reluctant to endorse explicit biases supposedly to correct poorly documented biases.

    I’m sorry, I don’t quite get what you mean. I don’t think I was advocating explicit biases (just the fact that lay people shouldn’t feel they need to understand every assumption of a model). Can you clarify?

  • http://profile.typekey.com/robinhanson/ Robin Hanson

    Stuart, I was referring to your saying “Simple models are … very dangerous – people can understand them, find them plausible, and then decide they are true. And then defend the model even if it turns out to be false.” You are positing an undocumented bias to too easily believe simple models, and suggesting we correct for that bias by avoiding simple models, relative to their other benefits.

  • http://profile.typekey.com/sentience/ Eliezer Yudkowsky

    All,

    A substantial amount of effort in statistics and machine learning goes into just figuring out how much freedom a model has, so that it can be penalized accordingly – for example, the “effective degrees of freedom” of regularized linear regressions.

    Once you know the number of the effective degrees of freedom, there’s the question of how to penalize them – two popular approaches are the Akaike Information Criterion (which underestimates the penalty) and the Bayesian Information Criterion (which overestimates it). Some major approaches don’t count degrees of freedom at all. Vapnik-Chervonenkis dimension views the affair from the standpoint of the data, not the model, asking what is the most complex data that the model class can fit exactly – if a model involves a hundred variables, but can only fit a very limited class of potential outcomes, we should pay more attention to it than a hundred-variable model which can precisely fit a much broader class of outcomes. The more advanced versions of Minimum Description Length take account of the precisions of parameters and predictions, not just their number (but are often extremely hard to compute).

    However, I think everyone pretty much agrees on the following:

    (1) All else being equal, having more effective freedom (not the same as the raw number of variables in the model) is bad for you, just as, all else being equal, having more data is good for you;
    (2) There is a deep connection between the relative simplicity of a model, compared to the amount of training data it is fitted to, and that model’s ability to generalize to new data.
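The AIC and BIC penalties Eliezer mentions can be computed directly for a least-squares fit: under Gaussian errors and up to additive constants, AIC = n·ln(RSS/n) + 2k and BIC = n·ln(RSS/n) + k·ln(n), where k is the number of fitted parameters. A minimal sketch on synthetic data (the degrees and constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = np.linspace(0.0, 1.0, n)
y = 1.0 + 3.0 * x + rng.normal(0.0, 0.2, size=n)  # truly linear data

def aic_bic(deg):
    """AIC and BIC (up to additive constants) for a degree-`deg` polynomial."""
    k = deg + 1  # number of fitted coefficients
    resid = y - np.polyval(np.polyfit(x, y, deg), x)
    rss = float(np.sum(resid ** 2))
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

scores = {deg: aic_bic(deg) for deg in (1, 2, 5)}
```

Both criteria charge each extra degree of freedom; BIC’s per-parameter charge of ln(n) exceeds AIC’s charge of 2 whenever n > e² ≈ 7.4, which is one concrete sense in which BIC penalizes complexity more heavily.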

  • Stuart Armstrong

    You are positing an undocumented bias to too easily believe simple models, and suggesting we correct for that bias by avoiding simple models, relative to their other benefits.

    Ah, I see. You were suggesting that complicated models are prone to bias because of what their creators put into them; I was suggesting that simple models are prone to bias because they are less likely to be rejected when contrary evidence arrives.

    Is this the case? Politically, it does seem that people cling to simpler models rather than complicated ones – the enthusiasm generated by models like “invade Iraq to bring peace” and “get out of Iraq to bring peace” is all out of proportion to that of “let us negotiate a rational compromise, involving this, this and this and even this, to bring peace.” On a personal note, my bosses would continually try to get the simpler model they understood adopted, rather than the more complicated ones that were really called for.

    But it would be good to get this studied properly (I could not easily find any studies online, and don’t have access to most journals). Then we could put in place procedures to avoid both types of bias, if warranted (restricting complex models to retrodictions, not predictions, and publicly announcing the failure of a simple model when it does fail – rather than the usual “the model needs to be refined”; maybe, but the model, as is, has failed).

    But I think the most important thing is to split the predictions of a model from its assumptions, as the latter are prone to bias and the former are testable. I feel this is easier to do with complicated models.

  • Stuart Armstrong

    Another take entirely (just one page):

    A Note on Simple Models

    What implicitly comes out of the example he uses is that models are often simple from the point of view of those making them, but only them (Bayesian analysis, for instance, is only simple once you’ve properly understood it). Might be another bias; not sure of its implications here.

  • http://profile.typekey.com/hollerith/ Richard Hollerith

    Stuart: it is testing a model against predictions (not retrodictions) that guards against human biases (in the choice of model).