Rah Power Laws

The latest Science has an article by Michael Stumpf and Mason Porter, complaining that people aren’t careful enough about fitting power laws. It mentions that a sum of heavy-tail-distributed things generically acquires a power-law tail in the limit. And it claims:

Although power laws have been reported in areas ranging from finance and molecular
biology to geophysics and the Internet, the data are typically insufficient and the mechanistic insights are almost always too limited for the identification of power-law behavior to be scientifically useful … Examination (15) of the statistical support for numerous reported power laws has revealed that the overwhelming majority of them failed statistical testing (sometimes rather epically).

Yet in reference 15, where Aaron Clauset, Cosma Rohilla Shalizi, and M. E. J. Newman looked carefully at 25 data sets that others had claimed fit power laws, only for 3 did they find less than moderate support for a power law fit, and in none of those cases was any other specific model significantly favored over a power law! If this is the best criticism they’ve got, this seems to me resounding support for power laws.

Here are the phenomena where the power is less than one, meaning the few biggest items get most of the weight:

intensity of wars 0.7(2); solar flare intensity 0.79(2); religious followers 0.8(1); count of word use 0.95(2)

The number is the power and the digit in parens is the uncertainty of the last digit shown. Here are the phenomena where the power is greater than one, meaning most weight goes to many small items:

telephone calls received 1.09(1); bird species sightings 1.1(2); Internet degree 1.12(9); blackouts 1.3(3); population of cities 1.37(8); terrorist attack severity 1.4(2); species per genus 1.4(2); freq. of surnames 1.5(2); protein interaction degree 2.1(3); citations to papers 2.16(6); email address books size 2.5(6); sales of books 2.7(3); papers authored 3.3(1)
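To see concretely why a power below one means the few biggest items get most of the weight, here is a minimal simulation sketch. It assumes the post’s convention that the quoted power is the tail exponent of the complementary CDF, and the exponents 0.7 and 2.7 are just illustrative picks from the lists above:

```python
import random

random.seed(0)

def pareto_sample(tail_exp, n):
    # Inverse-transform sampling from P(X > x) = x**-tail_exp, x >= 1
    return [(1.0 - random.random()) ** (-1.0 / tail_exp) for _ in range(n)]

def max_share(xs):
    # Fraction of the total contributed by the single largest draw
    return max(xs) / sum(xs)

heavy = pareto_sample(0.7, 100_000)  # power < 1: the biggest item dominates
light = pareto_sample(2.7, 100_000)  # power > 1: weight spread over many items
print(max_share(heavy), max_share(light))
```

With the power below one the single largest draw carries a sizable fraction of the entire sum even across 100,000 draws, while with the power above one its share is negligible.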

For quake intensity they give power 0.64(4), but say a better fit is a different power (unspecified) and a cutoff. For net worth (of the US richest 400) they give power 1.3(1), but say a power-law doesn’t fit, though no other model tried fits better.

On catastrophic risk, I wrote in ’07:

We should worry more about disasters with lower powers, such as forest fires (area power of 0.66), hurricanes (dollar loss power of 0.98, death power of 0.58), earthquakes (energy power of 1, dollar loss and death powers of 0.41), wars (death power of 0.41), and plagues (death power of 0.26 for Whooping Cough and Measles).

So the above study suggests we worry most about wars, quakes, religions, and solar flares. I hadn’t been worried about solar flares so much before; now I am. On city inequality, I think I trust that other paper more.

Added 4p: Cosma Shalizi says:

In ten of the twelve cases we looked at, the only way to save the idea of a power-law at all is to include this exponential cut-off. But that exponentially-shrinking factor is precisely what squelches the WTF, X IS ELEVENTY TIMES LARGER THAN EVER! THE BIG ONE IS IN OUR BASE KILLING OUR DOODZ!!!!1!! mega-events.

I’m happy to admit that worst-case fears are reduced by the fact that power-law data with powers below one tend to be better fit with a tail cutoff. Good news! I don’t want to believe in disaster, but I do think we must consider that possibility.
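For intuition on how strongly an exponential cutoff squelches mega-events: relative to a pure power law, the cutoff suppresses the density at size x by exactly exp(-x/cutoff). A small sketch, where the exponent 0.7 and the cutoff scale 1e6 are hypothetical choices rather than fitted values:

```python
import math

a, cutoff = 0.7, 1e6  # hypothetical tail exponent and cutoff scale

def pl_density(x):
    return x ** -a  # pure power law (unnormalized)

def pl_cutoff_density(x):
    return x ** -a * math.exp(-x / cutoff)  # power law with exponential cutoff

# The suppression factor is exp(x / cutoff): negligible for events well
# below the cutoff scale, astronomically large for events far beyond it.
for x in (1e4, 1e6, 1e8):
    print(x, pl_density(x) / pl_cutoff_density(x))
```

So the two fits are nearly indistinguishable over the bulk of the data, and differ only for the extreme events one worries about.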

  • david

    Shalizi on the Clauset et al paper he coauthored. Clauset on the paper.

    Shalizi at least certainly seems to disagree with you:

    If you use sensible, heavy-tailed alternative distributions, like the log-normal or the Weibull (stretched exponential), you will find that it is often very, very hard to rule them out. In the two dozen data sets we looked at, all chosen because people had claimed they followed power laws, the log-normal’s fit was almost always competitive with the power law, usually insignificantly better and sometimes substantially better.

    This does not say “in none of those cases was any other specific model significantly favored over a power law”.

  • Anonymous

    I think you’re misreading the cited paper. “[S]even data sets all have p-values small enough that the power-law model can be firmly ruled out” but “log-normal, on the other hand, is not ruled out for any of our data sets except the HTTP connections.” I think you misread it by interpreting the “power-law with cutoff” in the table as a power law, but it is a substantially different distribution.

    In general, one can give more plausible generative models for log-normal distributions (e.g., the cumulative product of many random +/- 1% differences). Since log-normals also usually fit the data better than power laws, this makes the log-normal the better go-to distribution.

    That said, it’s not clear how much the differences between power laws, log-normals, and power-laws with cutoffs matter to you. As long as you’re only looking in the region where they are hard to distinguish, the conclusions you draw will be very similar. When you try to extrapolate outside this region, however, you should be careful.
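The generative model mentioned above (many random ±1% differences) can be sketched directly: multiply many small independent percentage shocks together, and the log of the product is approximately normal by the central limit theorem, so the product itself is approximately log-normal. The step count and ±1% shock size here are arbitrary:

```python
import math
import random
import statistics

random.seed(1)

def multiplicative_growth(steps=1000):
    # Start at 1 and apply many small independent percentage shocks
    x = 1.0
    for _ in range(steps):
        x *= 1.0 + random.uniform(-0.01, 0.01)
    return x

# The log of a product of many independent factors is a sum of many
# independent terms, hence approximately normal by the CLT.
logs = [math.log(multiplicative_growth()) for _ in range(2000)]
print(statistics.mean(logs), statistics.stdev(logs))
```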

    • People almost always mean that the tail of the distribution is a power law, not that the whole thing is. Log-normal might do a bit better, but the point is that the power law does fine.

      • Douglas Knight

        The cut-off means that the tail is not a power law.

      • Anonymous

        Whether a power law does fine depends on how you’re using it. If you’re looking at the region where you have data to fit, and find that the power law fits reasonably well, and want to make a point about inequality: I agree, it’s fine and simpler than fitting a lognormal.

        But you need to be more careful when extrapolating outside the region in which you have data. For example, suppose you’re trying to estimate existential risk based on data for catastrophes mostly in the range of 10 thousand-10 million deaths. Power law fitting will give you some line on the log-log plot. You then extrapolate to the billions to make some estimate of existential risk.

        But lognormals could give you a very different view. There you get a parabola on the log-log plot. While the difference from a straight line may be fairly small over the region in which you have data, the discrepancy increases quadratically as you move away. This will mean the difference between power law and log normal at 1B people (on a log log plot) is almost an order of magnitude worse than it was over the range you made the fit.

        Now, how bad this is depends on how big the difference is between power law and lognormal over the range in which you have data; and how big that can be before you would notice depends on how much data you have. But if the two differ by a factor of 2 at one location (and less on average), then your estimate of existential risk will be off by a factor of a thousand.
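The divergence described above can be illustrated numerically: calibrate a log-normal to agree exactly with a power-law CCDF at two points inside a hypothetical data range, then compare the two extrapolations far outside it. All the numbers here (tail exponent 0.5, match points 1e5 and 1e7) are made up for illustration:

```python
import math
from statistics import NormalDist

nd = NormalDist()
a = 0.5  # hypothetical CCDF tail exponent

def pl_ccdf(x):
    # Power-law tail probability, scaled so it equals 1 at x = 1e4
    return (x / 1e4) ** -a

# Calibrate a log-normal to match the power law at two in-sample points
x1, x2 = 1e5, 1e7
z1 = nd.inv_cdf(1 - pl_ccdf(x1))
z2 = nd.inv_cdf(1 - pl_ccdf(x2))
sigma = (math.log(x2) - math.log(x1)) / (z2 - z1)
mu = math.log(x1) - z1 * sigma

def ln_ccdf(x):
    return 1 - nd.cdf((math.log(x) - mu) / sigma)

# Extrapolate both fits well beyond the calibration range
x_big = 1e9
ratio = pl_ccdf(x_big) / ln_ccdf(x_big)
print(ratio)
```

The two fits agree perfectly at the calibration points, yet at 1e9 the power law predicts a tail probability several times larger, and the gap keeps growing (quadratically on the log-log plot) the further out one extrapolates.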

      • Ah, I was thinking of a cutoff at the other end. Fair enough. But I think the usual meaning is that there is a power law over some substantial range.

  • Isn’t extrapolation the point: to divine some “existential risk”?

  • That talk at Oxford FHI was well worth the time.

  • ChristianKl

    I think the solar flare issue would warrant its own post. What’s the chance that a solar flare will do something bad in the next 10 years? If so, can we protect ourselves against it?
