11 Comments

Thanks - I added to the post above.

Update.

I think the solar flare issue would warrant its own post. What's the chance that a solar flare will do something bad in the next 10 years? If it is non-negligible, can we protect ourselves against it?

That talk at Oxford FHI was well worth the time.

Ah, I was thinking of a cutoff on the other end. Fair enough. But I think the usual meaning is that there is a power law over some substantial range.

Whether a power law does fine depends on how you're using it. If you're looking at the region where you have data to fit, and find that the power law fits reasonably well, and want to make a point about inequality: I agree, it's fine and simpler than fitting a lognormal.

But you need to be more careful when extrapolating outside the region in which you have data. For example, suppose you're trying to estimate existential risk based on data for catastrophes mostly in the range of 10 thousand to 10 million deaths. Power law fitting will give you a straight line on the log-log plot. You then extrapolate to the billions to make some estimate of existential risk.

But lognormals could give you a very different view. There you get a parabola on the log-log plot. While the difference from a straight line may be fairly small over the region in which you have data, the discrepancy increases quadratically as you move away. This means the difference between power law and lognormal at 1B people (on a log-log plot) is almost an order of magnitude worse than it was over the range in which you made the fit.

Now, how bad this is depends on how big the difference is between power law and lognormal over the range in which you have data; and how big that can be before you would notice depends on how much data you have. But if the two differ by a factor of 2 at one location (and less on average), then your estimate of existential risk will be off by a factor of a thousand.
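To make the quadratic growth concrete, here is a minimal sketch in Python. The setup is my own assumption, not taken from any dataset: the lognormal is stood in for by a pure parabola in log10(deaths), the power law is the least-squares straight line fitted to it over 10^4 to 10^7, and `gap_at` is a hypothetical helper name. The exact multiplier you get depends on how the fit is done and where you measure the in-range mismatch, so treat the output as illustrative only.

```python
def gap_at(u, u_lo=4.0, u_hi=7.0, curvature=1.0, n=301):
    """Gap in log10 units between a straight-line fit and a parabola.

    The parabola y = -curvature * u**2 is a stand-in for a lognormal on a
    log-log plot (assumption). The line is the ordinary least-squares fit
    to that parabola on an even grid over [u_lo, u_hi], where u = log10(x).
    Returns line(u) - parabola(u), so positive means the line sits above.
    """
    us = [u_lo + (u_hi - u_lo) * i / (n - 1) for i in range(n)]
    ys = [-curvature * x * x for x in us]
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_u) * (b - mean_y) for a, b in zip(us, ys))
    sxx = sum((a - mean_u) ** 2 for a in us)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_u
    return (slope * u + intercept) - (-curvature * u * u)


if __name__ == "__main__":
    # Worst mismatch inside the fitted range (checked at the ends and the middle).
    in_range = max(abs(gap_at(4.0)), abs(gap_at(5.5)), abs(gap_at(7.0)))
    far = gap_at(9.0)  # 10^9, i.e. 1B deaths
    ratio = far / in_range
    print(f"log-log gap at 1B is {ratio:.1f}x the worst in-range gap")
    # A factor-2 mismatch in range is log10(2) in log units, so it scales to 2**ratio.
    print(f"a factor-2 in-range mismatch becomes about a factor {2 ** ratio:.0f}")
```

The ratio does not depend on `curvature`, only on how far the extrapolation point is from the fitted range relative to that range's width.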

The cut-off means that the tail is not a power law.

People almost always mean that the tail of the distribution is a power law, not that the whole thing is. Log-normal might do a bit better, but the point is that the power law does fine.

Isn't extrapolation the point: to divine some "existential risk"?

I think you're misreading the cited paper. "[S]even data sets all have p-values small enough that the power-law model can be firmly ruled out" but "log-normal, on the other hand, is not ruled out for any of our datasets except the HTTP connections." I think you misread it by interpreting the "power-law with cutoff" in the table as a power law, but it is a substantially different distribution.

In general, one can give more plausible generative models for log-normal distributions (they arise as the product of many small random multiplicative changes, say +/- 1% each). Since log-normals also usually fit the data better than power laws, this means a log-normal is the better go-to distribution.
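A minimal sketch of that generative story, in Python, with parameters that are purely my own choices (1000 steps, 3000 samples, a fixed seed; `multiplicative_walk` and `simulate` are hypothetical names): start at 1 and repeatedly multiply by 1.01 or 0.99 at random. The log of the result is a sum of many independent terms, so it should come out roughly normal, which makes the result itself roughly log-normal.

```python
import math
import random
import statistics


def multiplicative_walk(steps, rng, pct=0.01):
    """Multiply 1.0 by (1 + pct) or (1 - pct), chosen at random, `steps` times."""
    value = 1.0
    for _ in range(steps):
        value *= (1.0 + pct) if rng.random() < 0.5 else (1.0 - pct)
    return value


def skewness(xs):
    """Population skewness: mean of the cubed standardized values."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def simulate(samples=3000, steps=1000, seed=0):
    """Draw `samples` independent end values of the multiplicative walk."""
    rng = random.Random(seed)
    return [multiplicative_walk(steps, rng) for _ in range(samples)]


if __name__ == "__main__":
    values = simulate()
    logs = [math.log(v) for v in values]
    # Raw values should be right-skewed; their logs should be close to symmetric.
    print("skewness of values:", round(skewness(values), 2))
    print("skewness of logs:  ", round(skewness(logs), 2))
```

This only shows that the multiplicative mechanism yields something log-normal-looking; it says nothing about whether any particular real dataset was generated this way.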

That said, it's not clear how much the differences between power laws, log-normals, and power-laws with cutoffs matter to you. As long as you're only looking in the region where they are hard to distinguish, the conclusions you draw will be very similar. When you try to extrapolate outside this region, however, you should be careful.

Shalizi on the Clauset et al. paper he coauthored. Clauset on the paper.

Shalizi at least certainly seems to disagree with you:

If you use sensible, heavy-tailed alternative distributions, like the log-normal or the Weibull (stretched exponential), you will find that it is often very, very hard to rule them out. In the two dozen data sets we looked at, all chosen because people had claimed they followed power laws, the log-normal's fit was almost always competitive with the power law, usually insignificantly better and sometimes substantially better.

This does not say "in none of those cases was any other specific model significantly favored over a power law".
