Author Archives: Robin Hanson

Why Ethnicity & Ideology? 

Individual humans can be described via many individual features that are useful in predicting what they do. Such features include gender, age, personality, intelligence, ethnicity, income, education, profession, height, geographic location, and so on. Different features are more useful for predicting different kinds of behavior.

One kind of human behavior is coalition politics; we join together into coalitions within political and other larger institutions. People in the same coalition tend to have features in common, though which exact features varies by time and place. But while in principle the features that describe coalitions could vary arbitrarily by time and place, we in actual fact see more consistent patterns.

Now when forming groups based on shared features, it makes sense to choose features that matter more in individual lives. The more life decisions a feature influences, the more those who share this feature may plausibly share desired policies, policies that their coalition could advocate. So you might expect political coalitions to be mostly based on individual features that are very useful for predicting individual behavior.

But you’d be wrong. While there are often weak correlations with such features, political coalitions are not mainly based on the main individual features of gender, age, etc. Instead, they are more often based on ethnicity, [class,] and “political ideology” preferences, which while famously difficult to characterize, and which do vary by time and place, are also somewhat consistent across time and space.

In this post, I just want to highlight this puzzle, not solve it: why are these the most common individual features on which political coalitions are based? Yes, in some times and places ethnicity [and class] matter so much that they strongly predict individual behavior. But even when they don’t matter much for policy preferences, they are still often the basis of coalitions. And why is political ideology so attractive a basis for coalitions, when it matters so little in individual lives?

I see two plausible types of theories here. One is a theory of current functionality; somehow these features actually do capture the individual features that best predict member positions on typical issues. Another is a theory of past functionality; perhaps in long-past forager environments, something like these features were the most relevant. I now lean toward this second type of theory.


Ems in Walkaway

Some science fiction (sf) fans have taken offense at my claim that non-fiction analysis of future tech scenarios can be more accurate than sf scenarios, whose authors have other priorities. So I may periodically critique recent sf stories with ems for accuracy. Note that I’m not implying that such stories should have been more accurate; sf writing is damn hard work and its authors juggle many difficult tradeoffs. But many seem unaware of just how often accuracy is sacrificed.

The most recent sf I’ve read that includes ems is Walkaway, by “New York Times bestselling author” Cory Doctorow, published back in April:

Now that anyone can design and print the basic necessities of life—food, clothing, shelter—from a computer, there seems to be little reason to toil within the system. It’s still a dangerous world out there, the empty lands wrecked by climate change, dead cities hollowed out by industrial flight, shadows hiding predators animal and human alike. Still, when the initial pioneer walkaways flourish, more people join them.

The emotional center of Walkaway is elaborating this vision of a decentralized post-scarcity society trying to do without property or hierarchy. Though I’m skeptical, I greatly respect attempts to describe such visions in more detail. Doctorow, however, apparently thinks we economists make up bogus math for the sole purpose of justifying billionaire wealth inequality. Continue reading "Ems in Walkaway" »


Organic Prestige Doesn’t Scale

Some parts of our world, such as academia, rely heavily on prestige to allocate resources and effort; individuals have a lot of freedom to choose topics, and are mainly rewarded for seeming impressive to others. I’ve talked before about how some hope for a “Star Trek” future where most everything is done that way, and I’m now reading Walkaway, outlining a similar hope. I was skeptical:

In academia, many important and useful research problems are ignored because they are not good places to show off the usual kinds of impressiveness. Trying to manage a huge economy based only on prestige would vastly magnify that inefficiency. Someone is going to clean shit because that is their best route to prestige?! (more)

Here I want to elaborate on this critique, with the help of a simple model. But first let me start with an example. Imagine a simple farming community. People there spend a lot of time farming, but they must also cook and sew. In their free time they play soccer and sing folk songs. As a result of doing all these things, they tend to “organically” form opinions about others based on seeing the results of their efforts at such things. So people in this community try hard to do well at farming, cooking, sewing, soccer, and folk songs.

If one person put a lot of effort into proving math theorems, they wouldn’t get much social credit for it. Others don’t naturally see outcomes from that activity, and not having done much math they don’t know how to judge if this math is any good. This situation discourages doing unusual things, even if no other social conformity pressures are relevant.

Now let’s say that in a simple model. Let there be a community containing people j, and topic areas i where such people can create accomplishments $a_{ij}$. Each person j seeks a high personal prestige $p_j = \sum_i v_i a_{ij}$, where $v_i$ is the visibility of area i. They also face a budget constraint on accomplishment, $\sum_i a_{ij}^2 \le b_j$. This assumes diminishing returns to effort in each area.

In this situation, each person’s best strategy is to choose $a_{ij}$ proportional to $v_i$. Assume that people tend to see the areas where they are accomplishing more, so that visibility $v_i$ is proportional to an average over individual $a_{ij}$. We now end up with many possible equilibria having different visibility distributions. In each equilibrium, for all individuals j and areas i,k we have the same area ratios $a_{ij} / a_{kj} = v_i / v_k$.

Giving individuals different abilities (such as via a budget constraint $\sum_i a_{ij}^2 / x_{ij} \le b_j$) could make individuals choose somewhat different accomplishments, but the same overall result obtains. Spillovers between activities in visibility or effort can have similar effects. Making some activities be naturally more visible might push toward those activities, but there could still remain many possible equilibria.
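To make the multiplicity of equilibria concrete, here is a minimal sketch of the base model (identical abilities, quadratic budget). The function names, the four-area example, and the random budgets are illustrative assumptions, not part of the original model description. It checks that two quite different visibility profiles each reproduce themselves once everyone best-responds.

```python
import math
import random


def best_response(v, budget):
    """Maximize sum_i v[i]*a[i] subject to sum_i a[i]**2 <= budget.

    By the Cauchy-Schwarz inequality the optimum points along v,
    scaled so that the budget constraint binds.
    """
    norm = math.sqrt(sum(x * x for x in v))
    scale = math.sqrt(budget) / norm
    return [scale * x for x in v]


def updated_visibility(v, budgets):
    """Visibility after everyone best-responds to v.

    Assumption: visibility is the average accomplishment per area,
    rescaled to sum to 1.
    """
    choices = [best_response(v, b) for b in budgets]
    avg = [sum(a[i] for a in choices) / len(choices) for i in range(len(v))]
    total = sum(avg)
    return [x / total for x in avg]


def is_equilibrium(v, budgets, tol=1e-9):
    """True if v (normalized) maps back to itself under best responses."""
    total = sum(v)
    normalized = [x / total for x in v]
    new_v = updated_visibility(v, budgets)
    return all(abs(x - y) < tol for x, y in zip(normalized, new_v))


random.seed(0)
# Hypothetical community: 50 people with different effort budgets.
budgets = [random.uniform(0.5, 2.0) for _ in range(50)]

# Two hypothetical visibility profiles over four areas.
farming_heavy = [0.6, 0.2, 0.1, 0.1]
math_heavy = [0.1, 0.1, 0.2, 0.6]

print(is_equilibrium(farming_heavy, budgets))
print(is_equilibrium(math_heavy, budgets))
```

Under these assumptions any positive visibility profile is self-sustaining, which is the sense in which the model has many equilibria and nothing in it selects the useful ones.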

This wide range of equilibria isn’t very reassuring about the efficiency of this sort of prestige. But perhaps in a small foraging or farming community, group selection might over a long run push toward an efficient equilibrium where the high visibility activities are also the most useful activities. However, larger societies need a strong division of labor, and with such a division it just isn’t feasible for everyone to evaluate everyone else’s specific accomplishments. This can be solved either by creating a command and status hierarchy that assigns people to tasks and promotes by merit, or by an open market with prestige going to those who make the most money. People often complain that doing prestige in these ways is “inauthentic”, and they prefer the “organic” feel of personally evaluating others’ accomplishments. But while the organic approach may feel better, it just doesn’t scale.

In academia today, patrons defer to insiders so much regarding evaluations that disciplines become largely autonomous. So economists evaluate other economists based mostly on their work in economics. If someone does work both in economics and also in another area, they are judged mostly just on their work in economics. This penalizes careers working in multiple disciplines. It also raises doubts about whether different disciplines get the right relative support – who exactly can be trusted to make such a choice well?

Interestingly, academic disciplines are already organized “inorganically” internally. Rather than each economist evaluating each other economist personally, they trust journal editors and referees, and then judge people based on their publications. Yes they must coordinate to slowly update shared estimates of which publications count how much, but that seems doable informally.

In principle all of academia could be unified in this way – universities could just hire the candidates with the best overall publication (or citation) record, regardless of in which disciplines they did what work. But academia hasn’t coordinated to do this, nor does it seem much interested in trying. As usual, those who have won by existing evaluation criteria are reluctant to change criteria, after which they would look worse compared to new winners.

This fragmented prestige problem hurts me especially, as my interests don’t fit neatly into existing groups (academic and otherwise). People in each area tend to see me as having done some interesting things in their area, but too little to count me as high status; they mostly aren’t interested in my contributions to other areas. I look good if you count my overall citations, for example, but not if you only count my citations or publications in each specific area.


Compare Institutions To Institutions, Not To Perfection

Mike Thicke of Bard College has just published a paper that concludes:

The promise of prediction markets to solve problems in assessing scientific claims is largely illusory, while they could have significant unintended consequences for the organization of scientific research and the public perception of science. It would be unwise to pursue the adoption of prediction markets on a large scale, and even small-scale markets such as the Foresight Exchange should be regarded with scepticism.

He gives three reasons:

[1.] Prediction markets for science could be uninformative or deceptive because scientific predictions are often long-term, while prediction markets perform best for short-term questions. .. [2.] Prediction markets could produce misleading predictions due to their requirement for determinable predictions. Prediction markets require questions to be operationalized in ways that can subtly distort their meaning and produce misleading results. .. [3.] Prediction markets offering significant profit opportunities could damage existing scientific institutions and funding methods.

Imagine that you want to travel to a certain island. Someone else tells you to row a boat there, but I tell you that a helicopter seems more cost effective for your purposes. So the rowboat advocate replies, “But helicopters aren’t as fast as teleportation, they take longer and cost more to go longer distances, and you need more expert pilots to fly in worse weather.” All of which is true, but not very helpful.

Similarly, I argue that with each of his reasons, Thicke compares prediction markets to some ideal of perfection, instead of to the actual current institutions they are intended to supplement. Let’s go through them one by one. On 1:

Even with rational traders who correctly assess the relevant probabilities, binary prediction markets can be expected to have a bias towards 50% predictions that is proportional to their duration. .. it has been demonstrated both empirically and theoretically .. long-term prediction markets typically have very low trading volume, which makes it unlikely that their prices react correctly to new information. .. [Hanson] envisions Wegener offering contracts ‘to be judged by some official body of geologists in a century’, but this would not have been an effective criterion given the problem of 50%-bias in long-term prediction markets. .. Prediction markets therefore would have been of little use to Wegener.

First, a predictable known distortion isn’t a problem at all for forecasts; just invert the distortion to get the accurate forecast. Second, this is much less of an issue in combinatorial markets, where all questions are broken into thousands or more tiny questions, all of which have tiny probabilities, and a global constraint ensures they all add up to one. But more fundamentally, all institutions face the same problem that all else equal, it is easier to give incentives for accurate short term predictions, relative to long term ones. This doesn’t show that prediction markets are worse in this case than status quo institutions. On 2:

Even if prediction markets correctly predict measured surface temperature, they might not predict actual surface temperature if the measured and actual surface temperatures diverge. .. Globally averaged surface air temperature [might be] a poor proxy for overall global temperature, and consequently prediction market prices based on surface air temperature could diverge from what they purport to predict: global warming. .. If interpreting the results of these markets requires detailed knowledge of the underlying subject, as is needed to distinguish global average surface air temperature from global average temperature, the division of cognitive labour promised by these markets will disappear. Perhaps worse, such predictions could be misinterpreted if people assume they accurately represent what they claim to.

All social institutions of science must deal with the facts that there can be complex connections between abstract theories and specific measurements, and that ignorant outsiders may misinterpret summaries. Yes prediction market summaries might mislead some, but then so can grant and article abstracts, or media commentary. No, prediction markets can’t make all such complexities go away. But this hardly means that prediction markets can’t support a division of labor. For example, in combinatorial prediction markets different people can specialize in the connections between different variables, together managing a large Bayesian network of predictions. On 3:

If scientists anticipate that trading on prediction markets could generate significant profits, either due to being subsidized .. or due to legal changes allowing significant amounts of money to be invested, they could shift their attention toward research that is amenable to prediction markets. The research most amenable to prediction markets is short-term and quantitative: the kind of research that is already encouraged by industry funding. Therefore, prediction markets could reinforce an already troubling push toward short-term, application-oriented science. Further, scientists hoping to profit from these markets could withhold salient data in anticipation of using that data to make better informed trades than their peers. .. If success in prediction markets is taken as a marker of scientific credibility, then scientists may pursue prediction-oriented research not to make direct profit, but to increase their reputation.

Again, all institutions work better on short term questions. The fact that prediction markets also work better on short term questions does not imply that using them creates more emphasis on short term topics, relative to using some other institution. Also, every institution of science must offer individuals incentives, incentives which distract them from other activities. Such incentives also imply incentives to withhold info until one can use that info to one’s maximal advantage within the system of incentives. Prediction markets shouldn’t be compared to some perfect world where everyone shares all info without qualification; such worlds don’t exist.

Thicke also mentioned:

Although Hanson suggests that prediction market judges may assign non-binary evaluations of predictions, this seems fraught with problems. .. It is difficult to see how such judgements could be made immune from charges of ideological bias or conflict of interest, as they would rely on the judgement of a single individual.

Market judges don’t have to be individuals; there could be panels of judges. And existing institutions are also often open to charges of bias and conflicts of interest.

Unfortunately many responses to reform proposals fit the above pattern: reject the reform because it isn’t as good as perfection, ignoring the fact that the status quo is nothing like perfection.


Hazlett’s Political Spectrum

I just read The Political Spectrum by Tom Hazlett, which took me back to my roots. Well over three decades ago, I was inspired by Technologies of Freedom by Ithiel de Sola Pool. He made the case both that great things were possible with tech, and that the FCC has mismanaged the spectrum. In grad school twenty years ago, I worked on FCC auctions, and saw mismanagement behind the scenes.

When I don’t look much at the details of regulation, I can sort of think that some of it goes too far, and some not far enough; what else should you expect from a noisy process? But reading Hazlett I’m just overwhelmed by just how consistently terrible is spectrum regulation. Not only would everything have been much better without FCC regulation, it actually was much better before the FCC! Herbert Hoover, who was head of the US Commerce Department at the time, broke the spectrum in order to then “save” it, a move that probably helped him rise to the presidency:

“Before 1927,” wrote the U.S. Supreme Court, “the allocation of frequencies was left entirely to the private sector . . . and the result was chaos.” The physics of radio frequencies and the dire consequences of interference in early broadcasts made an ordinary marketplace impossible, and radio regulation under central administrative direction was the only feasible path. “Without government control, the medium would be of little use because of the cacaphony [sic] of competing voices.”

This narrative has enabled the state to pervasively manage wireless markets, directing not only technology choices and business decisions but licensees’ speech. Yet it is not just the spelling of cacophony that the Supreme Court got wrong. Each of its assertions about the origins of broadcast regulation is demonstrably false. ..

The chaos and confusion that supposedly made strict regulation necessary were limited to a specific interval—July 9, 1926, to February 23, 1927. They were triggered by Hoover’s own actions and formed a key part of his legislative quest. In effect, he created a problem in order to solve it. ..

Radio broadcasting began its meteoric rise in 1920–1926 under common-law property rules .. defined and enforced by the U.S. Department of Commerce, operating under the Radio Act of 1912. They supported the creation of hundreds of stations, encouraged millions of households to buy (or build) expensive radio receivers. .. The Commerce Department .. designated bands for radio broadcasting. .. In 1923, .. [it] expanded the number of frequencies to seventy, and in 1924, to eighty-nine channels .. [Its] second policy was a priority-in-use rule for license assignments. The Commerce Department gave preference to stations that had been broadcasting the longest. This reflected a well-established principle of common law. ..

Hoover sought to leverage the government’s traffic cop role to obtain political control. .. In July 1926, .. Hoover announced that he would .. abandon Commerce’s powers. .. Commerce issued a well-publicized statement that it could no longer police the airwaves. .. The roughly 550 stations on the air were soon joined by 200 more. Many jumped channels. Conflicts spread, annoying listeners. Meanwhile, Commerce did nothing. ..

Now Congress acted. An emergency measure .. mandated that all wireless operators immediately waive any vested rights in frequencies ..  the Radio Act … provided for allocation of wireless licenses according to “public interest”.  .. With the advent of the Federal Radio Commission in 1927, the growth of radio stations—otherwise accommodated by the rush of technology and the wild embrace of a receptive public—was halted. The official determination was that less broadcasting competition was demanded, not more.

That was just the beginning. The book documents so so much more that has gone very wrong. Even today, vast valuable spectrum is wasted broadcasting TV signals that almost no one uses, as most everyone gets cable TV. In addition,

The White House estimates that nearly 60 percent of prime spectrum is set aside for federal government use .. [this] substantially understates the amount of spectrum it consumes.

Sometimes people argue that we need an FCC to say who can use which spectrum because some public uses are needed. After all, not all land can be private, as we need public parks. Hazlett says we don’t use a federal agency to tell everyone who gets which land. Instead the public buys general land to create parks. Similarly, if the government needs spectrum, it can buy it just like everyone else. Then we’d know a lot better how much any given government action that uses spectrum is actually costing us.

Is the terrible regulation of spectrum an unusual case, or is most regulation that bad? One plausible theory is that we are more willing to believe that a strange complex tech needs regulating, and so such things tend to be regulated worse. This fits with nuclear power and genetically modified food, as far as I understand them. Social media has so far escaped regulation because it doesn’t seem strange – it seems simple and easy to understand. It has complexities of course, but behind the scenes.


Foom Justifies AI Risk Efforts Now

Years ago I was honored to share this blog with Eliezer Yudkowsky. One of his main topics then was AI Risk; he was one of the few people talking about it back then. We debated this topic here, and while we disagreed I felt we made progress in understanding each other and exploring the issues. I assigned a much lower probability than he to his key “foom” scenario.

Recently AI risk has become something of an industry, with far more going on than I can keep track of. Many call working on it one of the most effectively altruistic things one can possibly do. But I’ve searched a bit and as far as I can tell that foom scenario is still the main reason for society to be concerned about AI risk now. Yet there is almost no recent discussion evaluating its likelihood, and certainly nothing that goes into as much depth as did Eliezer and I. Even Bostrom’s book length treatment basically just assumes the scenario. Many seem to think it obvious that if one group lets one AI get out of control, the whole world is at risk. It’s not (obvious).

As I just revisited the topic while revising Age of Em for paperback, let me try to summarize part of my position again here. Continue reading "Foom Justifies AI Risk Efforts Now" »


Philosophy Vs. Duck Tests

Philosophers, and intellectuals more broadly, love to point out how things might be more complex than they seem. They identify more and subtler distinctions, suggest more complex dependencies, and warn against relying on “shallow” advisors less “deep” than they. Subtlety and complexity are basically what they have to sell.

I’ve often heard people resist such sales pressure by saying things like “if it looks like a duck, walks like a duck, and quacks like a duck, it’s a duck.” Instead of using complex analysis and concepts to infer and apply deep structures, they prefer to use such a “duck test” and judge by adding up many weak surface clues. When a deep analysis disagrees with a shallow appearance, they usually prefer to go shallow.

Interestingly, this whole duck example came from philosophers trying to warn against judging from surface appearances: Continue reading "Philosophy Vs. Duck Tests" »


High Dimensional Societies?

I’ve seen many “spatial” models in social science. Such as models where voters and politicians sit at points in a space of policies. Or where customers and firms sit at points in a space of products. But I’ve never seen a discussion of how one should expect such models to change in high dimensions, such as when there are more dimensions than points.

In small dimensional spaces, the distances between points vary greatly; neighboring points are much closer to each other than are distant points. However, in high dimensional spaces, distances between points vary much less; all points are about the same distance from all other points. When points are distributed randomly, however, these distances do vary somewhat, allowing us to define the few points closest to each point as that point’s “neighbors”. “Hubs” are closest neighbors to many more points than average, while “anti-hubs” are closest neighbors to many fewer points than average. It turns out that in higher dimensions a larger fraction of points are hubs and anti-hubs (Zimek et al. 2012).
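The first claim here, that distances vary much less in high dimensions, is easy to check numerically. The sketch below is only an illustration under assumed settings (60 points drawn uniformly from a unit cube, 2 versus 200 dimensions, and a hypothetical helper named `relative_spread`); it does not count hubs or anti-hubs.

```python
import math
import random
import statistics


def relative_spread(num_points, dim, rng):
    """Std. deviation of pairwise distances divided by their mean.

    Points are drawn uniformly from the unit cube in `dim` dimensions.
    A small value means all points are at nearly the same distance
    from one another.
    """
    points = [[rng.random() for _ in range(dim)] for _ in range(num_points)]
    dists = [
        math.dist(points[i], points[j])
        for i in range(num_points)
        for j in range(i + 1, num_points)
    ]
    return statistics.pstdev(dists) / statistics.mean(dists)


rng = random.Random(0)
low = relative_spread(60, 2, rng)     # few dimensions
high = relative_spread(60, 200, rng)  # more dimensions than points

print(f"relative spread in 2 dimensions:   {low:.3f}")
print(f"relative spread in 200 dimensions: {high:.3f}")
```

With more dimensions than points, the relative spread shrinks sharply, so "nearest neighbor" is decided by small random differences rather than by large gaps in distance.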

If we think of people or organizations as such points, is being a hub or anti-hub associated with any distinct social behavior? Does it contribute substantially to being popular or unpopular? Or does the fact that real people and organizations are in fact distributed in real space overwhelm such things, which only happen in a truly high dimensional social world?


“Human” Seems Low Dimensional

Imagine that there is a certain class of “core” mental tasks, where a single “IQ” factor explains most variance in such task ability, and no other factors explain much variance. If one main factor explains most variation, and no other factors do, then variation in this area is basically one dimensional plus local noise. So to estimate performance on any one focus task, usually you’d want to average over abilities on many core tasks to estimate that one dimension of IQ, and then use IQ to estimate ability on that focus task.

Now imagine that you are trying to evaluate someone on a core task A, and you are told that ability on core task B is very diagnostic. That is, even if a person is bad on many other random tasks, if they are good at B you can be pretty sure that they will be good at A. And even if they are good at many other tasks, if they are bad at B, they will be bad at A. In this case, you would know that this claim about B being very diagnostic on A makes the pair A and B unusual among core task pairs. If there were a big clump of tasks strongly diagnostic about each other, that would show up as another factor explaining a noticeable fraction of the total variance, making this world higher dimensional. So this claim about A and B might be true, but your prior is against it.

Now consider the question of how “human-like” something is. Many indicators may be relevant to judging this, and one may draw many implications from such a judgment. In principle this concept of “human-like” could be high dimensional, so that there are many separate packages of indicators relevant for judging matching packages of implications. But anecdotally, humans seem to have a tendency to “anthropomorphize,” that is, to treat non-humans as if they were somewhat human in a simple low-dimensional way that doesn’t recognize many dimensions of difference. That is, things just seem more or less human. So the more ways in which something is human-like, the more you can reasonably guess that it will be human-like in other ways. This tendency appears in a wide range of ordinary environments, and its targets include plants, animals, weather, planets, luck, sculptures, machines, and software. Continue reading "“Human” Seems Low Dimensional" »


Boost For Being Best

The fraction of a normal distribution that is six or more standard deviations above the mean is about one in a billion. But the world has almost eight billion people in it. So in principle we should be able to get six standard deviations in performance gain by selecting the world’s best person at something, compared to using an average person.
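The tail fraction can be checked directly from the standard normal distribution using the complementary error function. This is a small sketch; the helper name `upper_tail` and the round figure of eight billion people are illustrative assumptions.

```python
import math


def upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


tail = upper_tail(6.0)
world_population = 8_000_000_000  # rough figure, as in the text

print(f"P(Z >= 6) = {tail:.3e}")
print(f"about one in {1 / tail:,.0f}")
print(f"expected people above 6 SD: {world_population * tail:.1f}")
```

Under an exact normal distribution, then, a population of about eight billion would be expected to hold a handful of people six or more standard deviations above the mean on any one trait.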

I’m revising Age of Em for a paperback edition, expected in April. The rest of this post is from a draft of new text elaborating that point, and its implication for em leisure:

Em workers also earn wage premiums when they are the very best in the world at what they do. Even under the most severe wage competition, a best em can earn an extra wage equal to the difference between their productivity and the productivity of the second best em. When clans coordinate internally on wage negotiations, this is the difference in productivity between clans. (Clans who can’t coordinate internally are selected out of the em world, as they don’t cover their fixed costs, such as for training and marketing.)

Out of 10 billion independent and identically distributed (IID) normal samples, the maximum is on average about 6.4 standard deviations above the mean. Average spacings between the first and second, second and third, and third and fourth highest samples are roughly 0.147, 0.075, and 0.05 standard deviations respectively (Branwen 2017). So when ems are selected out of 10 billion humans, the best em clan may be this much better than other em clans on normally distributed parameters. Using the log-normal wage distribution observed in our world (Provenzano 2015), this predicts that the best human in the world at any particular task is four to five times more productive than the median person, is over three percent more productive than the second most productive person, and is five percent more productive than the third most productive person.
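These order-statistic figures can be roughly reproduced with standard extreme-value approximations. The sketch below is not the method of the cited source; it assumes the usual Gumbel-type asymptotics for normal maxima, and all function names are hypothetical. Because such asymptotics converge slowly for the normal distribution, the outputs should only be expected to land near the cited numbers, not match them exactly.

```python
import math

EULER_GAMMA = 0.5772156649015329


def upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def hazard(z):
    """Normal density divided by the upper tail probability at z."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return pdf / upper_tail(z)


def tail_quantile(n, lo=0.0, hi=10.0):
    """Find z with P(Z >= z) = 1/n by bisection."""
    target = 1.0 / n
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if upper_tail(mid) > target:
            lo = mid  # tail still too heavy, move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


n = 10_000_000_000
b = tail_quantile(n)  # level exceeded by about one sample in n
rate = hazard(b)      # local exponential decay rate of the tail

# Gumbel approximation to the mean of the maximum.
expected_max = b + EULER_GAMMA / rate

# Approximate mean gaps between the k-th and (k+1)-th highest samples.
spacings = [1.0 / (k * rate) for k in (1, 2, 3)]

print(f"approx. expected maximum: {expected_max:.2f} SD")
print("approx. spacings:", [round(s, 3) for s in spacings])
```

The approximate spacings halve and then shrink to a third, the same pattern as the cited 0.147, 0.075, and 0.05.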

If em clan relative productivity is drawn from this same distribution, if maximum em productivity comes at a 70 hour workweek, and if the best and second best em clans do not coordinate on wages they accept, then even under the strongest wage competition between clans, the best clan could take an extra 20 minutes a day of leisure, or two minutes per work hour, in addition to the six minutes per hour and other work breaks they take to be maximally productive.

This 20 minute figure is an underestimate for four reasons. First, the effective sample size of ems is smaller due to age limits on desirable ems. Second, most parameters are distributed so that the tails are thicker than in the normal distribution (Reed and Jorgensen 2004).

Third, differing wealth effects may add to differing productivity effects. On average over the last 11 years, the five richest people on Earth have each been about 10 percent richer than the next richest person. If future em income ratios were like this current wealth ratio, then the best em worker could afford roughly an extra hour per day of leisure, or an additional six minutes per hour.

Fourth, competition probably does not take the strongest possible form, and the best few ems can probably coordinate to some extent. For example, if the best two em clans coordinate completely on wages, but compete strongly with the third best clan, then instead of the best and second best taking 20 and zero minutes of extra leisure per day, they could take 30 and 10 extra minutes, respectively.

Plausibly then, the best em workers can afford to take an additional two to six minutes of leisure per hour of work in a ten hour work day, in addition to the over six minutes per hour of break needed for maximum productivity.
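The leisure figures above follow from simple arithmetic, sketched here under stated assumptions: a 70 hour workweek is treated as seven ten-hour days, a wage premium is converted one-for-one into time off (a first-order approximation), and the 3.3 percent value is an assumed stand-in for "over three percent". The function name is hypothetical.

```python
def extra_leisure_minutes(premium, hours_per_day=10.0):
    """Minutes per day of extra leisure a wage premium could buy.

    Assumption: a worker who is `premium` (as a fraction) better paid
    than the next best can give up that same fraction of work time.
    """
    return premium * hours_per_day * 60.0


productivity_gap = 0.033  # assumed value for "over three percent"
wealth_gap = 0.10         # "about 10 percent richer"

for label, gap in [("productivity", productivity_gap), ("wealth", wealth_gap)]:
    per_day = extra_leisure_minutes(gap)
    per_hour = per_day / 10.0
    print(f"{label}: {per_day:.0f} min/day, {per_hour:.0f} min per work hour")
```

This gives roughly 20 minutes a day (two minutes per work hour) for the productivity gap and roughly an hour a day (six minutes per work hour) for the wealth gap, which brackets the two to six minute range.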
