School Is To Submit

Most animals in the world can’t be usefully domesticated. This isn’t because we can’t eat their meat, or feed them the food they need. It is because all animals naturally resist being dominated. Only rare social species can let a human sit in the role of dominant pack animal whom they will obey, and only if humans do it just right.

Most nations today would be richer if they had long ago just submitted wholesale to a rich nation, allowing that rich nation to change their laws, customs, etc., and just do everything their way. But this idea greatly offends national and cultural pride. So nations stay poor.

When firms and managers from rich places try to transplant rich practices to poor places, giving poor place workers exactly the same equipment, materials, procedures, etc., one of the main things that goes wrong is that poor place workers just refuse to do what they are told. They won’t show up for work reliably on time, have many problematic superstitions, hate direct orders, won’t accept tasks and roles that deviate from their non-work relative status with co-workers, and won’t accept being told to do tasks differently than they had done them before, especially when new ways seem harder. Related complaints are often made about the poorest workers in rich societies; they just won’t consistently do what they are told. It seems pride is a big barrier to material wealth.

The farming mode required humans to swallow many changes that didn’t feel nice or natural to foragers. While foragers are fiercely egalitarian, farmers are dominated by kings and generals, and have unequal property and classes. Farmers work more hours at less mentally challenging tasks, and get less variety via travel. Huge new cultural pressures, such as religions with moralizing gods, were needed to turn foragers into farmers.

But at work farmers are mostly autonomous and treated as the equal of workers around them. They may resent having to work, but adults are mostly trusted to do their job as they choose, since job practices are standardized and don’t change much over time. In contrast, productive industrial era workers must accept more local domination and inequality than would most farmers. Industry workers have bosses more in their face giving them specific instructions, telling them what they did wrong, and ranking them explicitly relative to their previous performance and to other nearby workers. They face more ambiguity and uncertainty about what they are supposed to do and how.

How did the industrial era get at least some workers to accept more domination, inequality, and ambiguity, and why hasn’t that worked equally well everywhere? A simple answer I want to explore in this post is: prestigious schools.

While human foragers are especially averse to even a hint of domination, they are also especially eager to take “orders” via copying the practices of prestigious folks. Humans have a uniquely powerful capacity for cultural evolution exactly because we are especially eager and able to copy what prestigious people do. So if humans hate industrial workplace practices when they see them as bosses dominating, but love to copy the practices of prestigious folks, an obvious solution is to habituate kids into modern workplace practices in contexts that look more like the latter than the former.

In his upcoming book, The Case Against Education, my colleague Bryan Caplan argues that school today, especially at the upper levels, functions mostly to help students signal intelligence, conscientiousness, and conformity to modern workplace practices. He says we’d be better off if kids did this via early jobs, but sees us as having fallen into an unfortunate equilibrium wherein individuals who try that seem non-conformist. I agree with Bryan that, compared with the theory that older students mostly go to school to learn useful skills, signaling better explains the low usefulness of school subjects, low transfer to other tasks, low retention of what is taught, low interest in learning relative to credentials, big last-year-of-school gains, and student preferences for cancelled classes.

My main problem with Caplan’s story so far (he still has time to change his book) is the fact that centuries ago most young people did signal their abilities via jobs, and the school signaling system has slowly displaced that job signaling system. Pressures to conform to existing practices can’t explain this displacement of a previous practice by a new practice. So why did signaling via school win out over signaling via early jobs?

Like early jobs, school can have people practice habits that will be useful in jobs, such as showing up on time, doing what you are told even when that is different from what you did before, figuring out ambiguous instructions, and accepting being frequently and publicly ranked relative to similar people. But while early jobs threaten to trip the triggers that make most animals run from domination, schools try to frame a similar habit practice in more acceptable terms, as more like copying prestigious people.

Forager children aren’t told what to do; they just wander around and do what they like. But they get bored and want to be respected like adults, so eventually they follow some adults around and ask to be shown how to do things. In this process they sometimes have to take orders, but only until they are no longer novices. They don’t have a single random boss they don’t respect, but can instead be trained by many adults, can select them to be the most prestigious adults around, and can stop training with each when they like.

Schools work best when they set up an apparently similar process wherein students practice modern workplace habits. Start with prestigious teachers, like the researchers who also teach at leading universities. Have students take several classes at a time, so they have no single “boss” who personally benefits from their following his or her orders. Make class attendance optional, and let students pick their classes. Have teachers continually give students complex assignments with new ambiguous instructions, using the excuse of helping students to learn new things. Have lots of students per teacher, to lower costs, to create excuses for having students arrive and turn in assignments on time, and to create social proof that other students accept all of this. Frequently and publicly rank student performance, using the excuse of helping students to learn and decide which classes and jobs to take later. And continue the whole process well into adulthood, so that these habits become deeply ingrained.

When students finally switch from school to work, most will find work to be similar enough to transition smoothly. This is especially true for desk professional jobs, and when bosses avoid giving direct explicit orders. Yes, workers now have one main boss, and can’t as often pick new classes/jobs. But they won’t be publicly ranked and corrected nearly as often as in school, even though such things will happen far more often than their ancestors would have tolerated. And if their job ends up giving them prestige, their prior “submission” to prestigious teachers will seem more appropriate.

This point of view can help explain how schools could help workers to accept habits of modern workplaces, and thus how there could have been selection for societies that substituted schools for early jobs or other child activities. It can also help explain unequal gains from school; some kinds of schools should be less effective than others. For example, teachers might not be prestigious, teachers may fail to show up on time to teach, teacher evaluations might correlate poorly with student performance, students might not have much choice of classes, school tasks might diverge too far from work tasks, students may not get prestigious jobs, or the whole process might continue too long into adulthood, long after the key habituation has been achieved.

In sum, while students today may mostly use schools to signal smarts, drive, and conformity, we need something else to explain how school displaced early work in this signaling role. One plausible story is that schools habituate students in modern workplace habits while on the surface looking more like prestigious forager teachers than like the dominating bosses that all animals are primed to resist. But this hardly implies that everything today that calls itself a school is equally effective at producing this benefit.


Super-Voting Scenario

I recently posted on a hypothetical “kilo-vote” scenario intended to help show that most of us don’t vote mainly to influence who wins the election. However, the ability of any given scenario to convince a reader of such a result depends on many details of the scenario, and of reader beliefs about behavior. So on reflection, I’ve come up with a new scenario I think can persuade more people, because in it fewer things change from the prototypical voting scenario.

Imagine that polls stayed open for a month before the election deadline, and that a random one percent of voters were upgraded to “super-voters,” who can privately vote up to twenty times, as long as they wait at least an hour between votes. When a super-voter votes all twenty times, their votes are doubled, and counted as forty votes. “Privately” means no one else ever knows that this person was a super-voter. (Yes that could be hard to achieve, but just assume that it is achieved somehow.)

To a voter who cares mainly about picking the election winner, and who casts only a tiny fraction of the votes, the value of voting is proportional to their number of votes. Twice the votes gives twice the value. If such a person votes when they are an ordinary voter, then they should be greatly tempted to vote twenty times as a super-voter; their costs aren’t much more than twenty times their costs from voting once, yet for that effort they get forty votes.
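To make the proportionality argument concrete, here is a minimal expected-value sketch; the particular cost and value numbers are invented for illustration, not taken from the scenario:

```python
# Expected-value sketch for the super-voter scenario (illustrative numbers).
# Voting once costs c and yields one vote's worth of influence v; voting
# twenty times costs about 20*c, but the doubling rule counts 40 votes.

def net_value(votes_cast, cost_per_vote, value_per_vote):
    """Net value for an outcome-focused voter under the doubling rule."""
    counted = 40 if votes_cast == 20 else votes_cast  # 20 votes double to 40
    return counted * value_per_vote - votes_cast * cost_per_vote

c, v = 1.0, 5.0  # illustrative: a vote's influence is worth more than its cost
once = net_value(1, c, v)     # 1*5 - 1*1  = 4.0
twenty = net_value(20, c, v)  # 40*5 - 20*1 = 180.0

# If voting once is worthwhile, voting twenty times is far more so:
# influence scales by 40x while cost scales by only 20x.
assert once > 0 and twenty > 20 * once
```

The point of the sketch is just that, for any positive per-vote value, an outcome-focused voter who votes at all should find the full twenty votes strictly more attractive.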

I feel pretty sure that most of the people assigned to super-voter status would not in fact vote twenty times. Yes I haven’t tested this, but I’d be willing to bet on it. Most voters care a lot more about seeming to have done their duty than they do about maximizing any new opportunities that arise from being assigned super-voter status. So most super-voters would think they’d done their duty with their first vote. After all, if voting once is good enough for ordinary voters who are not assigned to super-voter status, why shouldn’t that be good enough for super-voters as well?


How Good 99% Brains?

Software systems are divided into parts, and we have two main ways to measure the fraction of a system that each part represents: lines of code, and resources used. Lines (or bits) of code is a rough measure of the amount of understanding that a part embodies, i.e., how hard it is to create, modify, test, and maintain. For example, a system that is more robust or has a wider range of capacities typically has more lines of code. Resources used include processors, memory, and communication between these items. Resources measure how much it costs to use each part of the system. Systems that do very narrow tasks that are still very hard typically take more resources.

Human brains can be seen as software systems composed of many parts. Each brain occupies a spatial volume, and we can measure the fraction of each brain part via the volume it takes up. People sometimes talk about measuring our understanding of the brain in terms of the fraction of brain volume that is occupied by systems we understand. For example, if we understand parts that take up a big fraction of brain volume, some are tempted to say we are a big fraction of the way toward understanding the brain.

However, using the software analogy, brain volume seems usually to correspond more closely to resources used than to lines of code. For example, brain volumes seem to have roughly similar levels of activity, which isn’t what we’d expect if they corresponded more to lines of code than to resources used.

Consider two ways that we might shrink a software system: we might cut 1% of the lines of code, or 1% of the resources used. If we cut 1% of the resources used via cutting the lines of code that use the fewest resources, we will likely severely limit the range of abilities of a broadly capable system. On the other hand, if we cut the most modular 1% of the lines of code, that system’s effectiveness and range of abilities will probably not fall by remotely as much.

So there can be a huge variation in the effective lines of code corresponding to each brain region, and the easiest parts to understand are probably those with the fewest lines of code. So understanding the quarter of brain volume that is easiest to understand might correspond to understanding only 1% or less of lines of code. And continuing along that path we might understand 99% of brain volume and still be a very long way from being able to create a system that is as productive or useful as a full human brain.
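This asymmetry between resource share and code share can be illustrated with a toy model; all of the numbers below are invented purely for illustration:

```python
# Toy model (invented numbers): one "hot" part uses most of the resources
# with little code, while many small capability modules hold most of the
# code but use almost no resources each.

parts = [("core", 1_000, 0.90)]  # (name, lines_of_code, resource_share)
parts += [(f"capability_{i}", 5_000, 0.001) for i in range(100)]

total_lines = sum(lines for _, lines, _ in parts)
core_lines = parts[0][1]

# Fully understanding the part that fills 90% of the "volume" (resources)
# covers only a sliver of the code in this toy system:
print(round(core_lines / total_lines, 4))  # 0.002, i.e. ~0.2% of the lines
```

Under these made-up numbers, understanding the region that dominates resource use leaves roughly 99.8% of the code not yet understood, which is the shape of the worry in the text above.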

This is why I’m not very optimistic about creating human level AI before brain emulations. Yes, when we have nearly the ability to emulate a whole brain, we will have better data and simulations to help us understand brain parts. But the more brain parts there are to understand, the harder it will be to understand them all before brain emulation is feasible.

Those who expect AI-before-emulations tend to think that there just aren’t that many brain parts, i.e., that the brain doesn’t really embody very many lines of code. Even though the range of capacities of a human brain, even a baby brain, seems large compared to most known software systems, these people think that this analogy is misleading. They guess that in fact there is a concise powerful theory of intelligence that will allow huge performance gains once we understand it. In contrast, I see the analogy to familiar software as more relevant; the vast capacity of human brains suggests they embody the equivalent of a great many lines of code. Content matters more than architecture.


Me in London, Oxford, Istanbul

I leave Friday on a nine day trip to give six talks, all but one on Age of Em:


What Price Kilo-Votes?

Imagine that at every U.S. presidential election, the system randomly picked one U.S. voter and asked them to pay a fee to become a “kilo-voter.” Come election day, if there is a kilo-voter then the election system officially tosses sixteen fair coins. If all sixteen coins come up heads, the kilo-voter’s vote decides the election. If not, or if there is no kilo-voter, the election is decided as usual via ordinary votes. The kilo-voter only gets to pick between Democrat and Republican nominees, and no one ever learns that they were the kilo-voter that year.

“Kilo-voters” are so named because they have about a thousand times the chance of deciding the election that an ordinary voter does. In the 2008 U.S. presidential election the average voter had a one in sixty million chance of deciding who won the election. The chance that sixteen fair coins all come up heads is roughly a thousand times larger than this.
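The “roughly a thousand times” figure is easy to verify from the numbers above:

```python
# Check: P(16 fair coins all heads) vs. a 1-in-60-million chance of being
# the decisive ordinary voter (the 2008 estimate quoted in the post).

p_kilo = 0.5 ** 16             # = 1/65536
p_ordinary = 1 / 60_000_000

ratio = p_kilo / p_ordinary
print(round(ratio, 1))  # 915.5 -- roughly a thousand
```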

Consider: 1) How much is the typical voter willing to pay to become a kilo-voter? and 2) How much does it cost the typical voter, in time and trouble, to actually vote in a U.S. presidential election? As long as these numbers are both small compared to a voter’s wealth, then for a voter motivated primarily by the chance to change the election outcome, these numbers should differ by at least a factor of one thousand.

For example, if it takes you at least a half hour to get to the voting booth and back, and to think beforehand about your vote, and if you make the average U.S. hourly wage of $20, then voting costs you at least $10. In this case you should be willing to pay at least $10,000 to become a kilo-voter, if you are offered the option. Me, I very much doubt that typical voters would pay $10,000 to become secret kilo-voters.
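Spelling out that arithmetic:

```python
# The post's arithmetic: voting costs a half hour at the average $20/hour
# wage, and a kilo-voter is ~1000x as likely to be decisive, so an
# outcome-focused voter should value kilo-voter status at ~1000x the
# cost of an ordinary vote.

hours_to_vote = 0.5
hourly_wage = 20.0
cost_of_voting = hours_to_vote * hourly_wage   # $10

decisiveness_ratio = 1_000
willingness_to_pay = decisiveness_ratio * cost_of_voting
print(willingness_to_pay)  # 10000.0
```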

Yes, the 2008 election influenced the lives of 305 million U.S. residents, and someone who cared enough might pay a lot for a higher chance of deciding such an election. But typical voters would not pay a lot. Which suggests that the chance to decide the election is just not the main reason that they vote. The chance of being decisive actually doesn’t seem to matter remotely as much to typical voting behavior as it should to someone focused on changing outcomes. For example, states where voters have much higher chances of being decisive about the president don’t have much higher voter turnout rates, and turnout is actually lower in local and state elections where the chances of being decisive is higher.

My conclusion: we don’t mainly vote to change the outcome.


Age of Em Update

My first book, The Age of Em: Work, Love, and Life When Robots Rule the Earth, is moving along toward its June 1 publication date (in UK, a few weeks later in US). A full book jacket is now available:

[book jacket image]

Blurbs are also now available, from: Sean Carroll, Marc Andreessen, David Brin, Andrew McAfee, Erik Brynjolfsson, Matt Ridley, Hal Varian, Tyler Cowen, Vernor Vinge, Steve Fuller, Bryan Caplan, Gregory Benford, Kevin Kelly, Ben Goertzel, Tim Harford, Geoffrey Miller, Tim O’Reilly, Scott Aaronson, Ramez Naam, Hannu Rajaniemi, William MacAskill, Eliezer Yudkowsky, Zach Weinersmith, Robert Freitas, Neil Jacobstein, Ralph Merkle, and Michael Chwe.

Kindle and Audible versions are in the works, as is a Chinese translation.

I have a page that lists all my talks on the book, many of which I’ll also post about here at this blog.

Abstracts for each of the thirty chapters should be available to see within a few weeks.


How Different AGI Software?

My ex-co-blogger Eliezer Yudkowsky recently made a Facebook post saying that recent AI Go progress confirmed his predictions from our foom debate. He and I then discussed this there, and I thought I’d summarize my resulting point of view here.

Today an individual firm can often innovate well in one of its products via a small team that keeps its work secret and shares little with other competing teams. Such innovations can be lumpy in the sense that gain relative to effort varies over a wide range, and a single innovation can sometimes make a big difference to product value.

However big lumps are rare; typically most value gained is via many small lumps rather than a few big ones. Most innovation comes from detailed practice, rather than targeted research, and abstract theory contributes only a small fraction. Innovations vary in their generality, and this contributes to the variation in innovation lumpiness. For example, a better washing machine can better wash many kinds of clothes.

If instead of looking at individual firms we look at nations as a whole, the picture changes because a nation is an aggregation of activities across a great many firm teams. While one firm can do well with a secret innovation team that doesn’t share, a big nation would hurt itself a lot by closing its borders to stop sharing with other nations. Single innovations make a much smaller difference to nations as a whole than they do to individual products. So nations grow much more steadily than do firms.

All of these patterns apply not just to products in general, but also to the subcategory of software. While some of our most general innovations may be in software, most software innovation is still made of many small lumps. Software that is broadly capable, such as a tool-filled operating system, is created by much larger teams, and particular innovations make less of a difference to its overall performance. Most software is created via tools that are shared with many other teams of software developers.

From an economic point of view, a near-human-level “artificial general intelligence” (AGI) would be a software system with a near-human level competence across almost the entire range of mental tasks that matter to an economy. This is a wide range, much more like the scope of abilities found in a nation than that found in a firm. In contrast, an AI Go program has a far more limited range of abilities, more like those found in typical software products. So even if the recent Go program was made by a small team and embodies lumpy performance gains, it is not obviously a significant outlier relative to the usual pattern in software.

It seems to me that the key claim made by Eliezer Yudkowsky, and others who predict a local foom scenario, is that our experience in both ordinary products in general and software in particular is misleading regarding the type of software that will eventually contribute most to the first human-level AGI. In products and software, we have observed a certain joint distribution over innovation scope, cost, value, team size, and team sharing. And if that were also the distribution behind the first human-level AGI software, then we should predict that it will be made via a great many people in a great many teams, probably across a great many firms, with lots of sharing across this wide scope. No one team or firm would be very far in advance of the others.

However, the key local foom claim is that there is some way for small teams that share little to produce innovations with far more generality and lumpiness than these previous distributions suggest, perhaps due to being based more on math and basic theory. This would increase the chances that a small team could create a program that grabs a big fraction of world income, and keeps that advantage for an important length of time.

Presumably the basis for this claim is that some people think they see a different distribution among some subset of AI software, perhaps including machine learning software. I don’t see it yet, but the obvious way for them to convince skeptics like me is to create and analyze a formal dataset of software projects and innovations. Show us a significantly-deviating subset of AI programs with more economic scope, generality, and lumpiness in gains. Statistics from such an analysis could let us numerically estimate the chances of a single small team encompassing a big fraction of AGI software power and value.

That is, we might estimate the chances of local foom. Which I’ve said isn’t zero; I’ve instead just suggested that foom has gained too much attention relative to its importance.


Can’t Stop Lecturing

Imagine a not-beloved grade school teacher who seemed emotionally weak to his students, and was fastidious about where exactly everything was on his desk and in his classroom. If the students moved things around when the teacher wasn’t looking, this teacher would be visibly disturbed, and would give long boring lectures against such behavior. This sort of reaction might well encourage students to move things, just to get a rise out of the teacher.

Imagine a daughter who felt overly controlled and under-considered by clueless parents, and who was attracted to and tempted to get involved with a particular “bad boy.” Imagine that these parents seemed visibly disturbed by this, and went out of their way to lecture her often about why bad boys are a bad idea, though never actually telling her anything she didn’t think she already knew. In such a case, this daughter might well be more tempted to date this bad boy, just to bother her parents.

Today a big chunk of the U.S. electorate feels neglected by a political establishment that they don’t especially respect, and is tempted to favor political bad boy Donald Trump. The main response of our many establishments, especially over the last few weeks, has been to constantly lecture everyone about how bad an idea this would be. Most of this lecturing, however, doesn’t seem to tell Trump supporters anything they don’t think they already know, and little of it acknowledges reasonable complaints regarding establishment neglect and incompetence.

By analogy with these other cases, the obvious conclusion is that all this tone-deaf sanctimonious lecturing will not actually help reduce interest in Trump, and may instead increase it. But surely an awful lot of our establishments must be smart enough to have figured this out. Yet the tsunami of lectures continues. Why?

A simple interpretation in all of these cases is that people typically care more about making sure they are seen to take a particular moral stance than they care about the net effect of their lectures on behavior. The teacher with misbehaving students cares more about showing everyone he has a valid complaint than he does about reducing misbehavior. The parents of a daughter dating a bad boy care more about showing they took the correct moral stance than they do about whether she actually dates him. And members of the political establishment today care more about making it clear that they oppose Trump than they do about actually preventing him from becoming president.


Against DWIM Meta-Law

Smart capable personal assistants can be very useful. You give them vague and inconsistent instructions, and they “do what I mean” (DWIM), fixing your mistakes. If you empower them to control your interactions, you have less need to fear mistakes messing up those interactions.

But one thing a DWIM personal assistant can’t help you so much with is your choice of assistants. If assistants were empowered to use DWIM on your choice to fire them, they might tend to decide you didn’t really mean to fire them. So if you are to have an effective choice of assistants, and thus effective competition among potential assistants, then those same assistants can’t protect you much from possible mistakes in your meta-choices regarding assistants. They can protect you from other choices, but not that choice.

The same applies to letting people choose what city or nation to live in. When people live in a nation then that national government can use regulation to protect them from making many mistakes. For example, it can limit their legally available options of products, services, and contracts. But if people are to have an effective choice to change governments by changing regions, then such governments can’t use regulation much to protect people from mistakes regarding region choice. After all, a government authorized to declare your plan to move away from it to be a mistake can stop you from rejecting it.

Similarly we can elect politicians who pass laws to protect us from many mistakes. But if we are to have an effective choice of politicians to represent us, then they can’t protect us much from bad choices of politicians to represent us. We can’t let our current elected leaders much regulate who we can elect to replace them, if we are to be able to actually replace them.

I’ve long been intrigued by the idea of private law, wherein people can stay in the same place but contract with different legal systems, which then set the rules regarding their legal interactions with others. Such rules might in effect change the laws of tort, crime, marriage, etc. that people live under. And so such competition between private laws might push the law to evolve toward more efficient laws.

One of the things that legal systems tend to do is to protect people from mistakes. For example, contract law won’t enforce contracts it sees as mistakes, and it fills in contract holes it sees resulting from laziness. Law is often DWIM law. Which can be great when you trust your law to choose well. But if one is to have an effective choice of private law, and real competition for that role, then one’s current law shouldn’t be able to overrule one’s choice of a new law. Instead, one’s choice of a private legal system, like one’s choice of nation, needs to be a simple clear choice where one is not much protected from mistakes.

Today we don’t in fact have such private law, because our standard legal system won’t enforce contracts we sign that declare our intent to use different legal systems. To achieve private law, we’d need to change this key feature of our standard legal system.

Your choice to change nations, either for temporary travel or for permanent moves, can be a big mistake. It might result from temporary mood fluctuations, or from misunderstandings about the old nation or the new. Nevertheless we have little regulation of such choices. Instead individuals are mostly fully exposed to their possible mistakes. For example, while Europe is heavily regulated in general, European teens today can decide to go join ISIS, even when many others greatly regret such choices. We disapprove of nations that prevent people from leaving because that cuts competition between nations to serve people.

Similarly, if we want competition between legal systems without forcing people to move, we’ll have to change our law to accept our not protecting people from bad choices of legal systems. There will have to be a simple clear act by which one chooses a law, a choice not much subject to legal review and reversal. We’d want to encourage people to take such choices seriously, but then to accept the choices they make. Freedom of choice requires a freedom to make mistakes. For big choices, those can be big mistakes.


Scared, Sad, Angry, Bitter

These four emotions: scared, sad, angry, and bitter, all suggest that one has suffered or will suffer a loss. So all of them might inspire empathy and help from others. But they don’t do so equally. Consider the selfish costs of expressing empathy for these four emotions.

While a scared person hasn’t actually suffered a loss yet, the other kinds of feelings indicate that an actual loss has been suffered. So the scared person is not yet a loser, while the others are losers. When there are costs of associating with losers, those costs are lowest for the scared. For example, if it takes real resources to help someone who has suffered a loss, the scared person is less likely to need such resources.

People who are angry or bitter blame particular other people for their loss. So by expressing empathy with or helping such people, you risk getting involved in conflicts with those other people. In contrast, helping people who are just sad is less likely to get you into such conflicts.

People who are angry tend to think they have a substantial chance of winning a conflict with those they blame for their loss. Anger is a more visible emotion that drives one more toward overt conflict. Angry people are visibly trying to recruit others to their fight.

In contrast, bitter people tend to think they have little chance of winning an overt conflict, at least for now. So bitter people tend to fume in private, waiting for their chance to hit back unseen. If you help a bitter person, you may get blamed when their hidden attacks are uncovered, and your support may tempt them to become angry and start an overt fight. So by helping a bitter person, you are more likely to be on the losing end of a conflict.

These considerations suggest that our cost of empathizing with and helping people with these emotions increases in this order: scared, sad, angry, and bitter. And this also seems to describe the order in which we actually feel less empathy; we feel less empathy when its costs are higher.

Note that this same order also describes who has suffered a larger loss, on average. Scared people expect to suffer the smallest loss, while bitter people suffer the largest loss. (Ask yourself which emotion you’d rather feel.) So our willingness to express empathy with those who suffer a loss is inverse to the loss they suffer. We empathize the most with those who suffer the least. Because that is cheapest.

Thanks to Carl Shulman for pointing out to me the social risks of helping bitter folk, relative to sad folk.

Added 18Feb: Interestingly, many lists of emotions don’t include bitterness or an equivalent. It is as if we’d like to pretend it just doesn’t exist.
