MacAskill on Value Lock-In

Will MacAskill has a new book out today, What We Owe The Future, most of which I agree with, even if that doesn’t exactly break new ground. Yes, the future might be very big, and that matters a lot, so we should be willing to do a lot to prevent extinction, collapse, or stagnation. I hope his book induces more careful future analysis, such as I tried in Age of Em. (FYI, MacAskill suggested that book’s title to me.) I also endorse his call for more policy and institutional experimentation. But, as is common in book reviews, I now focus on where I disagree.

Aside from the future being important, MacAskill main concern in his book is “value lock-in”, by which he means a future point in time when the values that control actions stop changing. But he actually mixes up two very different processes by which this result might arise. First, an immortal power with stable values might “take over the world”, and prevent deviations from its dictates. Second, in a stable universe decentralized competition between evolving entities might pick out some most “fit” values to be most common.

MacAskill’s most dramatic predictions are about this first “take over” process. He claims that the next century or so is the most important time in all of human history:

We hold the entire future in our hands. … By choosing wisely, we can be pivotal in putting humanity on the right course. … The values that humanity adopts in the next few centuries might shape the entire trajectory of the future. … Whether the future is government by values that are authoritarian or egalitarian, benevolent or sadistic, exploratory or rigid, might well be determined by what happens this century.

His reason: we will soon create AGI, or ems, who, being immortal, have forever stable values. Some org will likely use AGI to “take over the world”, and freeze in their values forever:

Advanced artificial intelligence could enable those in power to to lock in their values indefinitely. … Since [AGI] software can be copied with high fidelity, an AGI can survive changes in the hardware instantiating it. AGI agents are potentially immortal. These two features of AGI – potentially rapid technological progress and in-principle immortality – combine to make value lock-in a real possibility. …

Using AGI, there are a number of ways that people could extend their values much farther into the future than ever before. First, people may be able to create AGI agents with goals closely assigned with their own which would act on their behalf. … [Second,] the goals of an AGI could be hard-coded: someone could carefully specify what future white want to see and ensure that the AGI aims to achieve it. … Third, people could potentially “upload”. …

International organizations or private actors may be able to leverage AGI to attain a level of power not seen since the days of the East India Company, which in effect ruled large areas of India. …

A single set of values could emerge. …The ruling ideology could in principle persist as long as civilization does. AGI systems could replicate themselves as many times as they wanted, just as easily as we can replicate software today. They would be immortal, freed from the biological process of aging, able to create back-ups of themselves and copy themselves onto new machines. … And there would not longer be competing value systems that could dislodge the status quo. …

Bostrom’s book Superintelligence. The scenario most closely associated with that book is one in which a single AI agent … quickly developing abilities far greater than the abilities of all of humanity combined. … It would therefore be incentivize to take over the world. … Recent work has looked at a broader range of scenarios. The move from subhuman intelligence to super intelligence need not be ultrafast or discontinuous to post a risk. And it need not be a single AI that takes over; it could be many. …

Values could become even more persistent in the future if a single value system were to become global dominant. If so, then the absence of conflict and competition would remove one reason for change in values over time. Conquest is the most dramatic pathway … and it may well be the most likely.

Now mere immortality seems far from sufficient to create either value stability or a takeover. On takeover, immortality is insufficient. Not only is a decentralized world of competing immortals easy to imagine, but in fact until recently individual bacteria, who very much compete, were thought to be immortal.

On values, immortality also seems far from sufficient to induce stable values. Human organizations like firms, clubs, cities, and nations seem to be roughly immortal, and yet their values often greatly change. Individual humans change their values over their lifetimes. Computer software is immortal, and yet its values often change, and it consistently rots. Yes, as I mentioned in my last post, some imagine that AGIs have a special value modularity that can ensure value stability. But we have many good reasons to doubt that scenario.

Thus MacAskill must be positing that a power who somehow manages to maintain stable values takes over and imposes its will everywhere forever. Yet the only scenario he points to that seems remotely up to this task is Bostrom’s foom scenario. MacAskill claims that other scenarios are also relevant, but doesn’t even try to show how they could produce this result. For reasons I’ve given many times before, I’m skeptical of foom-like scenarios.

Furthermore, let me note that even if one power came to dominate Earth’s civilization for a very long time, it would still have to face competition from other grabby aliens in roughly a billion years. If so, forever just isn’t at issue here.

While MacAskill doesn’t endorse any regulations to deal with this stable-AGI-takes-over scenario, he does endorse regulations to deal with the other path to value stability: evolution. He wants civilization to create enough of a central power that it could stop change for a while, and also limit competition between values.

The theory of cultural evolution explains why many moral changes are contingent. … the predominant culture tends to entrench itself. … results in a world increasingly dominated by cultures with traits that encourage and enable entrenchment and thus persistence. …

If we don’t design our institutions to govern this transition well – preserving a plurality of values and the possibility of desirable moral progress. …

A second way for a culture to become more powerful is immigration [into it]. … A third way in which a cultural trait can gain influence is if it gives one group greater ability to survive or thrive in a novel environment. … A final way in which one culture can outcompete another is via population growth. … If the world converged on a single value system, there would be much less pressure on those values to change over time.

We should try to ensure that we have made as much moral progress as possible before any point of lock-in. … As an ideal, we could aim for what we could call the long reflection: a stable state of the world in which we are safe from calamity and can reflect on and debate the nature of the good life, working out what the more flourishing society would be. … It would therefore be worth spending many centuries to ensure that we’ve really figured things out before taking irreversible actions like locking in values or spreading across the stars. …

We would need to keep our options open as much as possible … a reason to prevent smaller-scale lock-ins … would favor political experimentation – increasing cultural and political diversity, if possible. …

That one society has greater fertility than another or exhibits faster economic growth does not imply that society is morally superior. In contrast, the most important mechanisms for improving our moral views are reason, reflection, and empathy, and the persuasion of others based on those mechanisms. … Certain forms of free speech would therefore be crucial to enable better ideas to spread. …

International norms or laws preventing any single country from becoming too populous, just as anti-trust regulation prevents any single company from dominating a market. … The lock-in paradox. We need to lock-in some institutions and ideas in order to prevent a more thorough-going lock-in of values. … If we wish to avoid the lock-in of bad moral views, an entirely laissez-faire approach would not be possible; over time, the forces of cultural evolution would dictate how the future goes, and the ideologies that lead to the greatest military powered that try to eliminate their competition would suppress all others.

I’ve recently described my doubts that expert deliberation has been a large force in value change so far. So I’m skeptical that will be a large force in the future. And the central powers (or global mobs) sufficient to promote a long reflection, or to limit nations competing, seem to risk creating value stability via the central dominance path discussed above. MacAskill doesn’t even consider this kind of risk from his favored regulations.

While competition may produce a value convergence in the long run, my guess is that convergence will happen a lot faster if we empower central orgs or mobs to regulate competition. I think that a great many folks prefer that latter scenario because they believe we know what are the best values, and fear that those values would not win an evolutionary competition. So they want to lock in current values via regs to limit competition and value change.

To his credit, MacAskill is less confident that currently popular values are in fact the best values. And his favored solution of more deliberation probably would’t hurt. I just don’t think he realizes just how dangerous are central powers able to regulate to promote deliberation and limit competition. And he seems way too confident about the chance of anything like foom soon.

GD Star Rating
loading...
Tagged as: ,
Trackback URL: