Search Results for: foom

Foom Update

To extend our reach, we humans have built tools, machines, firms, and nations. And as these are powerful, we try to maintain control of them. But as efforts to control them usually depend on their details, we have usually waited to think about how to control them until we had concrete examples in front of us. In the year 1000, for example, there wasn’t much we could do to usefully think about how to control most things that have only appeared in the last two centuries, such as cars or international courts.

Someday we will have far more powerful computer tools, including “advanced artificial general intelligence” (AAGI), i.e., tools with capabilities even higher and broader than those of individual human brains today. And some people today spend substantial effort worrying about how we will control these future tools. Their most common argument for this unusual strategy is “foom”.

That is, they postulate a single future computer system, initially quite weak and fully controlled by its human sponsors, but capable of action in the world and with general values to drive such action. Then over a short time (days to weeks) this system dramatically improves (i.e., “fooms”) to become an AAGI far more capable even than the sum total of all then-current humans and computer systems. This happens via a process of self-reflection and self-modification, and this self-modification also produces large and unpredictable changes to its effective values. They seek to delay this event until they can find a way to prevent such dangerous “value drift”, and to persuade those who might initiate such an event to use that method.

I’ve argued at length (1 2 3 4 5 6 7) against the plausibility of this scenario. It’s not that it’s impossible, or that no one should work on it, but that far too many take it as a default future scenario. But I haven’t written on it for many years now, so perhaps it is time for an update. Recently we have seen noteworthy progress in AI system demos (if not yet commercial application), and some have urged me to update my views as a result.

The recent systems have used relatively simple architectures and basic algorithms to produce models with enormous numbers of parameters from very large datasets. Compared to prior systems, these systems have produced impressive performance on an impressively wide range of tasks, even though they are still quite far from displacing humans in any substantial fraction of their current tasks.
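
As a rough illustration of that “simple architecture, enormous parameter count” pattern, here is a minimal back-of-envelope sketch; the counting rule is a standard approximation for decoder-style language models, and the configuration numbers are hypothetical illustrative values, not figures taken from this post.

```python
# Rough parameter count for a decoder-style language model: a few simple
# components, repeated many times, yield an enormous parameter count.
# (Illustrative approximation only, not a description of any specific system.)

def approx_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Approximate parameter count for a standard transformer decoder.

    Each layer contributes roughly 12 * d_model^2 parameters (attention
    projections plus the feed-forward block); the embedding table adds
    roughly vocab_size * d_model more.
    """
    per_layer = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings

# Hypothetical configuration in the rough range of recent large models.
print(f"approx parameters: {approx_params(n_layers=96, d_model=12288, vocab_size=50000):.3e}")
# approx parameters: ~1.7e11, from a handful of repeated, simple components
```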

For the purpose of reconsidering foom, however, the key things to notice are: (1) these systems have kept their values quite simple and very separate from the rest of the system, and (2) they have done basically zero self-reflection or self-improvement. As I see AAGI as still a long way off, the features of these recent systems can only offer weak evidence regarding the features of AAGI.

Even so, recent developments offer little support for the hypothesis that AAGI will be created soon via a process of self-reflection and self-improvement, or for the hypothesis that such a process risks large “value drifts”. The ways we are now moving toward AAGI just don’t look much like the foom scenario. And I don’t see them as saying much about whether ems or AAGI will appear first.

Again, I’m not saying foom is impossible, just that it looks unlikely, and that recent events haven’t made it seem more so.

These new systems do suggest a substantial influence of architecture on system performance, though not obviously at a level out of line with that in most prior AI systems. And note that the abilities of the very best systems here are not that much better than those of the 2nd and 3rd best systems, arguing weakly against AAGI scenarios where the best system is vastly better.


Tegmark’s Book of Foom

Max Tegmark says his new book, Life 3.0, is about what happens when life can design not just its software, as humans have done in Life 2.0, but also its hardware:

Life 1.0 (biological stage) evolves its hardware and software
Life 2.0 (cultural stage) evolves its hardware, designs much of its software
Life 3.0 (technological stage): designs its hardware and software …
Many AI researchers think that Life 3.0 may arrive during the coming century, perhaps even during our lifetime, spawned by progress in AI. What will happen, and what will this mean for us? That’s the topic of this book. (29-30)

Actually, it’s not. The book says little about redesigning hardware. While it says interesting things on many topics, its core is on a future “singularity” where AI systems quickly redesign their own software. (A scenario sometimes called “foom”.)

The book starts out with a 19-page fictional “scenario where humans use superintelligence to take over the world.” A small team, apparently seen as unthreatening by the world, somehow knows how to “launch” a “recursive self-improvement” in a system focused on “one particular task: programming AI Systems.” While initially “subhuman”, within five hours it redesigns its software four times and becomes superhuman at its core task, and so “could also teach itself all other human skills.”

After five more hours and redesigns it can make money by doing half of the tasks at Amazon Mechanical Turk acceptably well. And it does this without having access to vast amounts of hardware or to large datasets of previous performance on such tasks. Within three days it can read and write like humans, and create world class animated movies to make more money. Over the next few months it goes on to take over the news media, education, world opinion, and then the world. It could have taken over much faster, except that its human controllers were careful to maintain control. During this time, no other team on Earth is remotely close to being able to do this.

Tegmark later explains: Continue reading "Tegmark’s Book of Foom" »


Foom Justifies AI Risk Efforts Now

Years ago I was honored to share this blog with Eliezer Yudkowsky. One of his main topics then was AI Risk; he was one of the few people talking about it back then. We debated this topic here, and while we disagreed I felt we made progress in understanding each other and exploring the issues. I assigned a much lower probability than he to his key “foom” scenario.

Recently AI risk has become something of an industry, with far more going on than I can keep track of. Many call working on it one of the most effectively altruistic things one can possibly do. But I’ve searched a bit and as far as I can tell that foom scenario is still the main reason for society to be concerned about AI risk now. Yet there is almost no recent discussion evaluating its likelihood, and certainly nothing that goes into as much depth as did Eliezer and I. Even Bostrom’s book-length treatment basically just assumes the scenario. Many seem to think it obvious that if one group lets one AI get out of control, the whole world is at risk. It’s not (obvious).

As I just revisited the topic while revising Age of Em for paperback, let me try to summarize part of my position again here. Continue reading "Foom Justifies AI Risk Efforts Now" »


I Still Don’t Get Foom

Back in 2008 my ex-co-blogger Eliezer Yudkowsky and I discussed his “AI foom” concept, a discussion that we recently spun off into a book. I’ve heard for a while that Nick Bostrom was working on a book elaborating related ideas, and this week his Superintelligence was finally available to me to read, via Kindle. I’ve read it now, along with a few dozen reviews I’ve found online. Alas, only the two reviews on GoodReads even mention the big problem I have with one of his main premises, the same problem I’ve had with Yudkowsky’s views. Bostrom hardly mentions the issue in his 300 pages (he’s focused on control issues).

All of which makes it look like I’m the one with the problem; everyone else gets it. Even so, I’m gonna try to explain my problem again, in the hope that someone can explain where I’m going wrong. Here goes.

“Intelligence” just means an ability to do mental/calculation tasks, averaged over many tasks. I’ve always found it plausible that machines will continue to do more kinds of mental tasks better, and eventually be better at pretty much all of them. But what I’ve found hard to accept is a “local explosion.” This is where a single machine, built by a single project using only a tiny fraction of world resources, goes in a short time (e.g., weeks) from being so weak that it is usually beaten by a single human with the usual tools, to so powerful that it easily takes over the entire world. Yes, smarter machines may greatly increase overall economic growth rates, and yes such growth may be uneven. But this degree of unevenness seems implausibly extreme. Let me explain. Continue reading "I Still Don’t Get Foom" »


Foom Debate, Again

My ex-co-blogger Eliezer Yudkowsky last June:

I worry about conversations that go into “But X is like Y, which does Z, so X should do reinterpreted-Z”. Usually, in my experience, that goes into what I call “reference class tennis” or “I’m taking my reference class and going home”. The trouble is that there’s an unlimited number of possible analogies and reference classes, and everyone has a different one. I was just browsing old LW posts today (to find a URL of a quick summary of why group-selection arguments don’t work in mammals) and ran across a quotation from Perry Metzger to the effect that so long as the laws of physics apply, there will always be evolution, hence nature red in tooth and claw will continue into the future – to him, the obvious analogy for the advent of AI was “nature red in tooth and claw”, and people who see things this way tend to want to cling to that analogy even if you delve into some basic evolutionary biology with math to show how much it isn’t like intelligent design. For Robin Hanson, the one true analogy is to the industrial revolution and farming revolutions, meaning that there will be lots of AIs in a highly competitive economic situation with standards of living tending toward the bare minimum, and this is so absolutely inevitable and consonant with The Way Things Should Be as to not be worth fighting at all. That’s his one true analogy and I’ve never been able to persuade him otherwise. For Kurzweil, the fact that many different things proceed at a Moore’s Law rate to the benefit of humanity means that all these things are destined to continue and converge into the future, also to the benefit of humanity. For him, “things that go by Moore’s Law” is his favorite reference class.

I can have a back-and-forth conversation with Nick Bostrom, who looks much more favorably on Oracle AI in general than I do, because we’re not playing reference class tennis with “But surely that will be just like all the previous X-in-my-favorite-reference-class”, nor saying, “But surely this is the inevitable trend of technology”; instead we lay out particular, “Suppose we do this?” and try to discuss how it will work, not with any added language about how surely anyone will do it that way, or how it’s got to be like Z because all previous Y were like Z, etcetera. (more)

When we shared this blog, Eliezer and I had a long debate here on his “AI foom” claims. Later, we debated in person once. (See also slides 34,35 of this 3yr-old talk.) I don’t accept the above as characterizing my position well. I’ve written up summaries before, but let me try again, this time trying to more directly address the above critique.

Eliezer basically claims that the ability of an AI to change its own mental architecture is such a potent advantage as to make it likely that a cheap, unnoticed, and initially low-ability AI (a mere “small project machine in a basement”) could without warning over a short time (e.g., a weekend) become so powerful as to be able to take over the world.

As this would be a sudden big sustainable increase in the overall growth rate in the broad capacity of the world economy, I do find it useful to compare this hypothesized future event to the other past events that produced similar outcomes, namely a big sudden sustainable increase in the global broad capacity growth rate. The last three were the transitions to humans, farming, and industry.

I don’t claim there is some hidden natural law requiring such events to have the same causal factors or structure, or to appear at particular times. But I do think these events suggest a useful if weak data-driven prior on the kinds of factors likely to induce such events, on the rate at which they occur, and on their accompanying inequality in gains. In particular, they tell us that such events are very rare, that over the last three events gains have been spread increasingly equally, and that these three events seem mainly due to better ways to share innovations.
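
To make that weak data-driven prior concrete, here is a minimal sketch comparing rough doubling times across the three past transitions; the figures are loose illustrative round numbers assumed for the example, not data from this post.

```python
# Minimal sketch: how sharply did the world growth rate jump at each of the
# three past transitions? The doubling times below are loose, illustrative
# round numbers (assumptions, not figures from the post).

doubling_time_years = {
    "forager era": 250_000,
    "farming era": 1_000,
    "industry era": 15,
}

eras = list(doubling_time_years)
for prev, new in zip(eras, eras[1:]):
    speedup = doubling_time_years[prev] / doubling_time_years[new]
    print(f"{prev} -> {new}: growth roughly {speedup:,.0f}x faster")

# With these illustrative numbers:
#   forager era -> farming era: growth roughly 250x faster
#   farming era -> industry era: growth roughly 67x faster
```

The point is only that each past transition multiplied the growth rate by a large factor, which is the scale of event that the foom scenario posits for a single AI project.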

Eliezer sees the essence of his scenario as being a change in the “basic” architecture of the world’s best optimization process, and he sees the main prior examples of this as the origin of natural selection and the arrival of humans. He also sees his scenario as differing enough from the other studied growth scenarios as to make analogies to them of little use.

However, since most global bio or econ growth processes can be thought of as optimization processes, this comes down to his judgement on what counts as a “basic” structure change, and on how different such scenarios are from other scenarios. And in my judgement the right place to get and hone our intuitions about such things is our academic literature on global growth processes.

Economists have a big literature on processes by which large economies grow, increasing our overall capacities to achieve all the things we value. There are of course many other growth literatures, and some of these deal in growths of capacities, but these usually deal with far more limited systems. Of these many growth literatures it is the economic growth literature that is closest to dealing with the broad capability growth posited in a fast growing AI scenario.

It is this rich literature that seems to me the right place to find and hone our categories for thinking about growing broadly capable systems. One should review many formal theoretical models, and many less formal applications of such models to particular empirical contexts, collecting “data” points of what is thought to increase or decrease growth of what in what contexts, and collecting useful categories for organizing such data points.

With such useful categories in hand one can then go into a new scenario such as AI foom and have a reasonable basis for saying how similar that new scenario seems to old scenarios, which old scenarios it seems most like if any, and which parts of that new scenario are central vs. peripheral. Yes of course if this new area became mature it could also influence how we think about other scenarios.

But until we actually see substantial AI self-growth, most of the conceptual influence should go the other way. Relying instead primarily on newly made up categories and similarity maps between them, concepts and maps which have not been vetted or honed in dealing with real problems, seems to me a mistake. Yes of course a new problem may require one to introduce some new concepts to describe it, but that is hardly the same as largely ignoring old concepts.

So, I fully grant that the ability of AIs to intentionally change mind designs would be a new factor in the world, and it could make a difference for AI ability to self-improve. But while the history of growth over the last few million years has seen many dozens of factors come and go, or increase and decrease in importance, it has only seen three events in which overall growth rates greatly increased suddenly and sustainably. So the mere addition of one more factor seems unlikely to generate foom, unless our relevant categories for growth causing factors suggest that this factor is unusually likely to have such an effect.

This is the sense in which I long ago warned against over-reliance on “unvetted” abstractions. I wasn’t at all trying to claim there is one true analogy and all others are false. Instead, I argue for preferring to rely on abstractions, including categories and similarity maps, that have been found useful by a substantial intellectual community working on related problems. On the subject of an AI growth foom, most of those abstractions should come from the field of economic growth.


A History Of Foom

I had occasion recently to review again the causes of the few known historical cases of sudden permanent increases in capacity growth rates in broadly capable systems: humans, farmers, and industry. For each of these transitions, a large number of changes appeared at roughly the same time. The problem is to distinguish the key change that enabled all the other changes.

For humans, it seems that the most proximate cause of faster human than non-human growth was culture – a strong ability to reliably copy the behavior of others allowed useful behaviors to accumulate via a non-genetic path. A strong ritual ability was clearly key. It also helped to have language, to live in large bands friendly with neighboring bands, to cook and travel widely, etc., but these may not have been essential. Chimps are pretty good at culture compared to most animals, just not good enough to support sustained cultural growth.

For farming, it seems to me that the key was the creation of long range trade routes along which domesticated seeds and animals could move. It was the accumulation of domestication innovations that most fundamentally caused the growth in farmers, and it was these long range trade routes that allowed innovations to accumulate so much faster than they had for foragers.

How did farming enable long range trade? Since farmers stay in one place, they are easier to find, and can make more use of heavy physical capital. Higher density living requires less travel distance for trade. But perhaps most important, transferable domesticated seeds and animals embodied innovations directly, without requiring detailed copying of behavior. They were also useful in a rather wide range of environments.

On industry, the first burst of productivity at the start of the industrial revolution was actually in the farming sector, and had little to do with machines. It appears to have come from “amateur scientist” farmers doing lots of little local trials about what worked best, and then communicating them to farmers elsewhere who grew similar crops in similar environments, via “scientific society” like journals and meetings. These specialist networks could spread innovations much faster than could trade in seeds and animals.

Applied to machines, specialist networks could spread innovation even faster, because machine functioning depended even less on local context, and because innovations could be embodied directly in machines without the people who used those machines needing to learn them.

So far, it seems that the main causes of growth rate increases were better ways to share innovations. This suggests that when looking for what might cause future increases in growth rates, we also seek better ways to share innovations.

Whole brain emulations might be seen as allowing mental innovations to be moved more easily, by copying entire minds instead of having one mind train or teach another. Prediction and decision markets might also be seen as better ways to share info about which innovations are likely to be useful where. In what other ways might we dramatically increase our ability to share innovations?


Emulations Go Foom

Let me consider the AI-foom issue by painting a (looong) picture of the AI scenario I understand best, whole brain emulations, which I’ll call “bots.”  Here goes.

When investors anticipate that a bot may be feasible soon, they will estimate their chances of creating bots of different levels of quality and cost, as a function of the date, funding, and strategy of their project.  A bot more expensive than any (speedup-adjusted) human wage is of little direct value, but exclusive rights to make a bot costing below most human wages would be worth many trillions of dollars.

It may well be socially cost-effective to start a bot-building project with a 1% chance of success when its cost falls to the trillion dollar level.  But not only would successful investors probably gain only a small fraction of this net social value, it is unlikely that any investor group able to direct a trillion dollars could be convinced the project was feasible – there are just too many smart-looking idiots making crazy claims around.

But when the cost to try a 1% project fell below a billion dollars, dozens of groups would no doubt take a shot.  Even if they expected the first feasible bots to be very expensive, they might hope to bring that cost down quickly.  Even if copycats would likely profit more than they, such an enormous prize would still be very tempting.
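
Here is a back-of-envelope sketch of the investment logic in the last two paragraphs; the prize size and captured fraction are illustrative assumptions, and only the 1% success chance comes from the text.

```python
# Back-of-envelope expected-value sketch for a speculative bot project.
# All inputs except the 1% success chance are illustrative assumptions.

def expected_value(prize: float, capture_fraction: float,
                   p_success: float, cost: float) -> float:
    """Expected private payoff of attempting the project."""
    return p_success * capture_fraction * prize - cost

PRIZE = 10e12     # assume a "many trillions" prize, here $10T
CAPTURE = 0.05    # assume investors capture only a small slice of that value
P_SUCCESS = 0.01  # the 1% chance of success from the post

for cost in (1e12, 1e9):  # a trillion-dollar vs. a billion-dollar attempt
    ev = expected_value(PRIZE, CAPTURE, P_SUCCESS, cost)
    print(f"cost ${cost / 1e9:,.0f}B -> expected value ${ev / 1e9:,.0f}B")

# With these assumptions, a trillion-dollar attempt is deeply negative for
# private investors, while a sub-billion-dollar attempt is positive, which
# is why dozens of groups might take a shot at that price.
```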

The first priority for a bot project would be to create as much emulation fidelity as affordable, to achieve a functioning emulation, i.e., one you could talk to and so on.  Few investments today are allowed a decade of red ink, and so most bot projects would fail within a decade, their corpses warning others about what not to try.  Eventually, however, a project would succeed in making an emulation that is clearly sane and cooperative.

Continue reading "Emulations Go Foom" »


AI Go Foom

It seems to me that it is up to [Eliezer] to show us how his analysis, using his abstractions, convinces him that, more likely than it might otherwise seem, hand-coded AI will come soon and in the form of a single suddenly super-powerful AI.

As this didn’t prod a response, I guess it is up to me to summarize Eliezer’s argument as best I can, so I can then respond.  Here goes:

A machine intelligence can directly rewrite its entire source code, and redesign its entire physical hardware.  While human brains can in principle modify themselves arbitrarily, in practice our limited understanding of ourselves means we mainly only change ourselves by thinking new thoughts.   All else equal this means that machine brains have an advantage in improving themselves. 

A mind without arbitrary capacity limits, that focuses on improving itself, can probably do so indefinitely.  The growth rate of its "intelligence" may be slow when it is dumb, but gets faster as it gets smarter.  This growth rate also depends on how many parts of itself it can usefully change.  So all else equal, the growth rate of a machine intelligence must be greater than the growth rate of a human brain. 

No matter what its initial disadvantage, a system with a faster growth rate eventually wins.  So if the growth rate advantage is large enough then yes a single computer could well go in a few days from less than human intelligence to so smart it could take over the world.  QED.
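
The last step of this summary is simple compounding arithmetic; here is a minimal sketch, with arbitrary illustrative starting levels and growth rates, showing how a faster grower eventually overtakes any fixed head start.

```python
# Minimal sketch of "a faster growth rate eventually wins": two capabilities
# growing exponentially at different rates. Starting levels and rates are
# arbitrary illustrative assumptions.

def crossover_time(slow_start: float, slow_rate: float,
                   fast_start: float, fast_rate: float) -> int:
    """Return the first period at which the fast grower pulls ahead."""
    slow, fast, t = slow_start, fast_start, 0
    while fast <= slow:
        slow *= 1 + slow_rate
        fast *= 1 + fast_rate
        t += 1
    return t

# A system starting at 1/1000th the capability of the incumbent, but
# improving 10% per period versus the incumbent's 1% per period.
t = crossover_time(slow_start=1000.0, slow_rate=0.01,
                   fast_start=1.0, fast_rate=0.10)
print(f"fast grower overtakes after {t} periods")  # prints 81 with these numbers

# The dispute below is not over this arithmetic, but over whether a single
# machine could really gain and sustain so large a growth-rate advantage.
```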

So Eliezer, is this close enough to be worth my response?  If not, could you suggest something closer?


MacAskill on Value Lock-In

Will MacAskill has a new book out today, What We Owe The Future, most of which I agree with, even if that doesn’t exactly break new ground. Yes, the future might be very big, and that matters a lot, so we should be willing to do a lot to prevent extinction, collapse, or stagnation. I hope his book induces more careful future analysis, such as I tried in Age of Em. (FYI, MacAskill suggested that book’s title to me.) I also endorse his call for more policy and institutional experimentation. But, as is common in book reviews, I now focus on where I disagree.

Aside from the future being important, MacAskill’s main concern in his book is “value lock-in”, by which he means a future point in time when the values that control actions stop changing. But he actually mixes up two very different processes by which this result might arise. First, an immortal power with stable values might “take over the world”, and prevent deviations from its dictates. Second, in a stable universe, decentralized competition between evolving entities might pick out some most “fit” values to be most common.

MacAskill’s most dramatic predictions are about this first “take over” process. He claims that the next century or so is the most important time in all of human history:

We hold the entire future in our hands. … By choosing wisely, we can be pivotal in putting humanity on the right course. … The values that humanity adopts in the next few centuries might shape the entire trajectory of the future. … Whether the future is governed by values that are authoritarian or egalitarian, benevolent or sadistic, exploratory or rigid, might well be determined by what happens this century.

His reason: we will soon create AGI, or ems, who, being immortal, have forever stable values. Some org will likely use AGI to “take over the world”, and freeze in their values forever:

Advanced artificial intelligence could enable those in power to lock in their values indefinitely. … Since [AGI] software can be copied with high fidelity, an AGI can survive changes in the hardware instantiating it. AGI agents are potentially immortal. These two features of AGI – potentially rapid technological progress and in-principle immortality – combine to make value lock-in a real possibility. …

Using AGI, there are a number of ways that people could extend their values much farther into the future than ever before. First, people may be able to create AGI agents with goals closely aligned with their own which would act on their behalf. … [Second,] the goals of an AGI could be hard-coded: someone could carefully specify what future they want to see and ensure that the AGI aims to achieve it. … Third, people could potentially “upload”. …

International organizations or private actors may be able to leverage AGI to attain a level of power not seen since the days of the East India Company, which in effect ruled large areas of India. …

A single set of values could emerge. … The ruling ideology could in principle persist as long as civilization does. AGI systems could replicate themselves as many times as they wanted, just as easily as we can replicate software today. They would be immortal, freed from the biological process of aging, able to create back-ups of themselves and copy themselves onto new machines. … And there would no longer be competing value systems that could dislodge the status quo. …

Bostrom’s book Superintelligence. The scenario most closely associated with that book is one in which a single AI agent … quickly developing abilities far greater than the abilities of all of humanity combined. … It would therefore be incentivized to take over the world. … Recent work has looked at a broader range of scenarios. The move from subhuman intelligence to superintelligence need not be ultrafast or discontinuous to pose a risk. And it need not be a single AI that takes over; it could be many. …

Values could become even more persistent in the future if a single value system were to become globally dominant. If so, then the absence of conflict and competition would remove one reason for change in values over time. Conquest is the most dramatic pathway … and it may well be the most likely.

Now mere immortality seems far from sufficient to create either value stability or a takeover. On takeover, not only is a decentralized world of competing immortals easy to imagine, but in fact until recently individual bacteria, who very much compete, were thought to be immortal.

On values, immortality also seems far from sufficient to induce stable values. Human organizations like firms, clubs, cities, and nations seem to be roughly immortal, and yet their values often greatly change. Individual humans change their values over their lifetimes. Computer software is immortal, and yet its values often change, and it consistently rots. Yes, as I mentioned in my last post, some imagine that AGIs have a special value modularity that can ensure value stability. But we have many good reasons to doubt that scenario.

Thus MacAskill must be positing that a power who somehow manages to maintain stable values takes over and imposes its will everywhere forever. Yet the only scenario he points to that seems remotely up to this task is Bostrom’s foom scenario. MacAskill claims that other scenarios are also relevant, but doesn’t even try to show how they could produce this result. For reasons I’ve given many times before, I’m skeptical of foom-like scenarios.

Furthermore, let me note that even if one power came to dominate Earth’s civilization for a very long time, it would still have to face competition from other grabby aliens in roughly a billion years. If so, forever just isn’t at issue here.

While MacAskill doesn’t endorse any regulations to deal with this stable-AGI-takes-over scenario, he does endorse regulations to deal with the other path to value stability: evolution. He wants civilization to create enough of a central power that it could stop change for a while, and also limit competition between values.

The theory of cultural evolution explains why many moral changes are contingent. … the predominant culture tends to entrench itself. … results in a world increasingly dominated by cultures with traits that encourage and enable entrenchment and thus persistence. …

If we don’t design our institutions to govern this transition well – preserving a plurality of values and the possibility of desirable moral progress. …

A second way for a culture to become more powerful is immigration [into it]. … A third way in which a cultural trait can gain influence is if it gives one group greater ability to survive or thrive in a novel environment. … A final way in which one culture can outcompete another is via population growth. … If the world converged on a single value system, there would be much less pressure on those values to change over time.

We should try to ensure that we have made as much moral progress as possible before any point of lock-in. … As an ideal, we could aim for what we could call the long reflection: a stable state of the world in which we are safe from calamity and can reflect on and debate the nature of the good life, working out what the most flourishing society would be. … It would therefore be worth spending many centuries to ensure that we’ve really figured things out before taking irreversible actions like locking in values or spreading across the stars. …

We would need to keep our options open as much as possible … a reason to prevent smaller-scale lock-ins … would favor political experimentation – increasing cultural and political diversity, if possible. …

That one society has greater fertility than another or exhibits faster economic growth does not imply that society is morally superior. In contrast, the most important mechanisms for improving our moral views are reason, reflection, and empathy, and the persuasion of others based on those mechanisms. … Certain forms of free speech would therefore be crucial to enable better ideas to spread. …

International norms or laws preventing any single country from becoming too populous, just as anti-trust regulation prevents any single company from dominating a market. … The lock-in paradox. We need to lock in some institutions and ideas in order to prevent a more thorough-going lock-in of values. … If we wish to avoid the lock-in of bad moral views, an entirely laissez-faire approach would not be possible; over time, the forces of cultural evolution would dictate how the future goes, and the ideologies that lead to the greatest military power and that try to eliminate their competition would suppress all others.

I’ve recently described my doubts that expert deliberation has been a large force in value change so far. So I’m skeptical that it will be a large force in the future. And the central powers (or global mobs) sufficient to promote a long reflection, or to limit nations competing, seem to risk creating value stability via the central dominance path discussed above. MacAskill doesn’t even consider this kind of risk from his favored regulations.

While competition may produce a value convergence in the long run, my guess is that convergence will happen a lot faster if we empower central orgs or mobs to regulate competition. I think that a great many folks prefer that latter scenario because they believe we know what the best values are, and fear that those values would not win an evolutionary competition. So they want to lock in current values via regs to limit competition and value change.

To his credit, MacAskill is less confident that currently popular values are in fact the best values. And his favored solution of more deliberation probably wouldn’t hurt. I just don’t think he realizes just how dangerous are central powers able to regulate to promote deliberation and limit competition. And he seems way too confident about the chance of anything like foom soon.


AGI Is Sacred

Sacred things are especially valuable, sharply distinguished, and idealized as having less decay, messiness, inhomogeneities, or internal conflicts. We are not to mix the sacred (S) with the non-sacred (NS), nor to trade S for NS. Thus S should not have clear measures or money prices, and we shouldn’t enforce rules that promote NS at S expense.

We are to desire S “for itself”, understand S intuitively not cognitively, and not choose S based on explicit calculation or analysis. We didn’t make S; S made us. We are to trust “priests” of S, give them more self-rule and job tenure, and their differences from us don’t count as “inequality”. Objects, spaces, and times can become S by association. (More)

When we treat something as sacred, we acquire the predictably extreme related expectations and values characteristic of our concept of “sacred”. This biases us in the usual case where such extremes are unreasonable. (To minimize such biases, try math as sacred.)

For example, most ancient societies had a great many gods, with widely varying abilities, features, and inclinations. And different societies had different gods. But while the ancients treated these gods as pretty sacred, Christians (and Jews) upped the ante. They “knew” from their God’s recorded actions that he was pretty long-lasting, powerful, and benevolent. But they moved way beyond those “facts” to draw more extreme, and thus more sacred, conclusions about their God.

For example, Christians came to focus on a single uniquely perfect God: eternal, all-powerful, all-good, omnipresent, all-knowing (even re the future), all-wise, never-changing, without origin, self-sufficient, spirit-not-matter, never lies nor betrays trust, and perfectly loving, beautiful, gracious, kind, and pretty much any other good feature you can name. The direction, if not always the magnitude, of these changes is well predicted by our sacredness concept.

It seems to me that we’ve seen a similar process recently regarding artificial intelligence. I recall that, decades ago, the idea that we could make artificial devices who could do many of the kinds of tasks that humans do, even if not quite as well, was pretty sacred. It inspired much reverence, and respect for its priests. But just as Christians upped the ante regarding God, many recently have upped the AI ante, focusing on an even more sacred variation on AI, namely AGI: artificial general intelligence.

The default AI scenario, the one that most straightforwardly projected past trends into the future, would go as follows. Many kinds of AI systems would specialize in many different tasks, each built and managed by different orgs. There’d also be a great many AI systems of each type, controlled by competing organizations, of roughly comparable cost-effectiveness.

Overall, the abilities of these AIs would improve at roughly steady rates, with rate variations similar to what we’ve seen over the last seventy years. Individual AI systems would be introduced, rise in influence for a time, and then decline in influence, as they rotted and became obsolete relative to rivals. AI systems wouldn’t work equally well with all other systems, but would instead have varying degrees of compatibility and integration.

The fraction of GDP paid for such systems would increase over time, and this would likely lead to econ growth rate increases, perhaps very large ones. Eventually many AI systems would reach human level on many tasks, but then continue to improve. Different kinds of system abilities would reach human level at different times. Even after this point, most all AI activity would be doing relatively narrow tasks.

The upped-ante version of AI, namely AGI, instead changes this scenario in the direction of making it more sacred. Compared to AI, AGI is idealized, sharply distinguished from other AI, and associated with extreme values. For example:

1) Few discussions of AGI distinguish different types of them. Instead, there is usually just one unspecialized type of AGI, assumed to be at least as good as humans at absolutely everything.

2) AGI is not a name (like “economy” or “nation”) for a diverse collection of tools run by different orgs, tools which can all in principle be combined, but not always easily. An AGI is instead seen as a highly integrated system, fully and flexibly able to apply any subset of its tools to any problem, without substantial barriers such as ownership conflicts, different representations, or incompatible standards.

3) An AGI is usually seen as a consistent and coherent ideal decision agent. For example, its beliefs are assumed all consistent with each other, fully updated on all its available info, and its actions are all part of a single coherent long-term plan. Humans greatly deviate from this ideal.

4) Unlike most human organizations, and many individual humans, AGIs are assumed to have no internal conflicts, where different parts work at cross purposes, struggling for control over the whole. Instead, AGIs can last forever maintaining completely reliable internal discipline.

5) Today virtually all known large software systems rot. That is, as they are changed to add features and adapt to outside changes, they gradually become harder to usefully modify, and are eventually discarded and replaced by new systems built from scratch. But an AGI is assumed to suffer no such rot. It can instead remain effective forever.

6) AGIs can change themselves internally without limit, and have sufficiently strong self-understanding to apply this ability usefully to all of their parts. This ability does not suffer from rot. Humans and human orgs are nothing like this.

7) AGIs are usually assumed to have a strong and sharp separation between a core “values” module and all their other parts. It is assumed that value tendencies are not in any way encoded into the other many complex and opaque modules of an AGI system. The values module can be made frozen and unchanging at no cost to performance, even in the long run, and in this way an AGI’s values can stay constant forever.

8) AGIs are often assumed to be very skilled, even perfect, at cooperating with each other. Some say that is because they can show each other their read-only values modules. In this case, AGI value modules are assumed to be small, simple, and standardized enough to be read and understood by other AGIs.

9) Many analyses assume there is only one AGI in existence, with all other humans and artificial systems at the time being vastly inferior. In fact this AGI is sometimes said to be more capable than the entire rest of the world put together. Some justify this by saying multiple AGIs cooperate so well as to be in effect a single AGI.

10) AGIs are often assumed to have unlimited powers of persuasion. They can convince humans, other AIs, and organizations of pretty much any claim, even claims that would seem to be strongly contrary to their interests, and even if those entities are initially quite wary and skeptical of the AGI, and have AI advisors.

11) AGIs are often assumed to have unlimited powers of deception. They could pretend to have one set of values but really have a completely different set of values, and completely fool the humans and orgs that developed them ever since they grew up from a “baby” AI, even when those had AI advisors. This superpower of deception apparently applies only to humans and their organizations, but not to other AGIs.

12) Many analyses assume a “foom” scenario wherein this single AGI in existence evolves very quickly, suddenly, and with little warning out of far less advanced AIs who were evolving far more slowly. This evolution is so fast as to prevent the use of trial and error to find and fix its problematic aspects.

13) The possible sudden appearance, in the not-near future, of such a unique powerful perfect creature, is seen by many as an event containing overwhelming value leverage, for good or ill. To many, trying to influence this event is our most important and praiseworthy action, and its priests are the most important people to revere.

I hope you can see how these AGI idealizations and values follow pretty naturally from our concept of the sacred. Just as that concept predicts the changes that religious folks seeking a more sacred God made to their God, it also predicts that AI fans seeking a more sacred AI would change it in these directions, toward this sort of version of AGI.

I’m rather skeptical that actual future AI systems, even distant future advanced ones, are well thought of as having this package of extreme idealized features. The default AI scenario I sketched above makes more sense to me.

Added 7a: In the above I’m listing assumptions commonly made about AGI in AI risk discussions, not applying a particular definition of AGI.
