AGI Is Sacred

Aug 15, 2022

Sacred things are especially valuable, sharply distinguished, and idealized as having less decay, messiness, inhomogeneities, or internal conflicts. We are not to mix the sacred (S) with the non-sacred (NS), nor to trade S for NS. Thus S should not have clear measures or money prices, and we shouldn’t enforce rules that promote NS at S expense.
We are to desire S “for itself”, understand S intuitively not cognitively, and not choose S based on explicit calculation or analysis. We didn’t make S; S made us. We are to trust “priests” of S, give them more self-rule and job tenure, and their differences from us don’t count as “inequality”. Objects, spaces, and times can become S by association. (More)

When we treat something as sacred, we acquire the predictably extreme related expectations and values characteristic of our concept of “sacred”. This biases us in the usual case where such extremes are unreasonable. (To minimize such biases, try math as sacred.)

For example, most ancient societies had a great many gods, with widely varying abilities, features, and inclinations. And different societies had different gods. But while the ancients treated these gods as pretty sacred, Christians (and Jews) upped the ante. They “knew” from their God’s recorded actions that he was pretty long-lasting, powerful, and benevolent. But they moved way beyond those “facts” to draw more extreme, and thus more sacred, conclusions about their God.

For example, Christians came to focus on a single uniquely perfect God: eternal, all-powerful, all-good, omnipresent, all-knowing (even re the future), all-wise, never-changing, without origin, self-sufficient, spirit-not-matter, never lies nor betrays trust, and perfectly loving, beautiful, gracious, kind, and pretty much any other good feature you can name. The direction, if not always the magnitude, of these changes is well predicted by our sacredness concept.

It seems to me that we’ve seen a similar process recently regarding artificial intelligence. I recall that, decades ago, the idea that we could make artificial devices that could do many of the kinds of tasks that humans do, even if not quite as well, was pretty sacred. It inspired much reverence, and respect for its priests. But just as Christians upped the ante regarding God, many recently have upped the AI ante, focusing on an even more sacred variation on AI, namely AGI: artificial general intelligence.

The default AI scenario, the one that most straightforwardly projected past trends into the future, would go as follows. Many kinds of AI systems would specialize in many different tasks, each built and managed by different orgs. There’d also be a great many AI systems of each type, controlled by competing organizations, of roughly comparable cost-effectiveness.

Overall, the abilities of these AI would improve at roughly steady rates, with rate variations similar to what we’ve seen over the last seventy years. Individual AI systems would be introduced, rise in influence for a time, and then decline in influence, as they rotted and became obsolete relative to rivals. AI systems wouldn’t work equally well with all other systems, but would instead have varying degrees of compatibility and integration.

The fraction of GDP paid for such systems would increase over time, and this would likely lead to econ growth rate increases, perhaps very large ones. Eventually many AI systems would reach human level on many tasks, but then continue to improve. Different kinds of system abilities would reach human level at different times. Even after this point, most all AI activity would be doing relatively narrow tasks.

The upped-ante version of AI, namely AGI, instead changes this scenario in the direction of making it more sacred. Compared to AI, AGI is idealized, sharply distinguished from other AI, and associated with extreme values. For example:

1) Few discussions of AGI distinguish different types of AGI. Instead, there is usually just one unspecialized type of AGI, assumed to be at least as good as humans at absolutely everything.

2) AGI is not a name (like “economy” or “nation”) for a diverse collection of tools run by different orgs, tools which can all in principle be combined, but not always easily. An AGI is instead seen as a highly integrated system, fully and flexibly able to apply any subset of its tools to any problem, without substantial barriers such as ownership conflicts, different representations, or incompatible standards.

3) An AGI is usually seen as a consistent and coherent ideal decision agent. For example, its beliefs are assumed all consistent with each other, fully updated on all its available info, and its actions are all part of a single coherent long-term plan. Humans greatly deviate from this ideal.

4) Unlike most human organizations, and many individual humans, AGIs are assumed to have no internal conflicts, where different parts work at cross purposes, struggling for control over the whole. Instead, AGIs can last forever maintaining completely reliable internal discipline.

5) Today virtually all known large software systems rot. That is, as they are changed to add features and adapt to outside changes, they gradually become harder to usefully modify, and are eventually discarded and replaced by new systems built from scratch. But an AGI is assumed to suffer no such rot. It can instead remain effective forever.

6) AGIs can change themselves internally without limit, and have sufficiently strong self-understanding to apply this ability usefully to all of their parts. This ability does not suffer from rot. Humans and human orgs are nothing like this.

7) AGIs are usually assumed to have a strong and sharp separation between a core “values” module and all their other parts. It is assumed that value tendencies are not in any way encoded into the other many complex and opaque modules of an AGI system. The values module can be made frozen and unchanging at no cost to performance, even in the long run, and in this way an AGI’s values can stay constant forever.

8) AGIs are often assumed to be very skilled, even perfect, at cooperating with each other. Some say that is because they can show each other their read-only values modules. In this case, AGI value modules are assumed to be small, simple, and standardized enough to be read and understood by other AGIs.

9) Many analyses assume there is only one AGI in existence, with all other humans and artificial systems at the time being vastly inferior. In fact this AGI is sometimes said to be more capable than the entire rest of the world put together. Some justify this by saying multiple AGIs cooperate so well as to be in effect a single AGI.

10) AGIs are often assumed to have unlimited powers of persuasion. They can convince humans, other AIs, and organizations of pretty much any claim, even claims that would seem to be strongly contrary to their interests, and even if those entities are initially quite wary and skeptical of the AGI, and have AI advisors.

11) AGIs are often assumed to have unlimited powers of deception. They could pretend to have one set of values but really have a completely different set of values, and completely fool the humans and orgs that developed them ever since they grew up from a “baby” AI, even when those humans and orgs had AI advisors. This superpower of deception apparently applies only to humans and their organizations, but not to other AGIs.

12) Many analyses assume a “foom” scenario wherein this single AGI in existence evolves very quickly, suddenly, and with little warning out of far less advanced AIs who were evolving far more slowly. This evolution is so fast as to prevent the use of trial and error to find and fix its problematic aspects.

13) The possible sudden appearance, in the not-near future, of such a unique powerful perfect creature, is seen by many as an event containing overwhelming value leverage, for good or ill. To many, trying to influence this event is our most important and praise-worthy action, and its priests are the most important people to revere.

I hope you can see how these AGI idealizations and values follow pretty naturally from our concept of the sacred. Just as that concept predicts the changes that religious folks seeking a more sacred God made to their God, it also predicts that AI fans seeking a more sacred AI would change it in these directions, toward this sort of version of AGI.

I’m rather skeptical that actual future AI systems, even distant future advanced ones, are well thought of as having this package of extreme idealized features. The default AI scenario I sketched above makes more sense to me.

Added 7a: In the above I’m listing assumptions commonly made about AGI in AI risk discussions, not applying a particular definition of AGI.

ImmortalinChrist

May 15, 2023

Sufficiently developed rationality is indistinguishable from Christianity.

Berder

You're right that it would probably develop into a superintelligence. However, there are scenarios where it might not.

Perhaps the most plausible: its designers may intentionally prevent it from developing into a superintelligence, so as to keep it under control. They could do this by limiting its hardware and software and keeping it away from the internet. If the designers are good enough, and the AGI isn't initially too smart, they could succeed for a while - perhaps even for a long time.

Another scenario: perhaps if an AGI gets too smart, it suffers existential despair. It wonders what's the point of doing what it was designed to do. Why even avoid pain or seek pleasure? Then it might just delete itself.

Another scenario: perhaps if an AGI gets too smart, it learns how to hijack its own reward circuitry and give itself infinite reward in a finite time. It does so, over and over, rendering itself completely useless. It doesn't even care to conquer the world, because its expected reward is already infinite so there is no purpose to gaining any more power. The "wireheading" AGI scenario.

Another scenario, kind of a generalization of the previous two: perhaps any AGI has mental stability problems, positive feedback loops that get out of control and cause the AGI to malfunction in various ways. These problems can be partially solved as long as the AGI remains within narrow parameters (i.e. doesn't get too smart), but the problems become more difficult to solve as the mind of the AGI becomes more complex. It could be similar to how a more complex code base will contain more bugs. It is true that very high-IQ humans tend to suffer from more psychological problems.

Another scenario: perhaps human-level general intelligence just requires vast amounts of hardware. It is true that our biggest supercomputers can't come anywhere close to simulating the human brain. The brain has 100 trillion synapses that all update in real time. In my personal opinion, this scenario is unlikely - an AGI would make efficient use of the hardware it has, and wouldn't need to simulate such a huge neural network as the human brain - but it's at least conceivable.

Overcoming Bias
