Date: 2025-09-16

Today Yudkowsky & Soares published their book If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All. I spent the day reading it.

Their core arguments (my paraphrase):

Knowing that a mind was evolved by natural selection, or by training on data, tells you little about what it will want outside of that selection or training context. For example, it would have been very hard to predict that humans would like ice cream, sucralose, or sex with contraception. Or that peacocks would like giant colorful tails. Analogously, training an AI doesn’t let you predict what it will want long after it is trained. Thus we can’t predict what the AIs we start today will want later when they are far more powerful, and able to kill us. To achieve most things they could want, they will kill us. QED. Also, mind states that feel happy and joyous, or embody value in any way, are quite rare, and so quite unlikely to result from any given selection or training process. Thus future AIs will embody little value.

These arguments seem to me to prove way too much, as their structure applies to any changed descendants, not just AIs: any descendants who change from how we are today due to something like training or natural selection won’t be happy or joyous, or embody value, and they’ll kill any other creatures less powerful than they.

Let us break future creatures into any two categories of a small us vs a big them. As we can’t predict what they will want later, and they will be much bigger than us later, we can predict that they will kill us later. Thus we must prevent any changed big future them from existing. Except, as neither we nor they are happy or joyous later, who cares?

Some I’ve talked to accept my summary above, but say that the difference with AI is that it might change faster than would other descendants. But I’m not sure why I should care about clock time, relative to the rates of events experienced by key creatures. Others say that humans are just much less pliable in their desires than are AIs, but I see much less difference there; human culture makes us quite pliable.

We can reasonably doubt the extreme claim here that one can predict nothing at all from knowing of prior selection or training experience. And also the strong claim that all influence must happen early, after which all influence is lost. There could instead be a long period where we could react to and reward or punish varying behavior.

Some relevant quotes: