I've never claimed that no one should worry about the foom scenario.

How confident are you that something like the "foom" scenario for superintelligent AI won't occur? 90%? 99%? When the stakes are very high, even a small probability is worth worrying about.

The issue is to categorize the types of risk so as to prioritize efforts. The optimal types of efforts for the scenarios I'm describing seem different than for foom.

I'm having trouble seeing that difference. It seems like the default assumption ought to be that humans face a nontrivial risk of extinction in this scenario. It's unclear why you seem unconcerned by that risk.

I'm not saying we can all pay to get everything we want. But there is a list of things that we can pay to get today, and I'm saying we should still be able to pay to get those things in a world full of AI.

For the long run, there's a big difference between fearing that your wealth, or your wealth fraction, will decline because you are not a competitive producer, and fearing extermination.

I'm puzzled by this claim:

> It isn’t that you can’t reliably use these systems to do the things that you want. You just have to pay more.

I thought that half the point of your paper "He Who Pays The Piper Must Know The Tune" was that people fail to get what they say they want from doctors, newspapers, professors, etc., and that simply paying more won't get them what they asked for (health, knowledge, etc.). And that one solution to that problem is value alignment ("experts caring directly about being honest or about client topics").

I understand how you can believe humans will get more of what they want for a while in a world dominated by unaligned AI. But I don't see how long-term human prospects in such a world are any better than the prospects of modern hunter-gatherers.

If democracy can work today to control law enforcement, why can't it work for AI based law enforcement?

The developed world has found a political equilibrium where democracy can mostly control law enforcement, but there's no robust understanding of what makes this possible. Attempts to export this equilibrium tend to fail, and kleptocracy is the historical norm.

Not sure I agree with your "mostly" and "sliver" claims.

I'm comfortable claiming that future competent software will be complex in design.

How do you think we address the control problem of law enforcement today?

That's what this post is about: http://www.overcomingbias.c...

Even if AIs feel bound to follow our legal and economic system (which would itself require a great deal of AI control theory and FAI research to get right, but let's assume that it does), that's still not necessarily a good scenario for us. If AIs become better at working the system than we are, if they become better at finding loopholes in the law, better at investing, better at amassing wealth, better at figuring out how to manipulate the political system, etc., then humans could basically lose control, so long as humans are also bound by those same rules. We could be talking about general AIs that are generally smarter than us, or about more narrow AIs that are individually better than us in one specific field (law, investing, banking, lobbying, etc.); either way, we would be out-competed, and increasingly powerless in terms of working through the system.

This is helpful. So this future is built mostly from unaligned AIs which mostly do their own thing rather than anything of value, but the dynamic by which the future is built ensures that, taken together, the future is still valuable and probably doesn't exterminate humans. Lack of foom is important in this dynamic, in that any AI disasters are not global catastrophes and get paved over by the rest of the world, just as happens today with rogue institutions and countries. Large agency costs mean that AIs remain unaligned, even though the world of AIs robustly maintains a sliver of alignment. Not being under humans' control, the world as a whole doesn't converge on being efficiently aligned with humans; instead humans get a small fraction of global resources and unaligned AIs get most of them. Still, human values prosper relative to the pre-AI world.

In these terms, one worry is that the opportunity cost is astronomical compared to a world of aligned AIs that doesn't have significant agency costs. So current work on AI alignment is valuable for reducing this opportunity cost, by making the AIs more aligned, but that work would be motivated by agency costs anyway, throughout the process of losing control over the world as a whole. Unfortunately, the process seems irreversible: most of the whole future is given over to agency costs and can't be recaptured even when we eventually figure out alignment.

Another problem is that this world is a lot different from today and humans are not crucial to its maintenance, so there is a risk that one of the details that changes is that somehow all human value/influence is eventually gone, despite presently available convincing arguments that this won't be the case.

In the context of foom, the usual AI concern is a total loss of control of the one super AI, whose goals quickly drift to a random point in the space of possible goals. Humans are then robustly exterminated. As the AI is so smart and inscrutable, any small loss of control is said to open the door to such extreme failure. So when you tell me that I shouldn’t focus so much on foom, as many are similarly concerned about non-foom scenarios, I presume that the focus remains on this sort of failure.

Today most social systems suffer from agency costs, and larger costs (in % terms) in larger systems. But these mostly take the form of modestly increasing costs. It isn’t that you can’t reliably use these systems to do the things that you want. You just have to pay more. That extra cost mostly isn’t a transfer accumulating in someone’s account. Instead there is just waste that goes to no one, and there are more cushy jobs and roles where people can comfortably sit as parasites. Over time, even though agency costs take a bigger cut, total costs get lower and humans get more of what they want.
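As a toy illustration of how both trends can hold at once (the numbers below are invented purely for illustration, not drawn from the post): the agency-cost share of spending can grow even while the total cost of getting what you want falls, so long as underlying production costs fall faster.

```python
# Toy numbers, purely illustrative: the agency-cost share rises over time,
# yet the total cost of getting what you want still falls, because the
# underlying production cost falls faster.

def total_cost(base_cost, agency_share):
    """Total cost = production cost plus the agency 'cut' layered on top."""
    return base_cost * (1 + agency_share)

today = total_cost(base_cost=100.0, agency_share=0.10)  # 110.0
later = total_cost(base_cost=40.0, agency_share=0.30)   # 52.0

print(today, later)  # total cost fell even though the parasitic share tripled
```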

When I say that in my prototypical non-foom AI scenario, AI will still pay agency costs but the AI control problem is mostly manageable, I mean that very competent future systems will suffer from waste and parasites as do current systems, but that humans can still reliably use such systems to get what they want. Not only are humans not exterminated, they get more than before of what they want.

Paul, what's the mechanism by which you expect problems of the kind Robin mentions - agency costs, design entrenchment, etc. - to not be an issue with systems that don't involve humans? Is it that you think there are coordination-enabling features that can be implemented to solve these problems even in big complex systems, and biological evolution just didn't happen to stumble across them? Or is it that you think these issues are inherent and unavoidable as systems grow larger and more detailed (as Robin claims), but you just think human-level AI will be pretty simple in design, not large and complex enough to suffer from these issues? Or is it something else?

Robin, would you say this is an accurate characterisation of your position? That is, you expect advancing AI to accrue problems like agency costs precisely because you expect that for it to reach human-equivalent capability it will need layers upon layers of intricate detail; and so if it turned out that human-level AI will be barely more complicated than today's AI software, it would then be more plausible to talk about controlling it and its descendants by directly programming it to have the right values?

Perhaps we could test these competing models by looking at the extent to which systems with varying levels of human involvement suffer from the problems mentioned, after controlling for system complexity. Viable?
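One way to make that test concrete is sketched below; everything in it is hypothetical (the variable names, the stand-in data, and the linear functional form are placeholders, not a real dataset or an endorsed design). The idea is to measure some agency-cost proxy across many systems and regress it on the degree of human involvement while controlling for system complexity.

```python
# Hypothetical sketch of the proposed test. The data here are random stand-ins
# so the example runs end to end; a real test would use measured values.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
complexity = rng.exponential(scale=1.0, size=n)        # e.g. log lines of code, org size
human_involvement = rng.uniform(0.0, 1.0, size=n)      # fraction of decisions made by humans
agency_cost = rng.normal(0.0, 1.0, size=n)             # placeholder for a measured proxy

# Regress the agency-cost proxy on human involvement, controlling for complexity.
X = sm.add_constant(np.column_stack([human_involvement, complexity]))
result = sm.OLS(agency_cost, X).fit()
print(result.summary())  # the coefficient on human_involvement is the quantity of interest
```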
