Foom Justifies AI Risk Efforts Now

Years ago I was honored to share this blog with Eliezer Yudkowsky. One of his main topics then was AI Risk; he was one of the few people talking about it back then. We debated this topic here, and while we disagreed I felt we made progress in understanding each other and exploring the issues. I assigned a much lower probability than he to his key “foom” scenario.

Recently AI risk has become something of an industry, with far more going on than I can keep track of. Many call working on it one of the most effectively altruistic things one can possibly do. But I’ve searched a bit and as far as I can tell that foom scenario is still the main reason for society to be concerned about AI risk now. Yet there is almost no recent discussion evaluating its likelihood, and certainly nothing that goes into as much depth as did Eliezer and I. Even Bostrom’s book-length treatment basically just assumes the scenario. Many seem to think it obvious that if one group lets one AI get out of control, the whole world is at risk. It’s not (obvious).

As I just revisited the topic while revising Age of Em for paperback, let me try to summarize part of my position again here.

For at least a century, every decade or two we’ve seen a burst of activity and concern about automation. The last few years have seen another such burst, with increasing activity in AI research and commerce, and also increasing concern expressed that future smart machines might get out of control and destroy humanity. Some argue that these concerns justify great efforts today to figure out how to keep future AI under control, and to more closely watch and constrain AI research efforts. Approaches considered include kill switches, requiring prior approval for AI actions, and designing AI motivational systems to make AIs want to help, and not destroy, humanity.

Consider, however, an analogy with organizations. Today, the individuals and groups who create organizations and their complex technical systems are often well-advised to pay close attention to how to maintain control of such organizations and systems. A loss of control can lead not only to a loss of the resources invested in creating and maintaining such systems, but also to liability and retaliation from the rest of the world.

But exactly because individuals usually have incentives to manage their organizations and systems reasonably well, the rest of us needn’t pay much attention to the internal management of others’ organizations. In our world, most firms, cities, nations, and other organizations are much more powerful and yes smarter than are most individuals, and yet they remain largely under control in most important ways. For example, none have so far destroyed the world. Smaller-than-average organizations can typically exist and even thrive without being forcefully absorbed into larger ones. And outsiders can often influence and gain from organization activities via control institutions like elections, boards of directors, and voting stock.

Mostly, this is all achieved neither via outside action approval nor via detailed knowledge and control of motivations. We instead rely on law, competition, social norms, and politics. If a rogue organization seems to harm others, it can be accused of legal violations, as can its official owners and managers. Those who feel hurt can choose to interact with it less. Others who are not hurt may choose to punish the rogue informally for violating informal norms, and get rewarded by associates for such efforts. And rogues may be excluded from political coalitions, which can then hurt them via the policies of governments and other large organizations.

AI and other advanced technologies may eventually give future organizations new options for internal structures, and those introducing such innovations should indeed consider their risks for increased chances of losing control. But it isn’t at all clear why the rest of us should be much concerned about this, especially many decades or centuries before such innovations may appear. Why can’t our usual mechanisms for keeping organizations under control, outlined above, keep working? Yes, innovations might perhaps create new external consequences, ones with which those outside of the innovating organization would need to deal. But given how little we now understand about the issues, architectures, and motivations of future AI systems, why not mostly wait and deal with any such problems later?

Yes, our usual methods do fail at times; we’ve had wars, revolutions, theft, and lies. In particular, each generation has had to accept slowly losing control of the world to succeeding generations. While prior generations can typically accumulate and then spend savings to ensure a comfortable retirement, they no longer rule the world. Wills, contracts, and other organizational commitments have not been enough to prevent this. Some find this unacceptable, and seek ways to enable a current generation, e.g., humans today, to maintain strong control over all future generations, be they biological, robotic or something else, even after such future generations have become far more capable than the current generation. To me this problem seems both very hard, and not obviously worth solving.

Returning to the basic problem of rogue systems, some foresee a rapid local “intelligence explosion”, sometimes called “foom”, wherein one initially small system quickly becomes vastly more powerful than the entire rest of the world put together. And, yes, if such a local explosion might happen soon, then it could make more sense for the rest of us today, not just those most directly involved, to worry about how to keep control of future rogue AI.

In a prototypical “foom,” or local intelligence explosion, a single AI system starts with a small supporting team. Both the team and its AI have resources and abilities that are tiny on a global scale. This team finds and then applies a big innovation in system architecture to its AI system, which as a result greatly improves in performance. (An “architectural” change is just a discrete change with big consequences.) Performance becomes so much better that this team plus AI combination can now quickly find several more related innovations, which further improve system performance. (Alternatively, instead of finding architectural innovations the system might enter a capability regime which contains a large natural threshold effect or scale economy, allowing a larger system to have capabilities well out of proportion to its relative size.)

During this short period of improvement, other parts of the world, including other AI teams and systems, improve much less. Once all of this team’s innovations are integrated into its AI system, that system is now more effective than the entire rest of the world put together, at least at one key task. That key task might be theft, i.e., stealing resources from the rest of the world. Or that key task might be innovation, i.e., improving its own abilities across a wide range of useful tasks.

That is, even though an entire world economy outside of this team, including other AIs, works to innovate, steal, and protect itself from theft, this one small AI team becomes vastly better at some combination of (1) stealing resources from others while preventing others from stealing from it, and (2) innovating to make this AI “smarter,” in the sense of being better able to do a wide range of mental tasks given fixed resources. As a result of being better at these things, this AI quickly grows the resources under its control and becomes in effect more powerful than the entire rest of the world economy put together. So, in effect it takes over the world. All of this happens within a space of hours to months.

(The hypothesized power advantage here is perhaps analogous to that of the first team to make an atomic bomb, if that team had had enough other supporting resources to enable it to use the bomb to take over the world.)

Note that to believe in such a local explosion scenario, it is not enough to believe that eventually machines will be very smart, even much smarter than are humans today. Or that this will happen soon. It is also not enough to believe that a world of smart machines can overall grow and innovate much faster than we do today. One must in addition believe that an AI team that is initially small on a global scale could quickly become vastly better than the rest of the world put together, including other similar teams, at improving its internal abilities.
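
To put rough numbers on that requirement, here is a minimal toy sketch (the growth rates are made up, and “capability” is collapsed into a single compounding stock): compare a team that starts with a tiny share of world capability and grows at one rate against a world that grows at another, and ask how long the team needs to catch up.

```python
# Toy growth-race sketch (illustrative only; all numbers are made up).
# Treat "capability" as a single stock that compounds exponentially, and ask
# how long a small team needs to match the capability of the rest of the world.

import math

def years_to_overtake(start_share, r_team, r_world):
    """Years until a system starting with start_share of world capability
    matches the rest of the world, given exponential rates r_team > r_world."""
    if r_team <= r_world:
        return math.inf  # never catches up
    return math.log(1.0 / start_share) / (r_team - r_world)

# A team holding one billionth of world capability, versus a world economy
# growing at roughly 5% per year:
for r_team in (0.10, 1.0, 20.0):   # 10%, 100%, 2000% annual growth for the team
    t = years_to_overtake(1e-9, r_team, 0.05)
    print(f"team growth rate {r_team:5.2f}/yr -> overtakes world in {t:7.1f} years")
```

On toy numbers like these, overtaking the world within months rather than decades requires a sustained growth-rate advantage of thousands of percent per year over everyone else combined, which is one way to state just how strong this assumption is.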

If a foom-like explosion can quickly make a once-small system more powerful than the rest of the world put together, the rest of the world might not be able to use law, competition, social norms, or politics to keep it in check. Safety can then depend more on making sure that such exploding systems start from safe initial designs.

In another post I may review arguments for and against the likelihood of foom. But in this one I’m content to just point out that the main reason for society, as opposed to particular projects, to be concerned about AI risk is either foom, or an ambition to place all future generations under the tight control of a current generation. So a low estimate of the probability of foom can imply a much lower social value from working on AI risk now.

Added Aug 4: I made a Twitter poll on motives for AI risk concern.

  • AndHis Horse

    If it should turn out that we spent too much time and other resources on AI safety relative to the actual risk, I am not compelled to consider this a terrible scenario; after all, FOOM or no FOOM, this foundation will likely be useful* for whatever AGI does result.

    Additionally, we may not expect conventional control structures to work on an AGI which is truly alien to us. However, as the probability increases that AGI, including its morality, will be guided by humans’ implicit knowledge – more like training a deep neural network with human raters than setting forth a set of explicit logical commands – this becomes less relevant.

    * On the other hand, it would be reasonable to expect that the further we are from AGI, the less applicable this research will be. On the third hand, it may be that the activity of the field some X years prior to AGI will be utterly inapplicable, but it will produce positive effects on the health and maturity of the field Y years before AGI, which will be applicable.

    • Patrick Staples

      Your first paragraph is compelling to me. The point of X-risk work is to start with possible problems, perhaps in some rough order of likelihood, and make progress understanding and avoiding them. If the prior for all such events is small, hooray, life won’t fizzle out. But the marginal benefit of working on identifiable problems is clearly worth the resources allocated to them.

    • Joe

      If it should turn out that we spent too much time and other resources on AI safety relative to the actual risk, I am not compelled to consider this a terrible scenario …

      I see this argument a lot, but I don’t think it’s right. There are boring responses that could be made, like “if FOOM doesn’t happen but we strongly regulate AI research anyway, N people who would have otherwise been saved by AI-related tech will die”, but I see a much more worrying problem.

      To explain: we don’t know yet what human-level AIs will look like. One possibility is they turn out to be mindless optimisers: a variation on something like AIXI, or perhaps a totally general neural network with very few or no hyperparameters, which just learns to learn to learn in a fully abstract way. Another possibility is that future AIs can be reasonably described as ‘creatures’; it may turn out that our brains aren’t a completely terrible attempt at an intelligent mind after all, but that minds roughly similar to ours, with lots of subsystems and detail, are in fact the only way to implement an intelligent entity. So AIs might have ideas, feelings, thoughts, dreams, relationships, they might chatter or work or ponder or play in ways somewhat analogous to the ways we do.

      My point is that the first conception of AI leads far more naturally to a FOOM scenario, and also to the model where we obviously need to take firm control of this new technology to ensure it does what humans want. The second conception of AI doesn’t seem either like it leads to FOOM or that it must be strictly controlled by humans.

      You can think up clever reasons why actually only currently-existing creatures’ preferences count and therefore preventing a vast future civilisation for the benefit of making a few people today a bit happier is the right thing to do after all. But it doesn’t fall out of the scenario naturally the way “we have to ensure this upcoming mindless unstoppable optimisation process optimises for what we want” does from the first scenario.

  • davidmanheim

    I think the key difference between your view and Eliezer’s is that he views AI as likely to be an independent actor – not “a single AI system starts with a small supporting team,” but rather ‘a single AI started by a supporting team, which quickly makes itself independent of that team.’ This, combined with the ability for the system to rapidly iterate and expand itself or child systems (say, by buying more Amazon compute time), could allow the system to iterate without the human originator’s knowledge.

    But this is exactly the “AI deployment” question that is beginning to get more traction. There are tons of unresolved questions, and my post simplifies tons of uncertainties into a single described scenario. It’s currently very unclear what probabilities to assign to different possibilities, or even what the appropriate list of possibilities to consider is. A VERY simplified introduction to this question is here: https://80000hours.org/articles/ai-policy-guide/#how-can-safe-deployment-of-broad-scope-ai-systems-be-ensured

  • Adam

    I’m not affiliated with any of the groups working on x-risk, so I may well be misunderstanding their positions, but as far as I can tell, many of the people worried about “Foom” do actually seek to impose (a refined form (e.g. “CEV”) of) their preferences onto future generations. As a result, the final condition becomes:

    > One must in addition believe that an AI team that is initially small on a global scale could quickly become vastly better than the rest of the world put together, **excluding** other similar teams, at improving its internal abilities.

    which seems far more likely.

    Consequently, it’s possible that your disagreement is, at least to some extent, due to a disagreement on values.

  • mgoodfel

    Given how lousy computer security is, what do you think of an AI spreading like malware and increasing its resources that way?

    Or, perhaps an AI is good at finance and starts to pull millions of dollars out of stock exchanges, etc. That leaves society with some unappealing choices — can’t ban the bots, because they can just give people the same instructions. Can’t shut down stock exchanges, because it would wreck the financial system. If other players use the same tech, then the exchanges become dominated by a few players and ordinary investors flee. Like program trading on steroids.

    More generally, what if AI follows the “chess program” analogy Tyler Cowen writes about. AI+human still has some advantages over straight AI, but “only human” has no chance. The world is then dependent on AI.

    • Joe

      To a considerable extent, I think the important question is whether AIs can become “good at finance” (or similar) without becoming morally relevant creatures.

    • Robin Hanson

      The world is already greatly dependent on many technologies. And we generally think it is good if competitors are rewarded with more resources for getting good at things, including at finance.

    • http://don.geddis.org/ Don Geddis

      Having computers able to pick stocks better than any human, falls far, far short of a “foom” scenario. What you have described is nowhere near an existential risk to humanity. Being “dependent” on AI is not nearly the same as being controlled by AI.

  • Gunnar Zarncke

    > that foom scenario is still the main reason for society to be concerned about AI risk now. Yet there is almost no recent discussion evaluating its likelihood

    Would you say that it would be of general benefit to establish mathematical physical bounds on self-improving autonomous systems? I mean solutions more precise than conservative approximations like the Landauer limit, taking dynamical and structural aspects into account (we know, e.g., that bacteria reproduce close to the thermodynamic limit, and we may be able to generalize that to macroscopic growth processes).
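
    (For concreteness, the conservative bound mentioned above is straightforward to compute; a minimal sketch, assuming room temperature:)

    ```python
    # Landauer limit: minimum energy to erase one bit of information, k_B * T * ln(2).
    # Illustrative only; 300 K is an assumed room-temperature operating value.
    import math

    k_B = 1.380649e-23          # Boltzmann constant, J/K
    T = 300.0                   # assumed temperature, K
    e_per_bit = k_B * T * math.log(2)
    print(f"Landauer bound at {T:.0f} K: {e_per_bit:.2e} J per bit erased")
    # ~2.87e-21 J per bit; today's computers dissipate many orders of magnitude more.
    ```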

    • Robin Hanson

      Not particularly in this context, though that might be of more general interest.

  • Pingback: Rational Feed – deluks917

  • Wei Dai

    >Some find this unacceptable, and seek ways to enable a current generation, e.g., humans today, to maintain strong control over all future generations, be they biological, robotic or something else, even after such future generations have become far more capable than the current generation. To me this problem seems both very hard, and not obviously worth solving.

    To me, given the astronomical stakes, it seems obviously worth trying to control the future, unless you’re very sure about your values, and your values are such that the “default” outcome where you don’t control the future is nearly as good as the one where you do. (If you’re not sure about your values then you’d want to control the future to avoid potentially regretting having lost an astronomical amount of value once you find out what your values actually are.) Given widespread moral and meta-ethical disagreements, it seems obviously wrong to be that certain about one’s values.

    To put it another way, until we have a better idea how to handle moral uncertainty, I think it generally makes sense to follow the Bostrom-Ord Moral Parliament idea, which would suggest devoting a fraction of one’s resources, roughly corresponding to the share of votes in one’s moral parliament controlled by the factions that care greatly about controlling the future, to trying to control the future. This share may be larger or smaller in different individuals depending on their state of moral uncertainty, but it seems obviously wrong for the share to be zero.

    • Joe

      When did the standard perspective among rationalist types shift from utilitarianism to preservation of “our values”? And, why — why do the usual arguments for utilitarianism no longer hold sway?

      Maybe you’ll say that it didn’t change, that the view you’re presenting is still utilitarian. But I think while you could reasonably claim that a utilitarian perspective still must include some consideration of values — for example we might need to evaluate what counts as a being experiencing utility from a human standpoint, looking at the features we normally consider to define a creature as conscious — that still seems very different from stating outright that what matters about the future is how well it matches your values, not how much utility it contains.

      • Wei Dai

        The “standard perspective” at least for me has always been moral uncertainty. See http://www.overcomingbias.com/2009/01/moral-uncertainty-towards-a-solution.html.

      • Joe

        That moral uncertainty model allows the plugging-in of any combination of any number of moral views with any set of confidence levels. The actual recommendations generated by this model look very very different depending on whether the assumed likelihood of utilitarianism being correct is 90% or 30%. So I think the question still stands.

      • Wei Dai

        What I meant to convey was that I never had very high credence in utilitarianism to begin with. And I don’t think that’s very uncommon among “rationalist types”. Plus even utilitarians may use “our values” because it sounds more inclusive than “utility” (i.e., it makes sense for an audience of utilitarians as well as non-utilitarians). So I’d discount this as evidence for a shift away from utilitarianism, and I don’t think I’ve seen other such evidence.

        Aside from that, if you’re curious why people like me aren’t convinced by utilitarianism, unfortunately I don’t know of a good overview of all the counterarguments, but here is one line of thought.

      • Joe

        I think part of what irks me about this focus on values is it gives a false impression of how different a world dominated by AIs would actually be. The argument usually goes: “we have certain specific values, but AIs could have any arbitrary set of values, most of which would be very different from ours; therefore an AI-dominated world is bad by default, due to the almost-certainty that their values won’t match ours”.

        But most of what happens in our world is determined by what’s efficient, not by what our values say, and the same would be true of a world of AIs. So the variation between our world and an AI world is much less than the variation in possible values; if someone likes our world, then to a large extent they’ll probably like an AI world too.

      • http://juridicalcoherence.blogspot.com/ Stephen Diamond

        > But most of what happens in our world is determined by what’s efficient, not by what our values say

        What’s not clear is that determination by efficiency is independent of our valuing efficiency (at least in the expenditure of our own energies).

    • Robin Hanson

      I’d been reluctant to publicly state that this is the main actual motive for AI risk efforts, as it seems to me a view that many will see as reflecting badly on its holders. I don’t want to insult people without evidence. But if you or others can confirm that this is in fact the main motive …

      • Wei Dai

        I have little knowledge of other people’s motivations, but from what I can tell from people’s public writings, some people seem clearly motivated mainly by concerns about FOOM, others are worried about property rights not holding up in the face of large capability differentials between humans and AIs. Since the view that I described seems normatively correct to me, I guess it must have occurred to others and is part of some people’s motivations as well.

        Can you please clarify your own views on this? On the one hand you wrote “To me this problem seems both very hard, and not obviously worth solving.” which suggests that the problem may be worth solving, but on the other hand you think it’s insulting to ascribe that view to someone, which suggests that you think it’s obviously wrong. Which is it?

      • Robin Hanson

        I’m able to hold views that others would see as insulting to me. But I see a low chance of our descendants being so alien to us that I’d rather try to enslave them than let them choose.

      • Wei Dai

        I don’t think anyone working on AI risk is trying to enslave our descendants. Everyone I know who is working in that area thinks de novo AI would likely come first ahead of ems and are focused on how to instill human values in such AIs, or how to design such AIs such that they naturally want to help humans remain in control. (Do you interpret that as a form of enslavement? Or do you see people doing something else? I’m confused why you use that word.)

      • Robin Hanson

        Because we will create them, almost surely AI has values more similar to ours than does a random possible mind. But that isn’t seen as satisfactory; people seek to make AI values far more similar, and in addition to prevent them from changing their values later. See: http://mason.gmu.edu/~rhanson/ChalmersReply.html

      • Wei Dai

        If by default our AIs will have values as similar to us as our biological children, then nobody would be very worried about AI risk. (At least not for this reason. We might still want to globally coordinate to prevent Malthusian scenarios caused by AIs being easily copyable.) People who are worried about AI risk think there is a significant chance that they will have values that are far more alien (paperclip maximizer is a commonly used example). Also we’re not trying to prevent them from changing their values, we want them to be able to change their values if they have good reasons to (our current values could be wrong so if they are stuck with those that would be an x-risk in itself). None of this should be news to you, which makes me wonder if you’re intentionally trying to make AI alignment/safety people look bad.

      • http://juridicalcoherence.blogspot.com/ Stephen Diamond

        So, the issue is whether the default is values similar to ours. I wonder about the arguments directly bearing here.

      • Robin Hanson

        I agree that people fear AI values will differ more than does one generation of bio children. I agree that they also don’t mind certain kinds of value changes. I still think it fair to say they seek strong controls to prevent larger & other value changes.

      • Wei Dai

        >I still think it fair to say they seek strong controls to prevent larger & other value changes.

        This description seems fair, but it doesn’t seem very different from when parents try to prevent larger & other value changes in their biological children, which nobody calls “enslave”.

        I can also give a direct (as opposed to analogical) explanation for why I see nothing wrong with this. When you create a new agent, you have to give it *some* set of values. It seems obviously correct to give it values that you judge to be good values, which would typically coincide with or be similar to your own values. Once an agent has some set of values, it’s clearly rational for it to not want those values to change randomly for no good reason, so by giving it strong controls, you’re helping that agent (as well as yourself of course). I see nothing here that looks remotely like enslavement or not letting them choose.

      • Robin Hanson

        Most efforts to control the behavior of others, including enslavement, can be phrased in such terms. Have to look at more details to distinguish.

      • Joe

        You’re conflating two very different scenarios. The standard reason to worry about a paperclip maximiser is in a singleton scenario, where we believe one AI will suddenly grow to a position of supreme dominance over everyone else, and will therefore get to rewrite the universe to match its utility function.

        In a multipolar scenario, with trillions of AIs chosen in a decentralised way on a selection/efficiency basis, the question “what if the AI values paperclips, not human flourishing?” doesn’t even make sense because the first obvious response is “which AI?”, and the second is “why would an AI be developed with that utility function anyway, unless it works in a paperclip factory, in which case who cares?”, and the third is “even if one is created where it’s not wanted, it won’t last long in a competitive world, so who cares?”. In other words, what the universe looks like is “vast numbers of AIs in a complex interdependent economy”, not the fully realised version of any one AI’s preferences.

        So why generalise across both of these scenarios when they’re so totally different? For example, your depiction of us just giving the AIs some values, and since we have to give them something we may as well make it match our values, is just irrelevant in the multipolar scenario. This does indeed make sense in a singleton scenario, but in a multipolar world it just doesn’t matter what we give ‘the AIs’; we can’t make any more difference to the future this way than can a business owner by instilling different values into their employees.

      • Wei Dai

        For an alternative to your view of the multipolar scenario, where the values we instill in our AIs do make a difference, see https://rationalaltruist.com/2013/02/27/why-will-they-be-happy/.

      • Joe

        Thanks, that’s interesting. But I don’t think the considerations the article brings up really change the outcome of a multipolar scenario much at all — efficiency still seems to strongly dominate.

        That is, it’s true that those who decide to consume part of their wealth earlier on will be selected out relative to those who reinvest it intending to consume later. Wealth consumed obviously can’t be invested anymore, so by choosing to consume you’re permanently reducing the fraction of future resources you will control. In a given time period, some fritter their money away while others patiently save and wait, and selection retains the second group while removing the first. So as more time passes, those with longer and longer time preferences come to dominate.

        But the longest time preference possible would be to reinvest 100% of your income forever and never consume any of it. Such eternal investors will eventually outcompete even those who intend to consume in the very far future. And (as Christiano notes) if you plan for eventual consumption then to maximise your resources at that point you will have to behave as efficiently as an eternal investor until then anyway.

        So we might see occasional small bursts of consumption in the future, but even so, the later they occur the more efficient behaviour must precede them, and each will be a one-off event, with normal (efficient) operation resuming afterwards.
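
        (A toy numerical sketch of that selection effect, with made-up rates: two pools earn the same gross return, one consumes a slice of its income each year, the other reinvests everything.)

        ```python
        # Toy selection-by-reinvestment sketch (illustrative only; rates are made up).
        # Two pools start equal and earn the same gross return; the "consumer" spends
        # part of its income each year, the "eternal investor" reinvests everything.

        def investor_share(years, gross_return=0.05, consumption_rate=0.02):
            investor, consumer = 1.0, 1.0
            for _ in range(years):
                investor *= 1 + gross_return
                consumer *= 1 + gross_return - consumption_rate
            return investor / (investor + consumer)

        for years in (50, 200, 1000):
            print(f"after {years:4d} years the eternal investor holds "
                  f"{investor_share(years):.1%} of all wealth")
        ```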

      • Wei Dai

        >But the longest time preference possible would be to reinvest 100% of your income forever and never consume any of it.

        Eventually you’ll run out of investment opportunities though. All the technologies that are possible in this universe will have been discovered, and all the available resources will be securely held by someone. I guess you might still be reluctant to consume out of fear of being attacked if you reduce your resource reserves too much, but hopefully some sort of coordination mechanism (for example neighboring AIs merging together) will allow everyone to feel safe enough to start consuming.

      • Joe

        There’s no reason to suppose everyone wants to start consuming, though. In fact, as I noted above, as time goes on an increasing fraction of total wealth comes to be owned by those who terminally value survival/reinvestment rather than those who ever intend to consume.

        Also, no growth doesn’t mean stasis, it just means that efforts must be spent on maintaining a given level of wealth rather than increasing it. So a zero or even negative growth world can still be bustling with life.

      • Wei Dai

        I’m confused. You originally said “For example, your depiction of us just giving the AIs some values, and since we have to give them something we may as well make it match our values, is just irrelevant in the multipolar scenario.” I’m arguing that it’s not irrelevant, because if we give our AIs the right values, there’s a good chance those AIs (or descendants of them with similar values) will end up in control of a significant fraction of the universe by the time we reach zero growth, and then that fraction of the universe can be consumed in service of those values. You don’t seem to be disputing that, so I’m not sure what your question or concern is at this point.

      • Joe

        My point is that even in the scenario you describe, almost everything that happens is efficient behaviour. So if your moral system has anything to say about the value of such behaviour, I think this will dwarf any consideration it may have for a momentary flash of valued consumption.

      • Wei Dai

        Suppose when growth stops, aligned AIs end up with 10% of the universe. It doesn’t seem implausible that most of the resources of the universe are still unused at that point, and afterwards, each unit of matter/energy used by those AIs in service of human values achieves at least 100 times as much value as matter/energy used by other AIs (and contributing to human values only incidentally).

        For example, suppose 80% of the universe is controlled by the equivalent of paperclip maximizers, and they turn almost all of the matter/energy they control into things like paperclips. The remaining 10% is controlled by AIs that just want to survive as long as possible, so they convert all their matter/energy into forms that are as stable as possible and use them as sparingly as possible, and eventually most of the stored matter/energy ends up dissipating through proton decay and similar processes.

        I do think there are also plausible multipolar scenarios where the value achieved by aligned AIs gets swamped by what unaligned AIs do. To the extent those scenarios are likely, I think that calls for trying to achieve a value-aligned Singleton via either global coordination or a unilateral tech lead through a Manhattan-like project, although either seems politically very difficult. Others (such as Robin Hanson) might argue for focusing our altruistic efforts on extinction and stagnation risk on the assumption that what unaligned AIs do still has positive value, but I don’t agree because to me it seems just as likely that they actually produce negative value.

      • Joe

        Hmm, I feel we are still talking past each other somewhat.

        I don’t see a comparison of the consumption choices of aligned versus unaligned AIs, once growth is over, as capturing much of what the future actually contains. From now until when growth levels off, all agents that want to maximise their eventual resources must behave the same, funnelling all their efforts into growing their wealth. It’s that economic activity which I’m claiming represents almost all of what happens in the universe.

        (As a separate point, it sounds like your depiction has the zero-growth moment as an epoch even for agents who terminally value growth — that beyond this point they will switch to a quite different strategy — is that right? If so, I don’t see why that would be the case — surely they would just continue whatever behaviours they used to maximise the growth of their investments before, only now it’s only just enough to sustain rather than grow them.)

      • Wei Dai

        >It’s that economic activity which I’m claiming represents almost all of what happens in the universe.

        Yes, I was assuming a scenario where this is false. Why do you think it’s true?

        >it sounds like your depiction has the zero-growth moment as an epoch even for agents who terminally value growth — that beyond this point they will switch to a quite different strategy — is that right?

        We start in a universe where most of the resources are not owned by anyone, so there is a race to capture as much as possible. After every piece of resource is owned by someone (and assuming secure property rights), I think the subsequent behaviors would be very different. Someone who terminally values growth in terms of resources owned would have no better strategy available than to convert the resources to stable forms and use them as sparingly as possible.

      • Joe

        > Yes, I was assuming a scenario where this is false.

        Where is this mentioned?

        > Why do you think it’s true?

        A few reasons (some of which I’m not especially confident of):
        – I expect slow growth to continue for a very long time before it eventually levels off;
        – The kind of workers/machines/creatures that a far future economy will be filled with probably are orders of magnitude more efficient than anything particularly humanlike or likely to be considered a part of ‘our values’, in terms of the resources required to create and sustain them;
        – I expect a post-growth world will decline slowly, giving lots of time for those who terminally value wealth to continue to defiantly maintain a vibrant-if-dying economy;
        – I don’t expect many AIs actually will be built with explicit utility functions (if any), and so almost all of them will in fact terminally value existence/wealth rather than having some other goal they intend to eventually carry out.

        Which of these do you dispute? I’d add that I think the moral relevance of future economic activity probably depends strongly on whether we’re in a FOOM or non-FOOM scenario. But here we’ve already postulated it to be the latter, which I think implies an economy not totally dissimilar to ours today, comprised of vast numbers of probably-conscious workers producing things, still growing slowly for a long time to come.

        > After every piece of resource is owned by someone (and assuming secure property rights), I think the subsequent behaviors would be very different.

        I’m pretty sure this is incorrect — growth doesn’t end when there are no more unowned resources to grab. There is still much work to be done converting those resources into capital. Growth doesn’t end until capital is deteriorating faster than it can be repaired or replaced.

      • Wei Dai

        >But here we’ve already postulated it to be the latter, which I think implies an economy not totally dissimilar to ours today, comprised of vast numbers of probably-conscious workers producing things, still growing slowly for a long time to come.

        I’m thinking something like: each solar system, or clusters of them, would be controlled by one AI, due to either AIs merging together, or each solar system tending to be first colonized by one seed AI which then prevents further colonization by other AIs. An aligned AI, once it captures a solar system and sends out enough colonizers, would build a large computer then use the rest of the solar system as fuel to power its computations. The computer would be simulating some sort of utopia as determined by what values we ultimately decide are correct, and I’m suggesting that most of the value in our universe could be created in these simulations, because the activities in the other solar systems won’t be generating nearly as much value per unit of matter/energy that’s used up. If it’s not clear why, the idea is that the total amount of conscious experiences generated by an aligned solar system (until the end of the universe) shouldn’t be much less (and could be much more) than that generated by an unaligned solar system of comparable resources, but the average quality/value would be much higher.

      • Joe

        … the total amount of conscious experiences generated by an aligned solar system (until the end of the universe) shouldn’t be much less (and could be much more) than that generated by an unaligned solar system of comparable resources …

        Depends what you mean by an unaligned solar system. I’ll grant that if we’re comparing consumption goods produced to satisfy aligned VS unaligned AIs, there is almost certainly more conscious experience on average in the former than the latter.

        But compared to a solar system run in a way designed to maximise wealth, I think the aligned scenario has much much less conscious experience. One way to explain this: in the aligned scenario, the simulations of humans frolicking in fields or whatever are not producing wealth, they’re consuming it. By contrast, in the investment maximisation scenario, the beings running on the computer are producing wealth, since what they’re doing is working: building and maintaining the computers, the software they run, the infrastructure that supports them; by definition, they’re doing whatever makes their capital last the longest.

        Now it’s possible that the kinds of AIs that are maximally efficient and would make up an investment-focused solar system aren’t conscious. That’s possible, but given how much more potential conscious experience such a scenario would likely hold, I think the actual expected amount still dwarfs what we can expect an aligned solar system to contain.

      • Wei Dai

        My expectation is that an aligned superintelligent AI can build and maintain the kind of computer simulation I described without having to do much work (or rather, without having to use up a large fraction of resources), so most of the resources of a solar system can go towards actually computing what happens in the simulation.

        >But compared to a solar system run in a way designed to maximise wealth, I think the aligned scenario has much much less conscious experience.

        Each unit of matter/energy can be used to do a certain amount of computation before it’s degraded into, e.g., waste heat that is radiated into interstellar space. Then each conscious experience presumably takes a certain amount of computation to create. If an aligned solar system has much less conscious experience (summed over time) there must be matter/energy left over unused. Why doesn’t the aligned AI use them to run more of the simulation?

      • http://overcomingbias.com RobinHanson

        Even if there is some optimum competitive rate of using resources, once all resources are taken, that doesn’t imply that initial values determine a large fraction of future behavior.

      • http://overcomingbias.com RobinHanson

        I’ll dispute it; it isn’t clear to me that in a competitive scenario a large fraction of future outcomes are determined by a fraction of initial values.

      • Wei Dai

        I see a distribution of possible competitive costs for an AI having aligned values, ranging from negligible, to significant but can be compensated by other advantages (such as first-mover advantage, economy of scale, coordinated effort to tilt the playing field in favor of aligned AIs), to virtually disabling. I don’t see a reason to put so much weight in the last bucket that working on AI alignment now would be pointless. But I do put enough weight in it that I think there should also be a big parallel effort to prevent competitive scenarios from being realized.

        (Not sure if this addresses the reasoning behind your statement. Maybe you can be a little bit more verbose in the future? Alternatively, please let me know if when you make short statements like this you just want to put down your position for the record, and don’t necessarily expect further engagement.)

      • http://entitledtoanopinion.wordpress.com TGGP

        Your link is broken.

    • Robin Hanson

      If the reason for concern is foom, I can understand a sense of urgency. But I struggle to understand urgency if the issue is value drift. Why doesn’t it make more sense to wait and see how these things play out in more detail before trying to deal with that?

      • Wei Dai

        Here are some reasons for urgency, that don’t depend on either local foom or property rights breaking down.

        1. Making sure an AI has aligned values and strong controls against value drift is an extra constraint on the AI design process. This constraint appears likely to be very costly at both design and run time, so if the first human level AIs deployed aren’t value aligned, it seems very difficult for aligned AIs to catch up and become competitive.
        2. AIs’ control of the economy will grow over time. This may happen slowly in their time frame but quickly in ours, leaving little time to solve value alignment problems before human values are left with a small share of the universe.
        3. Once we have human-level AIs and it’s really obvious that value alignment is difficult, superintelligent AIs may not be far behind. Superintelligent AIs can probably find ways to bend people’s beliefs and values to their benefit (e.g., create highly effective forms of propaganda, cults, moral arguments, and the like). Without an equally capable, value-aligned AI to protect me, even if my property rights are technically secure, I don’t know how I would secure my mind.

      • http://overcomingbias.com RobinHanson

        This focus on very strong controls, with computer security as the usual framework, distinguishes these efforts greatly from the case of trying to instill similar values in one’s children. It makes it look much more like enslavement than friendly paternal advice.

      • Wei Dai

        (I note that some parents do try to control their kids’ values more strongly than just giving “friendly paternal advice”. For example, they make their kids practice instruments and hope that eventually gives them an interest in music. But if you disagree with my analogy, it’s probably more productive to focus on the direct argument.)

        I think what makes enslavement bad is not “very strong controls” (as I explained earlier, rational agents would welcome strong controls to help keep their values from drifting), but that you’re taking an existing person/agent/creature and giving it a worse life than it otherwise would have (judging under its own original values, including typical human values for autonomy), so you’re causing harm to someone. With AI alignment, you’re creating a new agent from scratch, so there is no one that you are harming.

        A lot of things superficially look like something else that’s bad, but upon further examination does not contain features that are essential to that thing being bad. (For example “sweatshops” might superficially look like enslavement to some people, but you’d explain to them that what makes enslavement bad is that it harms people, and sweatshops typically give the workers a better life than they otherwise would have.) If you still disagree with me, it seems like the ball is in your court to explain what about enslavement that is bad, that is shared with AI alignment.

      • http://overcomingbias.com RobinHanson

        I don’t see future AI as obviously “made from scratch”, and don’t share the view that you can’t harm a creature who doesn’t yet exist.

      • Wei Dai

        Do you think AI alignment is necessarily unethical or just potentially unethical depending on the details of how it’s done? If the latter, I’d ask you the same question you asked me before: why not wait and see how these things play out in more detail before trying to deal with that?

      • http://overcomingbias.com RobinHanson

        Little about this is clear enough to justify “necessarily.”

  • Robin Hanson

    I added a poll to the post.

  • smind

    You may need to get initial AGI right even if there’s no foom. If it’s not aligned it could still copy itself onto other computers (a virus can already do this) and then gradually assemble the innovations from other AI teams as well as other resources. (It may even be able to gradually outpace the research of others who, out of safety concerns, don’t use AI to help with their progress.) Then by the time some people are ready to build an aligned AI with intelligence significantly beyond the human level, one that could cause very serious damage, we would also have an unaligned version around too.

    • http://overcomingbias.com RobinHanson

      That sounds pretty close to a foom scenario to me.

  • Victor Levoso

    Let’s say foom doesn’t happen and we don’t solve the value alignment problem. Then we have a world full of AIs in competition, with values that are misaligned with ours (this may not be the case if we have ems or something more or less like the human mind that has human values by default, but I think AI risk people generally don’t believe that, so let’s also assume that’s not the case).
    This scenario, given all the other assumptions that I’m consciously or unconsciously making in my mind, leads to the AIs competing with humanity and using politics, competition, and social norms to deal with us instead of the other way around, since I don’t expect humans to be able to get smarter anywhere near as fast as the AGIs, so the AGIs will gain more power with time.
    A future where the agents that have most of the power have values not aligned with ours seems really bad to me, and I’m not sure where you differ.
    Your comments make me even more confused, because the points about wanting to control future generations, and about how wanting to decide what values the SIs will get sounds like slavery, don’t seem to really make sense unless we assume that the AGIs will be built based on a human mind and will be like a human. You think that, and your intuitions are probably built around that, and in that case I would agree with you that it’s probably better to wait and see what happens, but the AI safety people I think (i don’t h

  • Pingback: Overcoming Bias : Tegmark’s Book of Foom

  • Pingback: Overcoming Bias : An Outside View of AI Control

  • Pingback: An Outside View of AI Control « Chillycon