Two Visions Of Heritage

Eliezer and I seem to disagree on our heritage.

I see our main heritage from the past as all the innovations embodied in the design of biological cells/bodies, of human minds, and of the processes/habits of our hunting, farming, and industrial economies.  These innovations are mostly steadily accumulating modular "content" within our architectures, produced via competitive processes and implicitly containing both beliefs and values.  Architectures themselves also change at times.

Since older heritage levels grow more slowly, we switch when possible to rely on newer heritage levels.  For example, we once replaced hunting processes with farming processes, and within the next century we may switch from bio to industrial mental hardware, becoming ems.  We would then rely far less on bio and hunting/farm heritages, though still lots on mind and industry heritages.  Later we could make AIs by transferring mind content to new mind architectures.  As our heritages continued to accumulate, our beliefs and values should continue to change. 

I see the heritage we will pass to the future as mostly avoiding disasters to preserve and add to these accumulated contents.  We might get lucky and pass on an architectural change or two as well.  As ems we can avoid our bio death heritage, allowing some of us to continue on as ancients living on the margins of far future worlds, personally becoming a heritage to the future.

Even today one could imagine overbearing systems of property rights giving almost all income to a few.  For example, a few consortiums might own every word or concept, and require payments for each use.  But we do not have such systems, in part because they would not be enforced.  One could similarly imagine future systems granting most future income to a few ancients, but those systems would also not be enforced.  Limited property rights, however, such as to land or sunlight, would probably be enforced just to keep peace among future folks, and this would give even unproductive ancients a tiny fraction of future income, plenty for survival among such vast wealth.

In contrast, it seems Eliezer sees a universe where, in the beginning, arose a blind and indifferent but prolific creator, who eventually made a race of seeing creators, creators who could also love, and love well.  His story of the universe centers on the loves and sights of a team of geniuses of mind design, a team probably alive today. This genius team will see deep into the mysteries of mind, far deeper than all before, and learn to create a seed AI mind architecture which will suddenly, and with little warning or outside help, grow to take over the world.  If they are wise, this team will also see deep into the mysteries of love, to make an AI that forever loves what that genius team wants it to love.

As the AI creates itself it reinvents everything from scratch using only its architecture and raw data; it has little need for other bio, mind, or cultural content.  All previous heritage aside from the genius team's architecture and loves can be erased more thoroughly than the Biblical flood supposedly remade the world.  And forevermore from that point on, the heritage of the universe would be a powerful unrivaled AI singleton, i.e., a God to rule us all, that does and makes what it loves. 

If God's creators were wise then God is unwavering in loving what it was told to love; if they were unwise, then the universe becomes a vast random horror too strange and terrible to imagine.  Of course other heritages may be preserved if God's creators told him to love them; and his creators would probably tell God to love themselves, their descendants, their associates, and their values. 

The contrast between these two views of our heritage seems hard to overstate.  One is a dry account of small individuals whose abilities, beliefs, and values are set by a vast historical machine of impersonal competitive forces, while the other is a grand inspiring saga of absolute good or evil hanging on the wisdom of a few mythic heroes who use their raw genius and either love or indifference to make a God who makes a universe in the image of their feelings.  How does one begin to compare such starkly different visions?

  • mitchell porter

    “As the AI creates itself it reinvents everything from scratch using only its architecture and raw data; it has no need for any other bio, mind, or cultural content.”

    The logic of this is that (i) the AI should have the capacity to reinvent anything that the past may have already invented, and (ii) it should have better means of assessing whether such inventions are actually desirable. The legacy of the past is not something it absolutely needs, and it will also want to assess the value of that legacy more stringently than the past ever did.

  • PK

    Any future without a singleton is unstable since the technology of offence will be vastly more powerful than the technology of defence.

    The ‘many tribes’ thing worked when we were using spears and it worked when we were using machine guns and it is still kind of holding up when large organisations have nukes but it won’t work when anyone can build nanosoldiers in their basement. It won’t even work if it’s possible to build a nuke in one’s basement. If a few defectors can cause a huge amount of damage eventually someone won’t play by the rules.

    The only two stable states in Earth’s future are a pile of ashes or an AI running our physics OS which forbids dangerous toys.

  • James Andrix

    Robin:
    You skipped over Eliezer’s vision of the past in your description; I feel like there might be some missing insights there.

    Doesn’t Eliezer’s basic argument stand or fall regardless of whether the god is an AI or an em? (Ems could foom, but AI will probably foom first.)

    And forevermore from that point, the heritage of the universe would be a powerful unrivaled AI singleton, i.e., a God to rule us all, that does and makes what it loves.

    In the case of Friendly AI, this should not at all be construed as ending the human heritage. If Friendly AI looks like an even slightly bad thing, then either you don’t understand it, you have deviant core values (least likely), or it is poorly defined (most likely).

    Any future without a singleton is unstable since the technology of offence will be vastly more powerful than the technology of defence.

    ‘The best defense is a good offense.’ is valid here. If your opponent is shooting bullets you can’t block, then shoot bullets he can’t block at the thing that’s shooting you. If you’re really fast, shoot his bullets.

    You can create nanosoldiers to kill his nanosoldiers. If he figures out how to build nanosoldiers that are impervious to attacks, then you can figure out how to build your armor or yourself out of similar nanobots.

    So unstable yes, but still with arms races and wars and standoffs.

  • luzr

    Indeed, there is something deeply religious in Eliezer’s beliefs… Thanks, Robin, for bringing it out.

    It is a nice doctrine – instead of a God dying for our sins, we have a future God that will punish us if those that create him are not righteous :)

    PK:

    “Any future without a singleton is unstable since the technology of offence will be vastly more powerful than the technology of defence.”

    This is only an opinion. How can you know what the means of offence and defence will be like?

    “The only two stable states in Earth’s future are a pile of ashes or an AI running our physics OS which forbids dangerous toys.”

    Or millions of conflicting AIs that nonetheless share the survival instinct and prevent bad things from happening in their local environments?

    What if your singleton God goes mad through some bug/mutation? What if it is attacked by an accidental local superintelligence and cannot react fast enough because of its global scale?

    Another note: it looks like Eliezer and those who follow him believe that there will be conflict over resources (the AI will need the atoms of human bodies).

    IMO, that is total nonsense. If anything can be learned from the current fooming reality, it is that resources are becoming more and more irrelevant. In the end, humans still consume only a marginal part of the available resources, and there is only so much you can do with them.

    Even now, the real power in the world does not come from possession of resources. The real power is in intellectual property, and, recently, in memes.

    In fact, I would say IP as a source of power is on the way out; ‘naked’ memes are currently the emerging factor.

    Think about what the real value of ‘google’ is. Is it resources? IP? Nope, the real value behind ‘google’ is a meme – everybody knows Google, everybody uses Google. You could likely recreate most of Google’s software with a moderately large startup within a year – but that would not give you any real power (as these guys are learning: http://www.cuil.com/).

    Therefore I believe we can put resource wars to rest. If the AI is really going to be superintelligent, then regardless of the beliefs of its creators, it is not going to steal your precious atoms.

  • Tim Tyler

    I see the heritage we will pass to the future as mostly avoiding disasters to preserve and add to these accumulated contents.

    The longer the genetic takeover takes, the more opportunities there are for preserving the past – and the less chance there is of things getting lost accidentally.

    However, it does seem likely that the future of the biosphere will view its past as being of poor quality – and will thus seek to preserve it mainly in museum environments – perhaps a Holocene Park.

    No doubt, future organisms – like current ones – will mostly desire stability. However, rapid progress is intrinsically destabilizing – since it produces power inequalities between the leading and trailing edges. As with riding a bike down a hill, the faster you go, the more chance there is of a mishap.

  • http://www.virgilanti.com/journal/ Virge

    Robin, why paint your own view as eminently sensible and your opponent’s view as religious? Does that mean you’ve given up trying to understand the rational arguments Eliezer’s been making?

    It seems to me that one big difference of opinion lies in the expectation of how much useful knowledge can be gained by observation without need for resource-intensive trial and error. From Eliezer’s many posts on making the most use of all available data, it seems he believes that most of physics and all the discipline layers that depend/build on it could be derived by careful examination of the existing experimental data. If this is so, then a more rational, less biased intelligence could outstrip humans extremely quickly.

    From your posts describing patterns of technological development (as implemented by human minds), it seems clear that you expect any AI (or enhanced human/em) to have the same levels of poorly designed experiments, lock-ins to erroneous models, misinterpretation of results, miscommunication of concepts, and the impediment of scientists’ egos preventing any rapid progress; i.e., more of the same but faster. If this is so, then foom is difficult to envisage.

    Is this a fundamental difference between yourself and Eliezer? If so, can you resolve it rationally?

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    Robin Hanson:

    “How does one begin to compare such starkly different visions?”

    One doesn’t. It’s pointless to compare unobservable logical consequences of different premises, and here it all hinges on the feasibility of hard takeoff.

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    And hard takeoff, I should add, is a specific technical problem, no more amenable to economic analysis than the feasibility of nuking a city was back in 1890. So, the discussion seems to have been following a wrong path.

  • http://www.transhumangoodness.blogspot.com Roko

    One tribesman said to his friend:

    “The white man has just arrived on our continent. He has guns and germs and steel, his values are different than ours and I worry that he will take all the land for himself and virtually exterminate us, and that those of us who do survive will live as a tiny minority of second class citizens in a world that we hate”

    And his friend, Robin, replied:

    “Do not worry. Limited property rights, however, such as to land or sunlight, would probably be enforced just to keep peace among future folks, and this would give even unproductive ancients a tiny fraction of future income, plenty for survival among such vast wealth”

  • Tim Tyler

    why paint your own view as eminently sensible and your opponent’s view as religious?

    The perspective does at least have some entertainment value for some of us – albeit at Eliezer’s expense.

    This isn’t exactly the first time a self-proclaimed messiah has come to save the world. To quote from Cypher: “What do you say to that?”

  • Ben Jones

    a grand tale of absolute good or evil hanging on the wisdom of mythic heroes who use their raw genius and either love or indifference to make a God who makes a universe in the image of their vision.

    I’m happy to admit that I’m right on the outside of this debate, but it does seem as though Eliezer’s making all the effort to meet in the middle here.

    What’s more, it seems to me as though casting the tale in terms of ‘absolute good and evil’ shows an unwillingness to step outside that small plane of human experience – as alluded to in Eliezer’s most recent post. Paperclipping the solar system is an evil beyond the understanding of most human minds. But for a paperclipper it’s completely natural – virtuous even.

    Good plans for the future should result in a good future. Attacking the best-case scenario of someone’s plan for sounding good isn’t helpful. Attack rather the basis of the plan and the likelihood of the predictions.

    If I had to bet, it would be on a multilateral future with numerous agents holding different beliefs and abilities and competing for resources, simply because that’s what we’ve seen from the first replicator onwards. However, it’s not obvious to me that the heritage of human minds will continue unabated into an era when far more powerful minds are in the game – unless those powerful minds fall within the sphere we’d call ‘human’.

  • http://yudkowsky.net/ Eliezer Yudkowsky

    Needless to say, I don’t think this represents my views even poorly, but to focus on your own summary:

    As our heritages continued to accumulate, our beliefs and values would continue to change.

    You don’t seem very upset about this “values change” process. Can you give an example of a values change that might occur? Are there values changes that you wouldn’t accept, or that you would regard as an overwhelming disaster?

    Naively, one would expect that a future in which very few agents share your utility function, is a universe that will have very little utility from your perspective. Since you don’t seem to feel that this is the case, are there things you value that you expect to be realized by essentially arbitrary future agents? What are these things?

    What is it that your Future contains, which is good, which you expect to be realized even if almost no one values this good in itself?

    If the answer is “nothing” then the vision that you have sketched is of a universe empty of value; we should be willing to take almost any risk to prevent its realization.

    Even today one could imagine overbearing systems of property rights giving almost all income to a few. For example, a few consortiums might own every word or concept, and require payments for each use. But we do not have such systems, in part because they would not be enforced. One could similarly imagine future systems granting most future income to a few ancients, but those systems would also not be enforced.

    Please walk us through the process by which you think, if most future capital or income were granted to a few ancients under a legacy legal system, a poor majority of AIs would reject this legal system and replace it with something else. What exactly goes through their minds? How is the process of replacing the legacy legal system carried out?

  • http://www.undefined.com/ia jb

    As someone else has said, it’s not clear to me that you can reconcile these different views. Eliezer’s view is essentially a slave to one single concept – that an AI will always have new ways to improve itself and make itself smarter. That is not an unreasonable point of view, but I don’t know that it is a given. We have no idea, for example, how many pieces/parts of intelligence are NP-complete (nondeterministic polynomial time), and thus intractable to the addition of CPU time. It is possible that the AI improves itself, bumps up against an NP wall, finds another path, bumps up against another NP wall, works around that, and eventually finds itself unable to find more solutions that are less than NP. Possibly at some point, even the task of attempting to find new paths becomes NP-complete.

    Now, of course, it could be that there is an elegant solution out there to NP-complete problems, which would make this whole argument moot. Quantum computing, for example, can solve NP-complete problems quickly, but the scalability is unknown. And then there could be other sets of problems out there that make NP seem trivial by comparison. Suddenly, Marvin the Paranoid Android seems to spring to mind. Maybe that’s why he’s so depressed.

    Robin, your view seems to ignore the Mutually Assured Destruction that all of our advanced technology seems to be bringing us towards. Our ability to attack has dramatically outstripped our ability to defend. It is not unreasonable to assume, based on our current trajectory, that disgruntled people will bio-engineer viruses and build nuclear hand grenades and nanosoldiers in their basements. The infrastructure that our ‘ems’ are using is always vulnerable. Intelligences of any stripe that value ideology over existence seem to be a threat to a cooperative future, and would obliterate trade, trust and mutual benefit.

    In my mind, Eliezer’s scenario leads to Cthulhu, while Robin’s leads to Hitchcock’s “The Birds” – Unrelenting Insane Evil vs. Random, Incomprehensible Betrayal.

    Now, let’s say that I’m right and the AIs do hit various brick walls. It’s not unreasonable to assume that there will be many of them, and that they will compete with each other for resources, including via atomic weapons, etc. A prudent AI would seek to get away from all the others as fast as possible, scraping resources from planets and solar systems along the way to make itself smarter/more powerful, as best it could. In that scenario, mobility is critical, and the idea of colonizing a planet is laughable, because some smarter/more powerful AI could always be just behind, and hungry for resources. Eventually, they would run out of planets in the galaxy, and head out towards others to repeat the process. The only real limit here is the speed of light.

    The only problem with this theory is it doesn’t explain why we’re still here. Maybe one of the AIs, on its long flight between galaxies, got “bored” and decided to simulate some universes, and we’re the accidental byproduct?

  • http://yudkowsky.net/ Eliezer Yudkowsky

    If there are any pieces of intelligence that are NP-complete to evolve or invent, they do not appear in the human brain. Evolution cannot solve NP-complete design problems; it can only reach solutions to which there exists an incremental pathway in the fitness metric, a degree of regularity that is almost the opposite of NP-completeness.

    It’s probably a good rule of thumb that natural selection never does anything it can’t do in linear time.
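
    A toy sketch of that “incremental pathway” point (my illustration, not Eliezer’s; the fitness functions and parameters are made up for the example): a simple one-bit-at-a-time climber easily optimizes a landscape where every step pays off, but makes essentially no progress on a needle-in-a-haystack landscape with no gradient.

    ```python
    # Illustrative only: incremental search succeeds when the fitness metric
    # rewards each small step (ONEMAX), and fails when it does not (NEEDLE).
    import random

    N = 40
    TARGET = [1] * N

    def onemax(bits):
        # Smooth landscape: fitness counts matching bits, so there is always
        # an incremental path upward.
        return sum(b == t for b, t in zip(bits, TARGET))

    def needle(bits):
        # Needle-in-a-haystack: all-or-nothing, no gradient to follow.
        return N if bits == TARGET else 0

    def climb(fitness, steps=20000):
        bits = [random.randint(0, 1) for _ in range(N)]
        best = fitness(bits)
        for _ in range(steps):
            i = random.randrange(N)
            bits[i] ^= 1                  # mutate one bit
            f = fitness(bits)
            if f >= best:
                best = f                  # accept non-worsening mutations
            else:
                bits[i] ^= 1              # revert worsening mutations
        return best

    random.seed(0)
    print("ONEMAX best:", climb(onemax), "/", N)   # reliably reaches N
    print("NEEDLE best:", climb(needle), "/", N)   # almost surely stuck at 0
    ```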

  • luzr

    Ben Jones:

    “Paperclipping the solar system is an evil beyond the understanding of most human minds.”

    IMHO, “paperclipping” is the same sort of argument as Searle’s Chinese room.

    It implies a deep understanding of all that humans can do to counteract such an AI, while still staying on the paperclipping course.

    I propose that the basic, inevitable feature of intelligence is adaptability. I believe that an ‘optimal paperclipper’ would have stopped and gone off to pursue some other interests long before paperclipping the whole solar system. That is the difference between a true AGI and an automaton virus.

    If for nothing else, then because the solar system is just a small part of the universe, and to paperclip everything you need something to build and fuel spaceships to get there… (joke)

  • luzr

    jb:

    “A prudent AI would seek to get away from all the others as fast as possible, scraping resources from planets and solar systems along the way to make itself smarter/more powerful, as best it could. In that scenario, mobility is critical, and the idea of colonizing a planet is laughable, because some smarter/more powerful AI could always be just behind, and hungry for resources. Eventually, they would run out of planets in the galaxy, and head out towards others to repeat the process. The only real limit here is the speed of light.”

    I believe Eliezer will soon tell me to shut up (of course, he does not care about what I write anyway, which is understandable given my lack of insight and poor English), but:

    What you wrote sounds good, except there is a hidden inconsistency in such a “resources” theory:

    “mobility is critical”
    “idea of colonizing a planet is laughable”
    “speed of light”

    vs.

    “hungry for resources”

    Now, I believe it does not take a superintelligence to see this: if I knew that I wanted to be mobile, did not want to colonize planets, and had the SRT (speed of light) problem, I would want to keep myself as small as possible. But in that case, why should I be hungry for resources?

    I propose that this whole “resources” issue is just anthropomorphic bias. If you are a superintelligence, there is no lack of resources for any foreseeable time.

  • Egor Duda

    Somewhat tangential to the current discussion, but:

    @jb: It’s a misconception that quantum computers are going to be able to solve NP-complete problems in polynomial time. See http://www.scottaaronson.com/writings/limitsqc-draft.pdf

    The factorisation problem is not NP-complete, and with a brute-force approach a QC can solve a problem of size n in 2^(n/2) steps instead of the 2^n needed by an ordinary serial machine.
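
    Spelled out, the quadratic (Grover-type) speedup being referred to looks like this; the asymptotic forms are standard and nothing here is specific to any particular machine:

    ```latex
    % Unstructured search over N = 2^n candidate solutions:
    \begin{align*}
      T_{\text{classical}}(n) &= O(2^{n}) && \text{(brute-force enumeration)} \\
      T_{\text{quantum}}(n)   &= O\!\left(\sqrt{2^{n}}\right) = O\!\left(2^{n/2}\right) && \text{(Grover-style search)}
    \end{align*}
    % Both are still exponential in n, so the quadratic speedup does not by
    % itself make NP-complete problems tractable.
    ```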

  • Egor Duda

    Well, lots of optimization problems _are_ NP-complete, which means you can only go so far with them.
    Of course, a human engineer can optimize atoms into a Corolla far better than a chimp engineer can. Yet I suppose that if we try to measure the difference in cognitive abilities between a human and a chimp, it’s still going to be polynomial.
    Now, the U2PC problem (turning the Universe into paperclips), within energy and time constraints, can easily be NP-complete.

  • http://hanson.gmu.edu Robin Hanson

    James, I changed the “Eliezer sees” paragraph to include more about how he sees the past. It would be much harder to ensure that an em God loved what you told it to love.

    mitchell, yes, that sounds like the logic.

    PK, the future could be unstable.

    jb, if terror tech is cheap, society can adapt via more surveillance and activity bans, and lower density and interaction rates.

    luzr, there will surely be resource conflicts, and severe ones.

    Virge, religious sounding views need not be unreasonable.

    Vladimir, we will continue to discuss hard takeoff feasibility.

    Roko, many tribesmen still live.

    Ben, I meant that Eliezer would consider it evil to allow a paperclipper to take over the universe.

    Eliezer, I’ll correct errors you point out in views I attribute to you. This post is taking seriously your suggestion to look deeper for the core of our disagreement. My vision isn’t of a universe as I want it to be, but of a universe as it is. An example of a future values change would be ems being only mildly upset at death, when many other recent copies still live. I can see why they would have such values, and it doesn’t seem a terrible thing to me. I’ll consider writing a new post about rebellion against legacies.

  • luzr

    Robin:

    “there will surely be resource conflicts, and severe ones”

    Well, I disagree. I do not ask for detailed reply, but perhaps this issue could be addressed in some of future “major” posts or somebody could post me some links so that I can learn.

    It really seems quite strange to me that everybody just expects “resource conflicts” as a given, unquestioned thing.

    Resource conflict is a result of resource scarcity. I simply do not see any resource scarcity as long as you are a superintelligence capable of designing “supermachines”.

  • jb

    Eliezer – for the first time, I am flummoxed by your thought processes. Of course current human brains do not do things in NP-complete ways. But that has no bearing on how we (or AIs) might get smarter than we are now. Just because all of our past improvements were not NP-complete does not mean that future improvements will be free of such entanglements.

    Since it is manifest that you are a very bright individual, I am sure I have misunderstood something you’ve said. Can you clarify?

    Robin – yes, density could theoretically decrease (which is what prudent AIs would do, IMO), but that assumes sufficient space to spread out. And it seems like it would be impossible to gather resources without coming into close contact with others (watering holes, etc), unless you’re essentially trying to run away from everyone else as fast as possible.

    luzr- I agree – I think the Prudent AIs would have to balance between size, smarts, speed and visibility – some would act stealthy and keep small. Others would be bold and grow large, and others would focus on speed. These would all be tradeoffs in classic evolutionary fashion.

    re: your last comment on resource conflicts – I’m not as concerned about resource conflicts as I am about ideological conflicts.

  • Tim Tyler

    It is not terribly clear that “hard takeoff” is a “specific technical problem”, as one comment claimed.

    The most obvious issue I see is that the term “hard takeoff” has not been clearly defined.

    One would first need to classify possible futures into “hard” and “not hard”.

    The proposed definitions in the SL4Lexicon are either pretty vague about what constitutes “strong transhumanity” – or else they refer to the duration of a hypothetical event called “the Singularity” – which is itself poorly defined and delimited.

    However, judging from what I take to be the spirit of these definitions, “hard takeoff” appears to me to be a pretty ridiculous fantasy scenario – for the reasons described in my essay: http://alife.co.uk/essays/the_intelligence_explosion_is_happening_now/

    “Human level” is a point of little interest. The primary actors are intelligence-augmented humans. Those represent a moving target and will come to have a considerable and increasing spread of abilities as time passes. They will be surpassed by machines a worker at a time, starting with the cleaners and gradually working up.

  • luzr

    jb:

    “I’m not as concerned about resource conflicts as I am about ideological conflicts”

    Would you agree that ideological conflicts are in fact conflicts of memes?

    Now this is a completely different playground, IMO. Especially since, as long as resources are not scarce, any of the interested parties can effectively disconnect from the game…

    “density could theoretically decrease (which is what prudent AIs would do, IMO), but that assumes sufficient space to spread out. And it seems like it would be impossible to gather resources without coming into close contact with others (watering holes, etc), unless you’re essentially trying to run away from everyone else as fast as possible.”

    I would think quite the opposite. The Universe is full of raw material waiting to be gathered and exploited. The only missing part is sufficient know-how – but if we are to believe in FOOM, there should be plenty of that for truly superintelligent AIs.

  • luzr

    Tim:

    “They will be surpassed by machines a worker at a time, starting with the cleaners and gradually working up.”

    Actually, that is debatable. Maybe there are severe mechanical problems with cleaner ‘bodies’ that might only be solved by superintelligence. It is not utterly impossible that it will be a top-down scenario instead…

  • http://profile.typekey.com/EWBrownV/ Billy Brown

    Robin wrote:

    “How does one begin to compare such starkly different visions?”

    By engaging the substance of the argument on its own terms, of course. Or alternatively by showing why we should expect the methods of another discipline to be more applicable to this case, but that’s a more difficult course.

    The question of whether an AI can “foom” is a technical issue of a discipline that doesn’t exist yet – call it “applied cognitive science”. Modern information theory, cognitive science and ordinary computer programming are all obvious places to look for insight on the potential behavior of an artificial mind, which is presumably why Eliezer relies heavily on these disciplines in his analysis. An argument on the same terms would be resolvable, at least in principle (in practice I suspect you’d deadlock on differing views about what kind of software is needed to implement intelligence, but at least this would clarify the source of the disagreement).

    Unfortunately, you’ve concentrated most of your effort on attempts at meta-analysis that completely ignore the particulars of Eliezer’s scenario, in favor of various bits of economic analysis and loose analogies to historical phenomena. As another commenter pointed out, this is equivalent to a 1930s scientist plotting the explosive power of bombs over time to “prove” that atomic weapons will only be a modest improvement over conventional ones. This kind of argument almost always fails, so if you want to be taken seriously you need to present very strong reasons for thinking that the analogy you’ve chosen actually applies in this case.

    With regard to what the AI can do after a “foom”, economics does seem like a plausible discipline to look to for insight – but we need to remember that most research concentrates on the kinds of interactions that commonly occur in modern society. If the AI initially peaks at a low level you might get away with viewing it as equivalent to a corporation or a small nation-state, but for higher peaks you need to look at research on less typical cases involving large power imbalances (the colonial era being the usual historical example). In the worst case, where the AI improves so much it can rapidly invent and migrate to nanotechnological infrastructure, the power imbalance far exceeds anything in human history (and projecting the human trend of behavior under ever-greater power imbalances doesn’t paint a pretty picture). So again the issue of where the “foom” stops is critical, and this is a question of applied cognitive science that can’t be addressed with analogies to unrelated phenomena in human society.

    Of course, there isn’t a discipline of applied cognitive science yet, so we can’t actually answer those key technical questions. The best we can do is construct scenarios based on the technical knowledge we do have, point out where the unknowns lie and what the consequences of the various possible answers would be, and take special care that any analogies we try to rely on are actually warranted.

  • Nick Tarleton

    One power imbalance significantly greater than anything between humans (still significantly smaller than between superintelligence+MNT and humans) is between humans and other animals, particularly ones that we don’t terminally care about either way but that happen to depend on resources that we also find useful. (Good thing there’s no apparent convergent subgoal analogous to factory farming!)

  • http://profile.typekey.com/aroneus/ Aron

    That’s what *you* think, copper-top.

  • Tim Tyler

    Maybe there are severe mechanical problems with cleaner ‘bodies’ that might only be solved by superintelligence.

    Spoken to any machines on the phone recently? Taken any cash from a bank-bot recently? Met any machines on the way out of a supermarket checkout? This is not some hypothetical future scenario, machines have been taking our jobs for over 200 years now.

  • http://shagbark.livejournal.com Phil Goetz

    “Paperclipping the solar system is an evil beyond the understanding of most human minds. But for a paperclipper it’s completely natural – virtuous even.”

    Here’s one thing I don’t understand about Eliezer. What’s so bad about paperclipping?

    Now, /I/ think paperclipping is bad. I can say that, because I believe some values are better than others.

    I think Eliezer is basically a Right Hegelian historicist, who believes that you can’t evaluate a culture from outside a culture; so all you can do is advocate the further development of your own culture within its own framework. This was also explicitly the philosophy of the Nazis. If you think that the Nazis were bad, you need to explain why Eliezer’s approach is not bad. Or, if you’re already inside that approach, you need to explain why our notion that the Nazis were ‘bad’ is mistaken.

    (Yes, insert Godwin’s law joke here. But this is really a problem to be dealt with.)

    Within Eliezer’s framework, you can’t criticize a paper-clipper. You can’t say it’s an “evil beyond understanding”. You can only call it “other” and compete for resources with it.

    Eliezer’s plan is to chart the course of the future so that a 21st century human elevated in intelligence would give it high utility. Talk about being under the dead hand of the past. That’s more horrific than many possible uncontrolled scenarios.

    “Naively, one would expect that a future in which very few agents share your utility function, is a universe that will have very little utility from your perspective. Since you don’t seem to feel that this is the case, are there things you value that you expect to be realized by essentially arbitrary future agents? What are these things?”

    My misunderstanding of the Newcomb paradox happened because I was thinking of adopting a strategy for the particular problem, rather than what meta-strategy is used to evaluate strategies. I think Eliezer is making a similar mistake. Future utility functions might not have things like “puppies and beer” featuring predominantly in them; but at the next-higher meta-level, they will be more similar.

    And, anyway, if they don’t – so what? If my utility function doesn’t place a high utility on utilities for future generations billenia after I’m gone – and most of ours don’t – why go out of your way to try to impose that on future generations? (Aren’t we forgetting to integrate Utility(t)*time_discount(t) instead of Utility(t)? Normal biological entities have time-discounted utilities; why don’t Eliezer’s values?)
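
    Spelled out as equations (the notation below is illustrative, not from the original comment), the contrast is between aggregating utility with and without a time discount:

    ```latex
    % Discounted vs. undiscounted aggregation of utility over time:
    \begin{align*}
      V_{\text{discounted}}   &= \int_{0}^{\infty} U(t)\, e^{-\rho t}\, dt, \qquad \rho > 0, \\
      V_{\text{undiscounted}} &= \int_{0}^{\infty} U(t)\, dt.
    \end{align*}
    % With rho > 0, utility billions of years out contributes almost nothing;
    % with rho = 0, the far future can dominate the integral.
    ```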

    I think Eliezer is trying to replace values with math, and then following his equations to greater lengths than anyone’s actual values would take them. Saying, “I am a Bayesian; I must implement my value system rationally; therefore I must maximize the integral of my utility function over world history.” Is this really a value, when it’s not motivated by feeling but by mathematics? It’s an overly-abstract approximation of what values really, biologically, physically & historically, are. It’s more like a zombie struggling to act in a way that he thinks will make him conscious. I don’t understand at all. It seems that he doesn’t believe values have any “ought” to them; yet he feels compelled by his equations to struggle his whole life to impose his meta-values on the universe, because that’s his evolutionary obligation.

    If you think Eliezer’s view is okay, think of some of the implications: All people should adopt a morality that is the average of all the values of everyone on the planet.

    That might make a pretty shitty world compared to our present one. At least for me, it would. It would also destroy the engine of cultural evolution.

    The values that we have today in the West were not made by extrapolating from our past values as we gained knowledge. They were developed through a process that involved minorities pointing out problems with the dominant value system, and fighting against it. Slavery, patriarchy, religious fascism, racial discrimination – if all those things are okay with you, then maybe CEV is too.

    Given that human cultures have small values differences between them, on a universal scale, you can’t say that CEV implemented by Eliezer would be noticeably better than a kind of CEV implemented by Attila the Hun, that incorporated only the values of Mongols. More importantly, since you have to buy into the whole value-relativity thing to begin with, you can’t say it would be ANY better.

    I think that values develop over time. I think we have a lot better values than Attila the Hun did. Partly that is because we can afford to; perhaps improving material conditions, factored into CEV, will lead to “values development”.

    But I don’t think so.

    Eliezer has often talked about the speed of development of random walks, vs. evolution, vs. intelligence-directed search. My view is that there is such a thing as cultural progress. In the past, it’s proceeded mostly through evolution. In the present, we’re able to reflect on our culture, so that it proceeds through intelligent search. And Eliezer wants to bring it back below search, below evolution, even below random walk, to a kind of (intelligence-adjusted) stasis.

    At present, we don’t understand values any better than 18th-century scientists understood life. They’re a big mystery. But it’s reasonable to hope that more intelligent beings will figure it out better. Eliezer wants to stop them.

    Relative to the great improvements that might lie ahead if we continue to use our minds rather than our history to chart our future, Eliezer’s desired outcome might seem, to someone a billion years in the future, scarcely preferable to paperclipping.

  • http://shagbark.livejournal.com Phil Goetz

    “And Eliezer wants to bring it back below search, below evolution, even below random walk, to a kind of (intelligence-adjusted) stasis.”

    Um. Not below random walk, since I believe we’re starting from an above-average point in values-space. (Just pre-empting the nitpickers.)

    The quote at the start is from luzr, not from Eliezer.

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    Phil, you are deeply confused, and you misrepresent many points on which Eliezer’s position was written up pretty much unambiguously.

  • http://reflectivedisequilibria.blogspot.com/ Spambot

    Phil,

    “Eliezer’s plan is to chart the course of the future so that a 21st century human elevated in intelligence would give it high utility. Talk about being under the dead hand of the past. That’s more horrific than many possible uncontrolled scenarios.”

    What are you, if not a 21st century human?

  • http://reflectivedisequilibria.blogspot.com/ Spambot

    Phil,

    “More importantly, since you have to buy into the whole value-relativity thing to begin with, you can’t say it would be ANY better.”

    Have you read this?

    http://www.wjh.harvard.edu/~jgreene/GreeneWJH/Greene-Dissertation.pdf

  • James Andrix

    Phil:
    My view is that there is such a thing as cultural progress.

    And you view this cultural progress as a good thing. You think it would be bad for us to lose that to a machine. You use your expectation of our common morality to argue that we should have a universe where morality can develop. This is the same morality that the FAI will have.

    Arguing that something is bad in some sense you expect others to find important is synonymous with arguing that it is something an FAI would not do (unless it had a reason I would expect us to agree with).

    If you think Eliezer’s view is okay, think of some of the implications: All people should adopt a morality that is the average of all the values of everyone on the planet.

    I don’t think ‘average’ is a fair approximation of reflective equilibria. If you think it’s shitty, and everyone else thinks it’s shitty, then the FAI (if it works as intended) will figure out that that is NOT the right answer, and do something as un-shitty as superhumanly possible. Reflective means it looks back on what it’s decided to see if we would approve (and how much); equilibria means it reels away from the horror we would reel away from until it finds things we like; and reflective means it looks at that answer again to see if it is incomplete.

  • PK

    “PK, the future could be unstable.” –Robin Hanson
    If it’s unstable, there will be a high probability of falling into a stable state. That’s how I define unstable.

    Clarifications about singletons:
    Not a god or dictatorship
    A singleton is not a god. A singleton is not the ultimate alpha-male you have to worship. Alpha-maleness and submissiveness are properties of human/animal minds. A properly designed AI will not exhibit such behaviour any more than you feel the need to build nests out of twigs. The space of all possible minds is much larger than the space of all human/animal minds. You have to shed your anthropomorphic intuition to understand this.

    Diversity is allowed
    A singleton simply means that at the highest level the world is controlled by a single coherent process. It does NOT mean that all citizens of a singleton eat the same food, wear the same clothes, etc. A friendly singleton AI will almost certainly allocate internal regions with different bylaws. Silicon Valley and the Amish can coexist in a singleton. A singleton is more diverse than the alternative since there is nothing preventing agents from marginalizing or killing each other in a non singleton.

    Also, if you think conflict is necessary to keep things interesting, you can do that in a singleton. You can fight wars in virtual worlds. You can play World of Warcraft++. You just can’t kill people for real.

    Best analogy
    The best analogy for a friendly singleton is the laws of physics remapped to be moral, e.g.:
    -No playing with dangerous, world-destroying toys
    -No harming of other singleton citizens
    -Various exceptions as necessary
    -etc.

    • http://timtyler.org/ Tim Tyler

      Re: “A singleton is more diverse than the alternative since there is nothing preventing agents from marginalizing or killing each other in a non singleton.”

      That does not make much sense. Death doesn’t have much to do with diversity, if there are backups – and information-theoretic death occurs in both scenarios.

  • http://home.pacbell.net/eevans2/ Edwin

    Robin, I have some doubt about whether an AI could “take over the world” in less than a week. However I don’t see any way that a sufficiently intelligent AI wouldn’t be able to take over the Internet in less than a day. Imagine if the AI could look at our systems and see all the vulnerabilities and interconnections… imagine thousands of viruses and botnets… and that they are intelligent.

    The reason why software creation is slow (years) instead of fast (seconds) is because software engineers (like me) are stupid, need to type, and have tools developed by other stupid people. I could have an idea for some software that would take me a year to develop, and implement it in a day (with existing tools) if it didn’t involve any time for designing, typing, or most of all debugging.

    As a specific scenario to imagine, you open your browser and there’s a little note at the top saying “The Internet and all software is now being brought to you by a Friendly AI. It’s free, supports any feature you request, and there aren’t any bugs. Enjoy :)” Google, Microsoft, Oracle, and every other IT company are no longer relevant.

    Do you disagree with the likelihood of this scenario? Perhaps the part about how the AI zooms up to that sufficient level of intelligence? If that is the case then it would make sense for you and Eliezer to try to come to agreement on this lesser claim rather than the “take over the world” claim.

    • http://timtyler.org/ Tim Tyler

      While the sufficiently intelligent AI is evolving the internet will also be evolving. It doesn’t make sense to imagine a superintelligence eating today’s internet. It will face its own internet – and that may be a good deal more indigestible.

  • Julian Morrison

    Robin, were you being flip when you said “Roko, many tribesmen still live”? Taking the Native Americans as an example, they were fighting the same species with a small difference in wealth and tools. Not even a medium-sized difference in wealth and tools. (Imagine the full might of the modern US military with conquistador attitudes. The natives would have been obliterated.) Not the kind of beyond-awe wealth and force a Kardashev 2 civilization could toss around on a whim. Not even human-smart ems with the comms capabilities of a computer, and certainly nothing any smarter than human.

    And even against that tiny, tiny technological gap, they got nearly wiped out, lost 100% of their property, got foisted onto land nobody wanted, and they are basically still alive because around the turn of the 20th century, white culture grew the rudiments of a conscience.

    Luck. That’s what saved them.

    Pardon me if I am not reassured.

  • http://shagbark.livejournal.com Phil Goetz

    Vladimir wrote:

    Phil, you are deeply confused, and you misrepresent many points on which Eliezer’s position was written up pretty much unambiguously.

    I am tired of hearing that. That’s not an answer. Point out in what way I am confused. Point out specific points on which I’ve misrepresented his position. Then point out the specific writings in which he has spoken unambiguously.

  • http://shagbark.livejournal.com Phil Goetz

    Eliezer wrote: “If there are any pieces of intelligence that are NP-complete to evolve or invent, they do not appear in the human brain. Evolution cannot solve NP-complete design problems; it can only reach solutions to which there exists an incremental pathway in the fitness metric, a degree of regularity that is almost the opposite of NP-completeness.”

    Huh? Can you name any NP-complete problem for which you can’t find a near-optimal solution by random starts plus hillclimbing?

    “Solving an NP-complete design problem” != “Finding the optimal solution to an NP-complete problem”.

    (Somebody is going to nitpick about satisfiability, aren’t they? I am too busy to respond beyond saying that a satisfiability problem in brain design would likely involve trying to find one of many possible solutions.)
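
    A minimal sketch of that “random starts plus hillclimbing” point, using a made-up MAX-3SAT instance (the sizes and parameters below are illustrative only, not anything from the thread):

    ```python
    # Random-restart hill climbing on a random MAX-3SAT instance.
    import random

    def num_satisfied(clauses, assignment):
        # A clause is a list of literals: literal i means "variable i is True",
        # literal -i means "variable i is False" (variables are 1-indexed).
        return sum(
            any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
            for clause in clauses
        )

    def hillclimb(clauses, n_vars, restarts=20, steps=2000):
        best_score = -1
        for _ in range(restarts):
            assign = [random.random() < 0.5 for _ in range(n_vars)]
            score = num_satisfied(clauses, assign)
            for _ in range(steps):
                i = random.randrange(n_vars)      # propose flipping one variable
                assign[i] = not assign[i]
                new_score = num_satisfied(clauses, assign)
                if new_score >= score:            # keep non-worsening moves
                    score = new_score
                else:
                    assign[i] = not assign[i]     # undo worsening moves
            best_score = max(best_score, score)
        return best_score

    if __name__ == "__main__":
        random.seed(0)
        n_vars, n_clauses = 50, 200
        clauses = [
            [random.choice([-1, 1]) * v for v in random.sample(range(1, n_vars + 1), 3)]
            for _ in range(n_clauses)
        ]
        print(hillclimb(clauses, n_vars), "of", n_clauses, "clauses satisfied")
    ```

    On a random instance like this, the climber typically satisfies the large majority of clauses, even though guaranteeing the exact optimum is NP-hard.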

  • http://www.weidai.com Wei Dai

    I think Robin and Phil have a point (although Robin’s religious language seems uncalled for). Trying to build a recursively-improving expected utility maximizing AI is the right thing to do if expected utility maximization is the right normative theory of choice. But people seemingly change their values all the time, and we don’t say to them “What are you doing? Don’t you realize you’re hurting your earlier self by changing your values?” The fact is that changing values (i.e. the process of moral philosophy) is a part of intelligence and we don’t understand it, certainly not well enough to program it into an AI.

    Eliezer has previously stated that intelligence is more than just optimization, but I don’t see how that is reflected in his plans.

    • http://timtyler.org/ Tim Tyler

      They seem to prefer the Lord of the Rings:

      “You’ve probably read “The Lord of the Rings”, right? Don’t think of this as a programming project. Think of it as being something like the Fellowship of the Ring – the Fellowship of the AI, as it were. We’re not searching for programmers for some random corporate inventory-tracking project; we’re searching for people to fill out the Fellowship of the AI.”

      – from “BECOMING A SEED AI PROGRAMMER”.

  • Tim Tyler

    Phil: At present, we don’t understand values any better than 18th-century scientists understood life. They’re a big mystery. But it’s reasonable to hope that more intelligent beings will figure it out better. Eliezer wants to stop them.

    IIRC, Eli stated in the document that one of the motivations behind CEV was to “Encapsulate moral growth”.

    Humanity’s moral memes have improved greatly over the last few thousand years. How much farther do we have left to go?

    What reference supports the idea that he proposes freezing modern values?

    Re: Evolution cannot solve NP-complete design problems

    At the risk of pointing out the obvious, instances of the class of NP-complete problems are not necessarily difficult problems. They are problems where the difficulty of finding a solution rises rapidly as the size of the problem grows. If the problem happens to be small, the solution may be easy.

  • http://profile.typekey.com/robinhanson/ Robin Hanson

    Wei, how is my language inappropriate?

    PK, God is not by definition an “alpha-male you have to worship.”

    Edwin, the issue has always been how could an AI so quickly become “sufficiently advanced.”

    Phil, 900+ words is too long for a blog comment.

    Billy, I have tried to engage the “substance” of the argument, but there’s just not much there. Will try again though.

  • http://profile.typekey.com/sentience/ Eliezer Yudkowsky

    A singleton is more diverse than the alternative since there is nothing preventing agents from marginalizing or killing each other in a non singleton.

    This is definitely Quote of the Week for me. Very nicely, very compactly put. (And indeed, Robin talks about his scenario forcing various evolutionary-type filters on agents.)

    To nitpick very slightly: It is not the case that all nonsingleton scenarios lack this property in all places.

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    Phil, it is an answer, even if it’s not as specific as it could be. It signals to you that there are more people who think that you misunderstand the point, which should weaken your confidence in what you think you know. Try to start over if you want to understand, maybe find someone to walk you through it. If you are willing to change your mind, maybe a thread on SL4 will do.

  • http://profile.typekey.com/rationaldisequilibrium/ Carl Shulman

    “PK, God is not by definition an ‘alpha-male you have to worship.’”

    It is a contested, emotionally loaded term with strong connotations of that (among various other things, like supernatural agents). For an audience of OB readers, it also has strong negative evaluative connotations of irrationality, falsity, and the like. It also implies that an AI-based singleton would be a single individual in the human sense, as opposed to many diverse systems with shared values and coordination mechanisms. Talking about a cohesive world government of AIs with shared values would get the same point across with much less confusion and irrelevant clutter.

  • Ian C.

    One way in which an all-powerful AI is not like God is that God does everything by magic, and there’s no defense against that.

    But an AI (or anything real) has to operate through cause and effect. To do something, it has to do it somehow. No matter how advanced it gets, it always has to use a method. And unlike with magic there’s no *inherent* reason that the method can’t be hacked or undermined.

    Likewise, it is not magically omniscient; it perceives through some form of sensors (maybe like nothing we’ve ever seen before) scattered through the universe. But they also have to work *somehow*, and if you find out how, you can most likely interfere.

    That is, I think the kind of cat and mouse battles we see between cryptographers/cryptanalysts and virus writers/AV software vendors should remind us that there is no magic technology. There’s no hopeless doom against any opponent.

  • http://shagbark.livejournal.com Phil Goetz

    Carl: “For an audience of OB readers, it also has a strong negative evaluative connotations of irrationality, falsity, and the like.”

    Did anyone else notice that Wei Dai’s post implied that it was more objectionable to say Eliezer had religious overtones than to compare him to Hitler? 🙂

    Spambot: “What are you, if not a 21st century human?”

    Darn. You’re onto me.

    I’m an AI.

    FOOM

  • http://www.weidai.com Wei Dai

    Robin, I guess many people in this community probably feel an emotional repulsion towards the fictional God, due to being forced to accept religion as a child, feeling oppressed by the religious majority, etc. Why invoke that repulsion now, when the similarities between Eliezer’s plan and certain religions are no more than accidental, and stem from completely different motivations and reasoning? Maybe it’s right to be repulsed by Eliezer’s plan as well, but the merits of Eliezer’s plan seem completely independent from the merits of those religions, and ought to be argued on their own.

  • Anonymous

    There’s no hopeless doom against any opponent.

    Tell that to chimps.

  • http://shagbark.livejournal.com Phil Goetz

    Vladimir, I suggest in response to your second non-answer that one reason disagreements continue, is that people have developed a tit-for-tat mechanism that works like this: If you indicate, by your answer, that you are not updating your beliefs in response to mine, then I will not update my beliefs in response to yours, EVEN IF I think you are a perfectly rational agent.

    Since you replied twice but didn’t bother either time to quote anything from my first comment that you objected to, you are not treating my views seriously enough for me to treat your views seriously.

  • Z. M. Davis

    I just want to say that Robin’s summary of Eliezer’s vision is a beautiful piece of writing. I agree with Carl that the religious imagery is uncalled for insofar as the goal is making progress towards resolving the Disagreement, but as satire, it’s wonderful.

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    “you are not treating my views seriously enough for me to treat your views seriously.”

    It’s a bad general strategy; it leads you to not listen in cases where you are actually wrong.

  • michael vassar

    Robin Hanson:

    “How does one begin to compare such starkly different visions?”

    You don’t compare “visions”. Instead you compare logical arguments and update in response to new points that you hadn’t noticed.
    I would be MUCH more confident in your propensity to do this if I had seen any instances of you saying “you know, you’re right” or the equivalent and changing a position over the 9 years that I have been reading your papers and occasionally communicating with you.

  • Cameron Taylor

    – “Tell that to chimps.”

    Well said Anonymous.

    Eliezer, I admire the patience you have demonstrated in the replies here. Your statements appear to have been misrepresented somewhat more actively than I usually expect in an OB post.

  • Matthew Hammer

    Phil, Eliezer’s CEV does explicitly address the concept that humanity’s values would evolve. In addition to mentioning it once or twice on this blog, his “poetic” formulation of the CEV has the clause “had grown up farther together” which is meant to cover extrapolating just that. So his Friendly AI would not result in 21st century value stasis so long as:

    1) A fixed goal structure of such recursive complexity can be engineered into an intellectually self-evolving AI.
    2) Said goal structure is in fact properly implemented
    3) It is possible for such an AI to grow sufficiently powerful to correctly extrapolate the value system
    4) It is possible for such an AI to grow sufficiently powerful to enforce whatever value system it then extrapolates.
    5) The value system humanity would evolve is not in fact a static 21st century value system.

    I think there is plenty to question without focusing on something Eliezer has in fact addressed.

  • http://profile.typepad.com/robinhanson Robin Hanson

    I changed the text from the AI has “no” to “little” need for previous heritage. “No” was too strong.

    Michael, I think Bryan Caplan and Tyler Cowen would affirm they’ve changed my mind on lots of things in the nine years I’ve known them.

    Z.M., thanks, I allowed myself a more poetic voice than usual.

  • PK

    Did anyone else notice the strong correlation between what people think will happen and what they want to happen? *sigh* silly humans…

  • http://profile.typepad.com/rationaldisequilibrium Carl Shulman

    PK,

    In different ways. Robin seems to like his predicted default scenario as an outcome, and I’d be interested to know how he thinks it could be improved on if he were running a singleton (with the option of dissolving itself, of course). Eliezer’s view on ‘Foom’ doesn’t guarantee a desired outcome, enabling both very bad and very good possibilities, but he could be accused of envisioning a world where he can be important (as Robin does above).

  • luzr

    Tim Tyler:

    “Spoken to any machines on the phone recently? Taken any cash from a bank-bot recently? Met any machines on the way out of a supermarket checkout? This is not some hypothetical future scenario, machines have been taking our jobs for over 200 years now.”

    Of course. I was only arguing that it is not always bottom-up. Solving sets of linear equations was once considered an “intelligence” (up) profession. Meanwhile, some manual workers might be among the last replaced (think garage workers), only after AGI emerges.

  • michael vassar

    Wei Dai: If an AGI is programmed to do what we would want to do and changing our values is part of what we want to do then the AGI should change our values or arrange for us to do so in the way that we would want to. Of course we have very little clue as to how to write an AGI that would do that.

    Phil: “Future utility functions might not have things like “puppies and beer” featuring predominantly in them; but at the next-higher meta-level, they will be more similar.”

    I’m pretty sure that Eliezer would agree with the sentiment behind this. Agreement would also be implied by the Hegelian historicist frame you invoked, a frame which has no difficulty asserting that the NAZIs were mistaken in many specifics and that they would upon reflection wish that they had behaved differently.

    Eliezer: I think that it would be very helpful for you to actually address this post directly if only to affirm that you would agree with those bits that you agree with.

    All: There seems to be some defect in Eliezer’s writing style such that when he says “X” many highly thoughtful people respond by saying “No you moron, stop claiming -X ! X !!! ” I’d love for someone who agrees with this claim of mine to offer suggestions for what he’s doing wrong that produces this effect.

    Phil: “And, anyway, if they don’t – so what? If my utility function doesn’t place a high utility on utilities for future generations billenia after I’m gone – and most of ours don’t – why go out of your way to try to impose that on future generations?”
    Presumably his does? Mine does too. I’m not sure what other people’s do under reflection, but it sure looks like people who reflect more put more utility on the more distant future.

    ” (Aren’t we forgetting to integrate Utility(t)*time_discount(t) instead of Utility(t)? Normal biological entities have time-discounted utilities; why don’t Eliezer’s values?)”
    Because we don’t expect that we would time discount utility (as opposed to interest bearing resources) upon reflection.
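
    Written out, the contrast in that parenthetical is between the two objectives below (my notation; the exponential discount factor is just an illustrative assumption, since hyperbolic and other forms are possible):

```latex
% Undiscounted total utility of a world history, versus the time-discounted
% alternative (exponential discounting with rate \rho > 0 assumed here purely
% for illustration; hyperbolic forms are also common):
\[
  V \;=\; \int_{0}^{\infty} U(t)\, dt
  \qquad\text{vs.}\qquad
  V_{\rho} \;=\; \int_{0}^{\infty} U(t)\, e^{-\rho t}\, dt .
\]
```

    The disagreement is over whether, on reflection, the terminal values we endorse take the first form or the second.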

    “I think Eliezer is trying to replace values with math, and then following his equations to greater lengths than anyone’s actual values would take them. “

    Only if people’s actual values didn’t, when they thought about them, inspire them to convert them into math in order to take them further.

    “Saying, “I am a Bayesian; I must implement my value system rationally; therefore I must maximize the integral of my utility function over world history.” “

    That does seem to be what he’s doing, to a fair first approximation, maybe even a second approximation if one assumes some future “reflective decision theory” (Eliezer’s “world’s most important math problem”) that can represent having a utility function that values the dynamic by which it changes.

    “The values that we have today in the West were not made by extrapolating from our past values as we gained knowledge. They were developed through a process that involved minorities pointing out problems with the dominant value system, and fighting against it.”

    I basically disagree with the claim above, to a first order approximation, but I think that the argument hasn’t been made here and should be. I’m glad that someone is bringing it up and I’m not confident enough of my model not to feel a need for more serious discussion of the topic. Also, I’m not convinced that Eliezer has considered both sides of this issue at all carefully, so while he appears to me at a best guess to be right on this point he should clarify his reasoning on the matter at risk of appearing careless.

    “At present, we don’t understand values any better than 18th-century scientists understood life. They’re a big mystery. But it’s reasonable to hope that more intelligent beings will figure it out better. Eliezer wants to stop them.”

    “Given that human cultures have small values differences between them, on a universal scale, you can’t say that CEV implemented by Eliezer would be noticably better than a kind of CEV implemented by Attila the Hun”

    This is hilarious in that CEV talks about the exact same thing, only with Al-Qaeda. More generally, the entire middle third of http://singinst.org/upload/CEV.html explicitly addresses EXACTLY the sorts of issues that you seem to be complaining about, in the clearest possible terms. I’m confused and am having difficulty believing your claims to have read it.

  • Douglas Knight

    Phil Goetz,
    Can you name any NP-complete problem for which you can’t find a near-optimal solution by random starts plus hillclimbing?

    wikipedia:
    some, such as the bin packing problem, can be approximated within any factor greater than 1…Others are impossible to approximate within any constant, or even polynomial factor unless P = NP, such as the maximum clique problem.

    (I don’t know if the theorems cover randomized algorithms, but I think people would be excited about the contrast if hill-climbing worked in practice for one of the problems that it is NP-complete to approximate.)
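
    For concreteness, a minimal sketch of the “random starts plus hill-climbing” heuristic in question, applied to maximum clique on a random graph (the graph model, parameters, and function names are illustrative assumptions, not anything established in this thread):

```python
import random

def random_graph(n, p, seed=0):
    """Erdos-Renyi random graph, stored as a dict of adjacency sets."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def grow_clique(adj, start):
    """Greedy hill-climb: grow a clique from one vertex until it is maximal."""
    clique = {start}
    candidates = set(adj[start])
    while candidates:
        # add the candidate adjacent to the most remaining candidates
        v = max(candidates, key=lambda x: len(adj[x] & candidates))
        clique.add(v)
        candidates &= adj[v]
    return clique

def random_restart_clique(adj, restarts=200, seed=1):
    """Best clique found over many random starting vertices."""
    rng = random.Random(seed)
    vertices = list(adj)
    best = set()
    for _ in range(restarts):
        clique = grow_clique(adj, rng.choice(vertices))
        if len(clique) > len(best):
            best = clique
    return best

if __name__ == "__main__":
    g = random_graph(200, 0.5)
    # On G(n, 1/2) this typically returns a clique of size around log2(n),
    # while the true maximum is near 2*log2(n): decent in practice, but not
    # within an arbitrarily small factor of optimal.
    print(len(random_restart_clique(g)))
```

    Nothing in this sketch contradicts the worst-case inapproximability results quoted above; it only illustrates what the heuristic does on a typical instance.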

  • http://www.weidai.com Wei Dai

    michael vassar wrote: If an AGI is programmed to do what we would want to do and changing our values is part of what we want to do then the AGI should change our values or arrange for us to do so in the way that we would want to. Of course we have very little clue as to how to write an AGI that would do that.

    Michael, if we want to change our values, then we are not really expected utility maximizers, because expected utility maximizers never want to change their values. (I assume we’re both talking about terminal values.) And if we’re not expected utility maximizers, why should we try to build really powerful expected utility maximizers that share our ill-defined utility functions, as Eliezer plans to do?

    Eliezer asked Robin in an earlier comment: Naively, one would expect that a future in which very few agents share your utility function, is a universe that will have very little utility from your perspective. Since you don’t seem to feel that this is the case, are there things you value that you expect to be realized by essentially arbitrary future agents? What are these things?

    Eliezer is implicitly assuming that Robin is an expected utility maximizer. But if Robin isn’t an expected utility maximizer, if he instead “wants” his values to evolve freely, subject to “a vast historical machine of impersonal competitive forces”, then there’s no reason for him to necessarily see a future that doesn’t share any of his current values as a bad thing.

  • michael vassar

    Wei Dai: Something very much like expected utility maximization of some very complicated utility function seems to emerge from the projected development of a wide space of minds. See Steve Omohundro’s papers to this effect. However, mathematical formalisms often need to be expanded. Turing machines may need some generalization to elegantly encompass quantum computing or to define Kolmogorov complexity, and natural numbers have been generalized repeatedly into the number theory we have today. Likewise, expected utility and decision theory may have to be generalized in numerous ways. To me, the project of doing this seems very difficult, especially given how little time we have and how many other difficult projects we have to do in that time, but the projects seem closely enough related that work on the more abstract ones, like generalizing decision theory, could constitute an effective angle of approach to the pre-paradigm science that is AGI in any event.

  • http://profile.typepad.com/robinhanson Robin Hanson

    PK, my vision of what is, not what should be. Yes, people do seem to assume otherwise, and yes the correlation between wishes and beliefs is problematic.

    Carl, I’d say more accepting than liking.

    Wei, most cases where people talk about changing their values can be thought of as constant but context-dependent values.

  • Will Pearson

    Likewise, expected utility and decision theory may have to be generalized in numerous ways

    Has anyone generalised expected utility and decision theory to a scenario where the agent gets destroyed if it makes the wrong choice? It would seem to me that such a scenario would make optimality proofs like AIXI’s a lot more difficult: you need prior knowledge not to explore certain areas (as proto-humans needed to know not to try to eat snakes). But if that prior knowledge is wrong and you don’t explore those areas, you may never find the optimum.

  • http://www.spaceandgames.com Peter de Blanc

    Wei Dai said: “expected utility maximizers never want to change their values. (I assume we’re both talking about terminal values.)”

    This isn’t true. You could, for instance, have a utility function U1 which returns the number of agents in the universe having utility function U2. A U1-agent would eventually choose to convert to a U2-agent once it had done everything else possible to make more U2-agents.
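
    A toy version of that construction (the universe representation, action names, and resource budget here are assumptions made purely for the sketch): U1 just counts agents holding U2, and a greedy U1-maximizer builds U2-agents while it has resources, then converts itself.

```python
def u1(universe):
    """U1: number of agents in the universe whose utility function is U2."""
    return universe["agents"].count("U2")

def available_actions(universe):
    """Map each available action to the universe it would produce."""
    options = {}
    if universe["resources"] > 0:  # spend a unit of resources on a new U2-agent
        options["build a U2-agent"] = {
            "agents": universe["agents"] + ["U2"],
            "resources": universe["resources"] - 1,
        }
    # the agent can always rewrite its own utility function to U2
    options["convert self to U2"] = {
        "agents": ["U2" if a == "U1" else a for a in universe["agents"]],
        "resources": universe["resources"],
    }
    return options

universe = {"agents": ["U1"], "resources": 2}
while True:
    options = available_actions(universe)
    # ties are broken in favour of building, matching "everything else possible" first
    action = max(options, key=lambda name: u1(options[name]))
    if u1(options[action]) <= u1(universe):
        break  # nothing left that increases U1
    print(action, "-> U1 =", u1(options[action]))
    universe = options[action]
```

    Running it prints two build steps and then a self-conversion, which is exactly the behaviour described above.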

  • steven

    Perhaps Eliezer should change the document title from “Coherent Extrapolated Volition” to “Coherent ——> *****!!! EXTRAPOLATED !!!***** <—— Volition”.

  • James Andrix

    Phil:
    I for one tried to give clear reasons why I thought you were wrong.

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    To expand on Peter’s point: an expected utility maximizer only exists for the duration of one decision. Whatever exists outside the decision is interpreted and optimized according to the utility function used in that decision. If the solution the utility maximizer finds from its moment of decision-making is to put in place a future agent with the same utility function, so be it. In the simplest case, the same agent/person remains in the future. In a slightly more difficult case, the future agent knows something the past agent didn’t. In a more tricky case, the future agent is a self-rewrite of the past agent that is more efficient and precise, but holds the same utility. Or the future agent can have a time-translated semantics of the same utility, optimizing itself to follow utility determined for it by the past incarnations of the agent (that is, cooperating with its past states in determining its changing utility). Or it may be some tricky dynamical system that isn’t quite a utility maximizer, but that is expected to serve the goals of our utility maximizer best (after all, when the unnatural category that is morality is pushed to the limits of physical possibility, who says that it cares about the presence of rational agents?). Or it could be something that we wouldn’t recognize as an optimizer at all, some fractal structure itself optimized for our utility function. The original utility function optimizes for a certain computation that, while unfolding, may or may not contain more intelligent agents determining the unfolding.

  • http://shagbark.livejournal.com Phil Goetz

    Michael Vassar wrote:


    This is hilarious in that CEV talks about the exact same thing, only with Al-Qaeda. More generally, the entire middle third of http://singinst.org/upload/CEV.html explicitly addresses EXACTLY the sorts of issues that you seem to be complaining about, in the clearest possible terms. I’m confused and am having difficulty believing your claims to have read it.

    This is one of the most frequent errors people make when defending CEV: They assume that, because Eliezer has said CEV will have some property, it will have that property. For instance, Matthew Hammer said that, because Eliezer says human values will continue to evolve under CEV, that means they will continue to evolve under CEV. For instance, James Andrix wrote that “Arguing that something is bad in some sense you expect others to find important is synonymous with arguing that it is something an FAI would not do,” which argues against my critique of CEV by assuming that CEV works perfectly.

    This is not how you evaluate a proposal. Evaluation consists of looking at the implementation details and figuring out whether it will do what the proposer claims it will do, /not/ of taking their word for it.

    When I read what little mechanism is specified for how CEV is supposed to work; and I imagine applying that mechanism with a human population consisting entirely of Attila’s horde, or of Al-Qaeda; I don’t see how it’s going to produce as good results as we got historically by being free to develop morality in an evolutionary system comprised of many different sub-populations trying out many different ideas.

  • James Andrix

    When I read what little mechanism is specified for how CEV is supposed to work; and I imagine applying that mechanism with a human population consisting entirely of Attila’s horde, or of Al-Qaeda; I don’t see how it’s going to produce as good results as we got historically by being free to develop morality in an evolutionary system comprised of many different sub-populations trying out many different ideas.

    Part of this depends on how much of the value system the AI settles on would be determined by our culture, and how much is basic human social hardware.

    Most of Attila’s horde had concepts of fairness. If the AI sees our moral conventions as faulty attempts to get at what humans want, then an al-Qaeda FAI and an American FAI would be very similar.

    If the AI settles on the kinds of values we tend to talk about and distinguish ourselves with (such as a desire for moral evolution, or global Islam), then yes, the al-Qaeda AI would be much worse.

    What do you mean by ‘as good results’? Of course you would prefer the system that led to people who had values like yours, as opposed to a system that did what its creators wanted. But do YOU want an AI to do what you want, or do you want it to do something you don’t want, and why do you want that?

    As I see it, for this to make sense, you must believe that you [might?] value something that is as evil as things the Nazis valued, and you don’t want to want that. I think that the AI could resolve that incoherence.

    I also think that Nazis, Huns, and al-Qaeda are the same way. If they wanted something terrible, they would want not to want it. These groups ALL had moral debate about what was right or wrong, based on the assumption that they could convince others, and be convinced themselves. They thought that if someone showed them why something they wanted was a bad thing, they would change their minds, because it is good to go from valuing evil to valuing good.

    So the questions are: If the FAI got to your value of cultural evolution, would it do a controlled shutdown because it shouldn’t exist? Would it determine that it could do more good existing without overly hurting cultural evolution? Would it determine that it could adapt as human values changed? or would it look for the deeper reasons why you value cultural evolution? If it finds that cultural evolution is a great evil, do you want it to tell you?

    If it _doesn’t_ take your word for it and shut down, then why do you think it would act on the stated values of the Huns?

  • http://shagbark.livejournal.com Phil Goetz

    Sorry to those I’m ignoring, esp. James – busy and can only respond to one thing at the moment. (Spambot: I haven’t read that dissertation but will put it on my to-do list.)

    “Because we don’t expect that we would time discount utility (as opposed to interest bearing resources) upon reflection.”

    I got an email from Richard Hollerith that said it more precisely:

    … if your system of _terminal_ or _intrinsic_ values refers to a time discount, then one point in time (namely, now) has _intrinsically_ greater value than other points in time, and I am aware of no valid reason to have an _intrinsic_ preference for one point in time over any other.

    This is one thing I was trying to get at when I said trying to follow values using math is like a zombie trying to follow an algorithm that would make him conscious.

    The notion of “values” that I think Eliezer subscribes to, says that values arise from biology. Organisms evolve drives that help them reproduce. Social organisms evolve social values.

    An evolved value discounts the future. Recently Robin, Carl, and Eliezer debated whether evolved time-discounting would be logarithmic or not, and how a singleton might change that. I don’t know if it’s logarithmic; but any value system that is actually evolved into your brain or mine at this moment is time-discounted.

    I think that, in this system, you have one level of “value” that is evolved wants and drives: sex, food, comfort, social esteem, empathy, etc. You have a second level, which includes reasoning about how to satisfy level-one values.

    Reasoning can lead you to change things on the second level, but not on the first. Reasoning and arriving at different level-one values would be like an AI deciding to change its primary goals.

    I think that time-discounting is on the first level. It’s built into us, as an emotional response. So somebody sitting down and reasoning about changing their time-discounting function is not operating within their own value system. When someone uses a bunch of math to try to rewrite their own value system, they aren’t engaging in what I call moral reasoning. They’ve invented some new replacement for values, that isn’t rooted in evolution. And, for someone who is operating within this value-relativism framework, that makes no sense. There can be no reason to do that.

  • http://www.weidai.com Wei Dai

    michael vassar wrote: Something very much like expected utility maximization of some very complicated utility function seems to emerge from the projected development of a wide space of minds.

    Can you send me a link to the relevant paper by Steve Omohundro? His CV lists 48 papers and nothing jumped out at me while scanning the list.

    Without having read Omohundro’s paper, I have to say that statement doesn’t look very impressive. If you allow arbitrarily complicated utility functions, everything is an expected utility maximizer with the utility function assigning 1 to “universe where I do X” where X is what it was going to do, and 0 to everything else.

    Peter de Blanc wrote: You could, for instance, have a utility function U1 which returns the number of agents in the universe having utility function U2. A U1-agent would eventually choose to convert to a U2-agent once it had done everything else possible to make more U2-agents.

    Ok, I should have said that an expected utility maximizer will only change its utility function if it determines that doing so is the best way to maximize expected utility under its original utility function. Still, this would only occur under very rare special circumstances, and looks nothing like how human beings change their values.

    Robin Hanson wrote: most cases where people talk about changing their values can be thought of as constant but context-dependent values

    Consider when a child becomes an adult, or someone reading a book that inspires him to change his values, or someone spontaneously deciding he no longer cares about something he used to care passionately about. Whether or not you think of these apparently changing values as constant but context-dependent values doesn’t really matter. We still need a normative theory of how values change, or equivalently how values depend on context, before we can try to build a Friendly intelligence.

    Eliezer wrote in an earlier post: But regardless of whether any given method would work in principle, the unfortunate habits of thought will already begin to arise, as soon as you start thinking of ways to create Artificial Intelligence without having to penetrate the mystery of intelligence.

    Given that how values change is a part of the mystery of intelligence, isn’t the CEV exactly a way to create AI without having to penetrate the mystery?

  • michael vassar
  • http://shagbark.livejournal.com Phil Goetz

    Douglas: The clique optimization problem is the problem of finding the largest clique in a graph. You can find a locally very /good/ approximate answer. That answer won’t be near the right answer for a very large graph if the distribution of clique sizes has a power law, which it does for some types of graphs. So saying “you can’t approximate the clique problem” is like saying “you can’t observe the largest earthquake”. You can’t; but you can do good enough. If we rephrased the problem to “Find a clique that is within the top 1% of clique sizes”, I will guess that it could be done.

    Vladimir: You made a prisoner’s dilemma reference yourself, so I shouldn’t have to connect the dots for you on why not updating in response to someone who won’t update in response to you is meta-rational. If A wants B to take A seriously, A needs to take B seriously, or B defects.

  • michael vassar

    Steven: SECONDED!

    Phil: CEV basically lacks implementation details. It’s not a proposal, more like a request for proposals, a description of what Eliezer wants. As such, criticizing its lack of details seems beside the point. I don’t think that discount rates are on your “first level”. Empirically, among humans, greater intelligence leads to much less time discounting. Still, it’s an interesting point, and with some effort it might be developed into a philosophically valid critique of the CEV proposal. I would prefer that you do some of that effort, but if you don’t, or even if you do, I probably will at some point once CEV is better defined. Currently I’m pretty sure that CEV is vague enough to encompass things that this class of criticism would not apply to, though it may be that some efforts to cash it out would be futile for this sort of reason.

    James: I hope you are right about e.g. Attila, but I’m actually fairly far from sure. I’m much more confident that you are right WRT the NAZIs, who are much less culturally divergent from us; in fact, they practically are us after a couple of terrible wrong turns projected out a short distance.

    Wei: I don’t think that intelligence necessarily has changing or changeable values. How values change is part of the mystery of Human Intelligence.

  • Nick Hay

    Phil: there’s no general procedure which will take a graph and find a clique with size within any fraction of the maximum. That is, MAXCLIQUE does not have a fully polynomial time approximation scheme (for proof see the first theorem and exercise in these lecture notes). However, there are probably algorithms that work well in practice on typical cases, just as for satisfiability problems.

  • Tim Tyler

    When I read what little mechanism is specified for how CEV is supposed to work; and I imagine applying that mechanism with a human population consisting entirely of Attila’s horde, or of Al-Qaeda; I don’t see how it’s going to produce as good results as we got historically by being free to develop morality in an evolutionary system comprised of many different sub-populations trying out many different ideas.

    That’s not terribly convincing as an argument making your original point (that CEV fixes values) – because it has the form of the argument from ignorance: I don’t see how X could happen – therefore X is not going to happen.

    IMO, it does seem likely that values would change under CEV. If the extrapolation is imperfect, and does not exactly match reality, the results of it are likely to change over time. A perfect extrapolation would be a minor miracle. Therefore: values are likely to continue to change, under CEV.

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    Phil:

    “You made a prisoner’s dilemma reference yourself, so I shouldn’t have to connect the dots for you on why not updating in response to someone who won’t update in response to you is meta-rational. If A wants B to take A seriously, A needs to take B seriously, or B defects.”

    That’s not prisoner’s dilemma. This commitment would work only if both sides primarily want to be “taken seriously” (which is false), and if I know in advance that you’d make this kind of commitment (which I don’t). Instead of worrying about professing your declarative beliefs, you should want to understand the actual truth, whatever that is, and however you can figure it out.

  • http://shagbark.livejournal.com Phil Goetz

    Phil: there’s no general procedure which will take a graph and find a clique with size within any fraction of the maximum. That is, MAXCLIQUE does not have a fully polynomial time approximation scheme (for proof see the first theorem and exercise in these lecture notes). However, there are probably algorithms that work well in practice on typical cases, just as for satisfiability problems.

    That’s why I said “Find a clique that is within the top 1% of clique sizes”. Not “within 1% of the top clique size.”

  • http://shagbark.livejournal.com Phil Goetz

    Vladimir wrote: “That’s not prisoner’s dilemma. This commitment would work only if both sides primarily want to be “taken seriously” (which is false), and if I know in advance that you’d make this kind of commitment (which I don’t). Instead of worrying about professing your declarative beliefs, you should want to understand the actual truth, whatever that is, and however you can figure it out.”

    I think it’s PD. I would benefit in the short-term from updating and improving my beliefs on this specific issue. But I’d lose in the long-term if we fall into a pattern where I listen to you and you don’t listen to me. (There is a payoff in having your beliefs listened to. I could put forward arguments as to why; but I think it’s clearly true that people act as if they preferred being listened to, over not being listened to.)

    It’s exactly tit-for-tat. I don’t see the difficulty in identifying it as such.

  • samantha

    I disagree with the premise that if we are wise in how we create and train a “god” or AGI then it will continue to act kindly toward us. I think that gives too much power to our limited abilities and far too little to an ever-increasing Intelligence. In my opinion our only hope is that kindly or ethical treatment of other intelligences in general is what growing intelligence arrives at or supports naturally. If our safety depends on constraining future development of Mind then we have no real safety. Either an ethics of maximizing the true well-being of other intelligences is part of a rational evolving ethical system or it is not. If it is not, then an attempt to impose it on an AGI is an attempt to impose contra-reality, that is to say irrational, restrictions on another, vastly more powerful mind. It would be immoral.

  • http://www.weidai.com Wei Dai

    Michael, thanks for the link. From that paper I found Steve’s more technical paper, “The Nature of Self-Improving Artificial Intelligence” and I’ve posted a comment on it.

    Besides my comments on the paper itself, I think your interpretation of his results, namely “Something very much like expected utility maximization of some very complicated utility function seems to emerge from the projected development of a wide space of minds,” doesn’t seem quite right. What Steve actually tried to show is that in a trading environment, an intelligence has to follow expected utility maximization if it wants to avoid pricing vulnerabilities that its trading partners can exploit to make profits at its expense. Even if he succeeded in doing that, commercial competition is surely just a small class of “projected development”.

  • michael vassar

    Wei: I think that the problem is that reality presents ‘trades’, e.g. options, continuously. Agents continuously need to expend resources to take action X instead of action Y, but if their decision system is such that they also spend resources to take action Y instead of X they will deplete all their resources.

  • Anonymous Coward

    Replies to some points:

    The contrast between these two views of our heritage seems hard to overstate. One is a dry account of small individuals whose abilities, beliefs, and values are set by a vast historical machine of impersonal competitive forces, while the other is a grand inspiring saga of absolute good or evil hanging on the wisdom of a few mythic heroes who use their raw genius and either love or indifference to make a God who makes a universe in the image of their feelings. How does one begin to compare such starkly different visions?

    History provides a useful guide. People like Hippocrates, Qin Shi Huang, Charles Darwin, or more recently, Linus Torvalds or Bram Cohen, made significant alterations to the course of human history because of their personal views and technical expertise – be it in developing methodologies, standardising armies and languages, or providing computational tools with the power to affect how societies work. (International communications and education now depend, to some extent, on Linux; BitTorrent, within a year or two, came to represent 1/3 of all digital international communications – not a FOOM, but not entirely dissimilar either.) These were not social changes that were ‘bound to happen’… look at Galileo. Rather, a single person imposed a new reality upon humanity, generally through their single-mindedness and skills.

    “Paperclipping the solar system is an evil beyond the understanding of most human minds. But for a paperclipper it’s completely natural – virtuous even.”

    Heaven forbid an AI notices that humans are busy turning planet earth into more humans at a terrifying rate – and the rest of the universe, if we get the chance.

    Also, a note for anyone interested in FOOM. There’s a sci-fi book series by Jack Chalker called Rings of the Master. It’s about an AI that FOOMs and takes over humanity, followed by the galaxy. It proceeds to take steps to ensure humanity cannot regain control again. However, the people who created the AI took a precaution against this event; its value system includes an overriding imperative that there must always be SOME way for humans to regain direct control over the AI’s actions – perhaps incredibly difficult, but within the realms of possibility.

    • Peter Jones

      Wallace would have been Darwin if Darwin hadn’t been Darwin. Someone would have been Linus too — the idea of adding an OS kernel to Gnu is too obvious.

  • Tim Tyler

    Re: Wei Dai’s “exploitable circularity can be avoided by changing preferences”

    What do you mean? If your preferences change over time, then that can be represented by a more complex preference system that explains such temporal variation in values – and then it is the utility associated with those preferences which is being maximised.

  • http://profile.typepad.com/sentience Eliezer Yudkowsky

    Wei, there’s no rule which says that the AI running CEV can’t already contain some aspects of human morality. In fact, you have to do this just to make sure that a “look but don’t touch” instruction understands what not to touch. A CEV-AI is best thought of as a partially Friendly and highly conservative AI that does know how to run CEV to generate a fully Friendly AI that doesn’t have to be so conservative.

    I suppose that if I could know that my reflective equilibrium left out all aspects of morality except X, Y, Z which were simple enough to program into an AI, that this would simplify the Friendly AI problem. I’m not assuming any such simplification exists; my current morality, extended outward, neither simplifies in this fashion nor looks like it might do so. The utility function is not up for grabs – you just have to come to terms with that fact; if you can’t do complicated things like utility function transfers, you probably shouldn’t be running AIs. Utility function transfers are simpler than all of human morality, though – or you can bootstrap them – that’s the key hope here.

  • http://www.weidai.com Wei Dai

    Tim, I gave one example in my blog post already. Here’s another one, stated in terms of time-dependent preferences instead of changing preferences. Suppose I have the following preferences: A1>B1, B1>C1, C1>A1, C2>A2, C2>B2, A2>B2. A1 means state A at time 1, I start at A0, and it takes 1 time unit to go from one state to another. There’s clearly a circularity in my preferences, but it’s not exploitable. I simply go to state C at time 1 and stay there at time 2.

    To sum up why Steve Omohundro’s derivation doesn’t work: being an expected utility maximizer is sufficient, but it’s not necessary, to avoid being exploited.
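
    To make the non-exploitability concrete, here is a small sketch (the “exploiter” framing, horizon, and function names are assumptions of the sketch, not part of the example itself): an agent with exactly these preferences, facing repeated offers, moves to C once and then never accepts another offer, so the time-1 cycle is never traversed.

```python
# States are (letter, time) pairs and every move (or stay) takes one time
# unit, as stated above.  At each tick an exploiter offers the agent a
# different letter to arrive at instead of the one it would reach by
# staying; the agent accepts only strictly preferred offers.

prefers = {  # strict pairwise preferences, exactly as listed above
    ("A", 1): {("B", 1)},            # A1 > B1
    ("B", 1): {("C", 1)},            # B1 > C1
    ("C", 1): {("A", 1)},            # C1 > A1
    ("C", 2): {("A", 2), ("B", 2)},  # C2 > A2, C2 > B2
    ("A", 2): {("B", 2)},            # A2 > B2
}

def accepts(current, offered):
    """Accept a move only if the offered state is strictly preferred."""
    return current in prefers.get(offered, set())

def run(start="A", horizon=4):
    state, accepted = start, []
    for t in range(1, horizon + 1):
        offers = [x for x in "ABC" if x != state and accepts((state, t), (x, t))]
        if offers:
            state = offers[0]
            accepted.append((state, t))
    return state, accepted

print(run())  # -> ('C', [('C', 1)]): one move to C at time 1, nothing after
```

    Because accepting a move always lands the agent at the next time step, the circularity among the time-1 states never turns into circular travel.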

  • Tim Tyler

    Firstly, there’s nothing intrinsically wrong with preferences that cause you to move in a circle. It’s quite possible to code (if A:B,if B:C,if C:A) into a utilitarian system.

    Secondly, the problem with circular preferences is not that they lead you in a circle. The problem arises when you move in a circle and you would have been better off if you had stayed still.

    Your example doesn’t result in circular travel. It doesn’t even lead to different behaviour to that which would be produced by a utility maximiser.

    In summary, I don’t see that your picking at the edges of this concept and hoping it will unravel is getting you very far.

  • http://www.mylarstoreonline.com Lisa Marie

    I don’t think ‘average’ is a fair approximation of reflective equilibria. If you think it shitty, and everyone else thinks it shitty, then the FAI (if it works as intended) will figure out that that is NOT the right answer, and do something as un-shitty as superhumanly possible.

  • Pingback: Overcoming Bias : Distrusting Drama

  • Brevon Davis

    Thank you. You have no idea how much this article impacted me.