Friendly Projects vs. Products

I’m a big board game fan, and my favorite these days is Imperial.   Imperial looks superficially like the classic strategy-intense war game Diplomacy, but with a crucial difference:  instead of playing a nation trying to win WWI, you play a banker trying to make money from that situation.  If a nation you control (by having loaned it the most) is threatened by another nation, you might indeed fight a war, but you might instead just buy control of that nation.  This is a great way to mute conflicts in a modern economy: have conflicting groups buy shares in each other.

For projects to create new creatures, such as ems or AIs, there are two distinct friendliness issues: 

Project Friendliness  Will the race make winners and losers, and how will winners treat losers? While any race might be treated as part of a total war on several sides, usually the inequality created by the race is moderate and tolerable.  For larger inequalities, projects can explicitly join together, agree to cooperate in weaker ways such as by sharing information, or they can buy shares in each other.  Naturally arising info leaks and shared standards may also reduce inequality even without intentional cooperation.  The main reason for failure here would seem to be the sorts of distrust that plague all human cooperation.

Product Friendliness  Will the creatures cooperate with or rebel against their creators?  Folks running a project have reasonably strong incentives to avoid this problem.  Of course for the case of extremely destructive creatures the project might internalize more of the gains from cooperative creatures than they do the losses from rebellious creatures.  So there might be some grounds for wider regulation.  But the main reason for failure here would seem to be poor judgment, thinking you had your creatures more surely under control than in fact you did. 

It hasn’t been that clear to me which of these is the main concern re "friendly AI." 

Added:  Since Eliezer says product friendliness is his main concern, let me note that the main problem there is the tails of the distribution of bias among project leaders.  If all projects agreed the problem was very serious they would take near appropriate caution to isolate their creatures, test creature values, and slow creature development enough to track progress sufficiently.  Designing and advertising a solution is one approach to reducing this bias, but it need not be the best approach; perhaps institutions like prediction markets that aggregate info and congeal a believable consensus would be more effective. 

  • http://profile.typekey.com/sentience/ Eliezer Yudkowsky

    The second one, he said without the tiniest trace of hesitation.

  • http://profile.typekey.com/aroneus/ Aron

    I wonder how much of the prior debate could be reused in a scenario where this was the first post.

  • http://causalityrelay.wordpress.com/ Vladimir Nesov

    Robin Hanson:

    “For projects to create new creatures, such as ems or AIs, there are two distinct friendliness issues”

    Friendliness isn’t particularly concerned with “creatures”, it’s a problem of correctly creating the singleton. Building ordinary “creatures” while planning for the possibility of (one of) them eventually growing into the singleton doesn’t seem like a good idea, since the sense in which specialized “creatures” are supposed to cooperate with their creators is much more limited than the level of meta, breadth of context, required for a good singleton; priorities for these applications are too different. If we require Roombas of our world to be Friendly, we’ll never get anywhere. And conversely, if a sufficiently strong “creature” is not Friendly, it can use its advantage, especially in better (not just faster) intelligence, to build a creature-Friendly singleton. (This allows the case of gradual growth, with new creatures slowly biasing the overall morality, eventually converging on an unFriendly dynamic.)

  • http://profile.typekey.com/SoullessAutomaton/ a soulless automaton

    I am somewhat baffled as to how one could interpret Eliezer’s writing, in whole or in part, as being about the first of those two issues. Further, the second seems to understate the severity of the issue (for reasons associated with the discussion elsewhere of recursive feedback).

    In fact, large amounts of what he’s written seem primarily aimed at persuading people that not only is the second one a significant issue at all, but that it is likely the single most important issue of the next century (at least, that’s roughly what it’s convinced me of), to the point of beating the point like a dead horse.

  • Z. M. Davis

    automaton: “I am somewhat baffled as to how one could interpret Eliezer’s writing, in whole or in part, as being about the first of those two issues.”

    Seconded, for my part.

  • GenericThinker

    “In fact, large amounts of what he’s written seem primarily aimed at persuading people that not only is the second one a significant issue at all, but that it is likely the single most important issue of the next century (at least, that’s roughly what it’s convinced me of), to the point of beating the point like a dead horse.”

    That being the case, where are the attempts at solving it rigorously and mathematically? There are none from Eliezer (the closest being Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures, but it really isn’t technical, and the lack of relevant technical sources and the abundance of fiction leads me to discount it essentially out of hand), so in other words he is one who is eager to point out problems real or imagined but when it comes to solving them he has nothing. I for one believe in spending more of my time solving the problem than wasting time beating the importance of the problem to death for the sake of ego.

  • http://hanson.gmu.edu Robin Hanson

    I just added to the post.

  • Z. M. Davis

    Generic: “[W]hen it comes to solving [the alleged Friendliness problem, Yudkowsky] has nothing.”

    For the record, this is a really difficult problem. No one would disagree that it’s much better to actually have a solution than it is to only point out the problem, but when you really don’t know how to solve the problem, what can you do but point at it while you continue to think? If you have any constructive suggestions, we’re all listening–or were you just pointing out the problem with pointing-out-problems-without-having-solved-them?

  • http://don.geddis.org/ Don Geddis

    @ Generic Thinker wrote: “he is one who is eager to point out problems real or imagined but when it comes to solving them he has nothing.”

    You may enjoy the following quote, from Daniel Gilbert’s excellent book Stumbling on Happiness:

    “My friends tell me that I have a tendency to point out problems without offering solutions, but they never tell me what I should do about it.”

  • michael vassar

    Robin: I agree with your comment up until the part about prediction markets. That part makes me say “What?!?”. I am interested in prediction markets in many situations, but you just said the distant future isn’t their strong suit. Also, to an even greater degree, uncertainty of future property rights and correlations between the honoring of property rights and the values of different asset classes are REALLY not their strong suit. It seems pretty clear to me that someone who seriously believes that property rights are non-negligibly likely to hold post singularity (whatever the version of singularity) should simply purchase large amounts of the least valuable matter available for which property rights exist with a possible preference for matter containing a wide mix of elements.

  • http://reflectivedisequilibria.blogspot.com/ Carl Shulman

    “perhaps institutions like prediction markets that aggregate info and congeal a believable consensus would be more effective.”

    Payments theoretically owing to me in the event of the destruction of both debtor and creditor (along with all heirs) don’t seem like a strong incentive for committing money to a bet on that outcome.

  • Matthew Hammer

    These discussions keep getting sidetracked by secondary issues: unfriendliness of singletons, amorality of competition regimes, specific methods of risk mitigation that might work in one or the other. But these topics are subsidiary to the question of super-exponential growth vs an exponential growth regime with a higher rate of return.

    Robin has already essentially accepted super-exponential growth with his “Economic Growth as a Series of Exponential Modes” paper. But his conception doesn’t necessarily lead to AI-go-FOOM-all-by-itself.

    Exponential growth means that your returns are linear in your investment. You can think of that as a large pool of investments that all give the same return, such that your total returns are limited by how much of the pool you can employ, i.e. your current resources.

    Super-exponential growth means that your returns are greater than linear in your investment.
    (dy/dt = y*y also reaches infinity in a finite time, just more slowly than dy/dt = e^y).
    You can think of that as a pool of investments where there are higher rates of return available when you can employ larger chunks of resources in a given investment.
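The distinction above can be made concrete with a tiny numerical sketch (mine, not from the comment): forward-Euler integration of dy/dt = y versus dy/dt = y². Returns linear in the resource pool give ordinary exponential growth, while super-linear returns blow past any fixed bound in finite time.

```python
# Illustrative sketch: exponential vs. super-exponential growth.
# dy/dt = y     -> y(t) = e^t          (finite for all finite t)
# dy/dt = y*y   -> y(t) = 1/(1 - t)    (blows up as t -> 1 for y0 = 1)
def integrate(rate, y0=1.0, dt=1e-4, t_max=2.0, cap=1e12):
    """Crude forward-Euler integration; stop at t_max or when y exceeds cap."""
    y, t = y0, 0.0
    while t < t_max and y < cap:
        y += rate(y) * dt
        t += dt
    return t, y

t_exp, y_exp = integrate(lambda y: y)      # linear returns: reaches t_max with y ~ e^2
t_sup, y_sup = integrate(lambda y: y * y)  # super-linear returns: hits the cap near t = 1

print(f"exponential:       stopped at t = {t_exp:.2f}, y = {y_exp:.1f}")
print(f"super-exponential: stopped at t = {t_sup:.2f}, y = {y_sup:.2e}")
```

With the same starting resources, the super-linear regime races through the cap before t = 1.2, while the linear regime sits at roughly e² ≈ 7.4 at t = 2.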

    Both AI-goes-FOOM and Series-of-Exponential-Modes presume multiple levels with increasingly higher rates of return. The only difference is that AI-goes-FOOM requires those modes to be close to continuous: every small increase in resources allows you to reach correspondingly greater rate of return investments. More specifically, the gap must be small compared to the time it takes to co-opt the rest of the world (and their resources) into the new growth mode.

    Robin appears to favor fairly large gaps between functionally discrete levels. Agriculture reached geographic boundaries long before making the next transition, as has industry. He calculates that all historical regimes experienced a number of doublings in the large single digits and I expect would argue that the relative efficiency of the modern financial system gives cause to expect that the rest of the world could be integrated into the next mode within the expected number of doublings given the expected new rate of return.

    Eliezer sees those new levels as being much, much closer together, so that pausing to voluntarily co-opt the rest of the world yields lower returns than just jumping up more modes locally. I’m hoping he has some communicable insight into why he expects such a structure in the space of possible investments. However I can’t attempt a paraphrase as I haven’t managed to receive that insight yet.

    Once we settle on an expectation over the structure of that investment space, we will have some indication of how much effort to direct towards worrying about unfriendly singletons verses worrying about unfriendly competition regimes. Until then, I think we should focus on that primary disagreement.

  • michael vassar

    Within Matthew’s analysis it would seem to me that Robin’s growth modes frame would strongly suggest a singleton. A two week doubling time for the new growth mode would suggest the option of growing to constitute almost all of the world economy within a year without needing to trade and share the long-term fruit of growth. Even more strikingly, there could easily be further modes which would imply still shorter doubling times. Two more transitions would suggest sub-minute economy doubling times, which would flat-out rule out trade beyond the earth-moon system, since light-speed lags would mean that insights from Mars would be trivial by the time they arrived. Even on Earth waste heat from generalized computation at this rate of reorganization would seem to make life zero-sum among entities which were not already so organized as to collectively constitute an adiabatic system.
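The doubling-time arithmetic above is easy to check with a back-of-envelope sketch (mine; the per-mode speedup factor of ~250 is an assumption, loosely in the range suggested by Hanson’s growth-modes paper):

```python
# A two-week doubling time gives 26 doublings per year.
weeks_per_year = 52
doublings = weeks_per_year / 2
growth = 2 ** doublings  # 2^26 ~ 6.7e7-fold growth in one year
print(f"{doublings:.0f} doublings -> {growth:.1e}x growth in a year")

# If each further mode transition shortens the doubling time by an
# assumed factor of ~250, two more transitions give sub-minute doublings.
doubling_seconds = 14 * 24 * 3600  # two weeks, in seconds
for transition in (1, 2):
    doubling_seconds /= 250
    print(f"after {transition} more transition(s): {doubling_seconds:.0f} s per doubling")
```

Two weeks is about 1.2 million seconds; two factor-of-250 speedups bring that to roughly 19 seconds per doubling, matching the “sub-minute” claim.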

  • Tim Tyler

    On the issue of whether there will be humans on the side of any ascendant machines – I think probably yes – provided we are talking about the fairly short term. Humans being stupid enough to let a computer virus, grey goo, a gold-atom collector – or something like that – take over the world is not a very realistic scenario. Obviously humans are not going to persist en masse in the long term. The survivors might go willingly – or they might get the push – but that’s further into the future.

    “These discussions keep getting sidetracked by secondary issues: unfriendliness of singletons, amorality of competition regimes, specific methods of risk mitigation that might work in one or the other. But these topics are subsidiary to the question of super-exponential growth vs an exponential growth regime with a higher rate of return.”

    Not to say that discussion of the shape of the curve is not important – but I think most parties here agree we have progress on a pretty impressive scale coming up the pike for the next 20 years. The naysayers seem to point to a slow-down of scientific discovery – which isn’t exactly the sole driver of technological development. So, I think it’s fair to consider the consequences of such progress – and to ask what we should do about it.

  • Matthew Hammer

    Michael: Nothing rules out the possibility of a singleton. It’s conceptually possible to have a singleton without any growth transitions if we merely manage to create a stable, dominant government/culture. The point is that if Eliezer is right, a singleton is the only possibility and we can safely ignore any other options. Otherwise there are a range of possibilities that need to be considered.

    Also, I want to add an abstract extension to Robin’s largely empirical analysis.

    If we recast the exponential modes with a focus on intelligence:
    The agricultural transition involved the invention of writing, which further allowed the creation of the field of mathematics. Faithful records and mathematical thinking together raised effective intelligence sufficiently to access a higher level of investment opportunities.
    Later, we have a cluster of innovations including the printing press and the scientific method. Together they allow systematic exploration of theory space and rapid transportation of those theories for independent testing. This again increases effective intelligence enough to access the next higher level.

    It seems reasonable that AGI would further avoid flaws inherent in human thinking, possess an expanded working memory, and allow more parsimonious usage of experience. Together this would allow access to yet another level. Not everyone in the world would buy that, but I expect most readers of overcomingbias would.

    Note that within each level, the effective level of intelligence is sufficient to locate additional avenues of investment of a similar nature to replace those that have been exhausted, but is not able to access higher levels until a number of improvements have been assembled.

    If we assume the amount of intellectual work involved in assembling those packages of improvements is proportional to the total production accomplished, then using the results of Robin’s paper we get:

    Writing/Math transition required 1 unit of work.
    Printing/Science transition required 194 units of work.
    AGI requires at least 114105 units of work.

    This set of values is a good candidate for being distributed logarithmically. This shouldn’t come as too much of a surprise since physical and mathematical constants are so distributed (see Benford’s law) and conceptually these transition points seem like they reside on an odd interface between mathematical properties of computation, and the physical properties of the universe.

    So given that particular abstraction, that these regimes share characteristics of physical and mathematical constants and so we would expect them to be logarithmically distributed, we should expect future mode jumps to continue to be discrete on the timescale of the doubling time of the prior mode. While it’s possible that there might be a random cluster right above our current position, we presumably can’t even understand the next level at our current level of intelligence so it seems unlikely that we could know where the next several reside. At the very least we would need some hefty insights to convince us.
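As a minimal illustration of the Benford claim above (the three work estimates are Hammer’s numbers; the check itself is mine): Benford’s law gives P(leading digit = d) = log10(1 + 1/d), so about 30% of logarithmically distributed values lead with a 1, and all three estimates here do.

```python
import math

# Benford's law: probability that a logarithmically distributed value
# has leading digit d is log10(1 + 1/d).
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
print(f"P(lead=1) = {benford[1]:.3f}")  # ~0.301

# Hammer's three work estimates, in units from Robin's growth-modes paper.
work = [1, 194, 114105]
leads = [int(str(w)[0]) for w in work]
print("leading digits:", leads)  # all three lead with 1
```

Three data points are of course far too few to test the distributional claim; this only shows the observation is consistent with it.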

  • http://profile.typekey.com/sentience/ Eliezer Yudkowsky

    Robin: If all projects agreed the problem was very serious they would take near appropriate caution to isolate their creatures, test creature values, and slow creature development enough to track progress sufficiently.

    Robin, I agree this is a left-tail problem, or to be more accurate, the right tail of the left hump of a two-hump camel.

    But your suggested description of a solution is not going to work. You need something that can carry out a billion sequential self-modifications on itself without altering its terminal values, and you need exactly the right terminal values because missing or distorting a single one can spell the difference between utopia and dystopia. The former requires new math, the latter requires extremely meta thinking plus additional new math. If no one has this math, all good guys are helpless and the game is lost automatically.

    That’s why I see this as currently having the status of a math problem even more than a PR problem.

    For all the good intentions that ooze from my every pore, right now I do not, technically speaking, know how to build a Friendly AI – though thankfully, I know enough to know why “testing” isn’t a solution (context not i.i.d.) which removes me from the right tail of the left hump.

    Now, some aspects of this can be viewed as a PR problem – you want to remove researchers from the right tail of the left hump, which you can do up to a point through publicizing dangers. And you want to add researchers to the right tail of the right hump, which you can do by, among other strategies, having math geniuses read Overcoming Bias at age 15 and then waiting a bit. (Some preliminary evidence indicates that this strategy may already be working.)

    But above all, humanity is faced with a win-or-fail math problem, a challenge of pure technical knowledge stripped of all social aspects. It’s not that this is the only part of the problem. It’s just the only impossible part of the problem.

  • steven

    Is there a knockdown argument why risking dystopia is preferable to losing the game automatically?

  • steven

    On the recent hard takeoff series: I haven’t read all the posts in detail yet but the discussion leaves me unsatisfied in that it seems there should be some much simpler argument showing that hard takeoff is plausible. I’m not sure what it is that people who don’t believe in hard takeoff think will take more than, say, days; if you look at past advances it seems like the time spent on them was mostly spent on working around specifically human failings rather than on the intrinsic difficulty of the problem.

  • http://supermodelling.net derekz

    steven, there are a couple of things that give me pause when thinking about a hard takeoff scenario. Probably the most important one is exactly the time issue you think is unimportant. One order of magnitude of computing “speed” turns your days into a month. So saying days for some step (like the invention of molecular nanotechnology or ab initio reconstruction of materials science or obtaining super-mastery of psychology) is really making a very specific projection about the computational requirements of exploring the search spaces involved. I don’t see a principled reason for a particular probability distribution of FLOPS vs Time for these, so the very strong claim of “days” (strong because it is on the surface so bizarre sounding) seems rather arbitrary and made up.

    Eliezer has done a pretty good job in this post sequence of communicating the basics of his ideas about how intelligence amounts (in part) to increased ability for making good decisions in such a search. If we think of “the design of a generic searcher (optimizer)” as another thing to search for, then becoming more intelligent about searching that space has recursive effects that could greatly increase intelligence even without changing the hardware.

    But things cannot be optimized infinitely. Those search spaces are what they are. We don’t seem to know much about the structure of the search spaces for the truly important things, nor the values of the maxima hidden therein. For me, the search spaces (especially the space of searcher-designs) intuitively “feel” hard to crack — that is, I cannot see how the redesign sequence can continually rapidly spit out vast improvements, so I am skeptical of its possibility. That is of course a limit in me: the universe may or may not really be limited in line with my intuitions rather than yours!

    Although Eliezer must feel frustrated that a half million words over the last year have not appeared to have a larger effect, I know the journey has been rewarding to me. Interestingly, though, when Eliezer’s post sequence started I was “skeptical of AI danger and Friendly AI as a solution” but thought it just possible enough to be a SIAI contributor. Now, even though I feel I have learned a lot, that still describes my feelings on the subject. Maybe I am not as open to change as I would like to think. For me, more progress on Friendly AI itself would be more likely to increase my interest and support at this point than arguments about its possibility or importance, but I don’t know if that is true for many others.

    Sorry to go on for so long! Thanks for your writings Eliezer (and Robin for offering a counterpoint), I look forward these days to new OB posts quite eagerly!

  • http://hanson.gmu.edu Robin Hanson

    Michael, I didn’t say the distant future isn’t prediction markets’ strong suit. Carl, it might take more than two minutes to figure out how to best apply them to this problem, but if this problem really is important it might be worth some extra thought.

    Matthew, super-exponential growth need not imply non-linear investment returns. The gap between levels may be less important than the locality of a level’s development. Also, it seems pretty clear that writing was not the driver of the farming transition – it was more a nice side effect whose potential was realized more fully later.

    Eliezer, I’d like to hear more about why testing and monitoring creatures as they develop through near human levels, slowing development as needed, says nothing useful about their values as transhuman creatures. And about why it isn’t enough to convince most others that the problem is as hard as you say; in that case many others would also work to solve the problem, and would avoid inducing it until they had a solution. And hey, if you engage them there’s always a chance they’ll convince you they are right and you are wrong. Note that your social strategy, of avoiding standard credentials, is about the worst case for convincing a wide audience.

  • luzr

    “You need something that can carry out a billion sequential self-modifications on”

    Billion is likely not enough, at least if the measured complexity of the human brain has anything to say about it.

    “itself without altering its terminal values, and you need exactly the right terminal values because missing or distorting a single one can spell the difference between utopia or dystopia.”

    Does it mean that you expect no human-involved feedback between modifications?

    In that case, you have just found the explanation of the Fermi paradox. The math you propose is unlikely to exist. It is a billion times more difficult than deciding what a given human is going to do next based on a nanoscan of his brain.

  • James Andrix

    luzr:
    That only solves the Fermi paradox if you suppose that none of the alien world-destroying AIs started a colonization wave.

  • luzr

    James Andrix:

    Definitely, it was meant as sort of joke.

    OTOH, maybe a complementary interpretation of the Fermi paradox indicates that the path leading to benevolent AI is not as narrow as Eliezer thinks (otherwise, it would have destroyed our world already).

  • Ben Jones

    Robin, have you tried http://www.waronterrortheboardgame.com/ ? Very complex but great fun. Comes with a balaclava of evil.
