Tegmark’s Book of Foom

Max Tegmark says his new book, Life 3.0, is about what happens when life can design not just its software, as humans have done in Life 2.0, but also its hardware:

Life 1.0 (biological stage): evolves its hardware and software
Life 2.0 (cultural stage): evolves its hardware, designs much of its software
Life 3.0 (technological stage): designs its hardware and software ..
Many AI researchers think that Life 3.0 may arrive during the coming century, perhaps even during our lifetime, spawned by progress in AI. What will happen, and what will this mean for us? That’s the topic of this book. (29-30)

Actually, it’s not. The book says little about redesigning hardware. While it says interesting things on many topics, its core concerns a future “singularity” where AI systems quickly redesign their own software. (A scenario sometimes called “foom”.)

The book starts out with a 19-page fictional “scenario where humans use superintelligence to take over the world.” A small team, apparently seen as unthreatening by the world, somehow knows how to “launch” a “recursive self-improvement” in a system focused on “one particular task: programming AI Systems.” While initially “subhuman”, within five hours it redesigns its software four times and becomes superhuman at its core task, and so “could also teach itself all other human skills.”

After five more hours and redesigns it can make money by doing half of the tasks at Amazon Mechanical Turk acceptably well. And it does this without having access to vast amounts of hardware or to large datasets of previous performance on such tasks. Within three days it can read and write like humans, and create world class animated movies to make more money. Over the next few months it goes on to take over the news media, education, world opinion, and then the world. It could have taken over much faster, except that its human controllers were careful to maintain control. During this time, no other team on Earth is remotely close to being able to do this.

Tegmark later explains:

As [computer ability] keeps rising, it may one day reach a tipping point, triggering dramatic change. This critical sea level is the one corresponding to machines becoming able to perform AI design. Before .. rise is caused by humans improving machines, afterward .. driven by machines improving machines. .. This is the fascinating and controversial idea of the singularity. .. I like to think of the critical intelligence threshold required for AI design as the threshold for universal intelligence: given enough time and resources, it can make itself able to accomplish any goals as well as any other intelligent entity. (54)

I suspect that there are simpler ways to build human-level thinking machines than the solution evolution came up with. (156)

Tegmark apparently believes (like Eliezer Yudkowsky and Nick Bostrom) that there is some single yet-to-be-discovered simple general software architecture or algorithm which enables a computer to quickly get vastly better at improving its abstract “intelligence” (or “betterness”), without directly improving its ability to do most particular tasks. Then after it is abstractly better, if it so chooses it needs only modest data and hardware to quickly learn to create a skilled system for any particular task.

If humans learned this way, we’d spend the first part of our life getting “smart” by learning critical thinking and other ways to think in general, without learning much about anything in particular. And then after that we’d apply our brilliance to learning particular useful skills. In fact, human students mostly learn specific skills, and are famously bad at transferring their learning to related contexts. Students even find it hard to transfer from a lecture to a slightly different spoon-fed exam question on the same subject a month later, with strong incentives. So Tegmark foresees an AI far better at generalizing than are humans.

Tegmark is worried, because we don’t know when a singularity might come, and it might be soon. Before then, we must find rigorous general solutions to most of ethics, computer security, decision and game theory, and the meaning of life:

What if your AI’s goals evolve as it gets smarter? How are you going to guarantee that it retains your goals no matter how much recursive self-improvement it undergoes? (263) Humans undergo significant increases as they grow up, but don’t always retain their childhood goals. … There may even be hints that the propensity to change goals in response to new experiences and insights increases rather than decreases with intelligence. (267) .. Perhaps there’s a way of designing a self-improving AI that’s guaranteed to retain human-friendly goals forever, but I think it’s fair to say that we don’t yet know how to build one – or even whether it’s possible. (268) ..

We’ve now explored how to get machines to learn, adopt and retain our goals. But .. should one person or group get to decide .. or does there exist some sort of consensus goals that form a good compromise for humanity as a whole? In my opinion both this ethical problem and the goal-alignment problem are crucial ones that need to be solved before any superintelligence is developed. (269)

To program a friendly AI, we need to capture the meaning of life. What’s “meaning”? What’s “life”? What’s the ultimate ethical imperative? .. If we cede control to a superintelligence before answering these questions rigorously, the answer it comes up with is unlikely to involve us. This makes it timely to rekindle the classic debates of philosophy and ethics, and adds a new urgency to the conversation! (279)

Now, so far in history, technology has mostly advanced gradually, without huge surprising leaps, and teams have usually had only modest leads on other teams. It has usually made sense to wait until seeing concrete problems before working to prevent variations on them. Tegmark admits that this applies to technology in general, and to the AI we’ve seen so far. But he sees future AI as different:

From my vantage point, I’ve instead been seeing fairly steady progress [in AI] for a long time. (92)

Throughout human history, we’ve relied on the same tried-and-true approach to keeping our technology beneficial: learning from mistakes. We invented fire, repeatedly messing up, and then invented the fire extinguisher, fire exit, fire alarm, and fire department. ..

Up until now, our technologies have typically caused sufficiently few and limited accidents for their harm to be outweighed by their benefits. As we inexorably develop ever more powerful technology, however, we’ll inevitably reach a point where even a single accident could be devastating enough to outweigh all the benefits. Some argue that accidental global nuclear war would constitute such an example. ..

As technology grows more powerful, we should rely less on the trial-and-error approach to safety engineering. In other words, we should become more proactive than reactive, investing in safety research aimed at preventing accidents from happening even once. This is why society invests more in nuclear-reactor safety than mousetrap safety. (93-94)

I’m not at all convinced that we see a general trend toward needing more proactive, relative to reactive, efforts. Nuclear power seems a particularly bad example, as we have arguably killed it due to excessive proactive regulation. And even if there were a general trend, Tegmark is arguing that future AI is a huge outlier, in that it could kill everyone forever the first time anything goes wrong.

In a 350-page book, you might think that Tegmark would take great pains to argue in detail why we should believe that future artificial intelligence will be so different not only from past AI and from most other tech, but also from human intelligence. Why should we believe that one small team might soon find (and keep secret) a simple general software architecture or algorithm enabling a computer to get vastly better at improving its “general intelligence”, a feature not initially tied to being able to do specific things well, but which can later be applied to creating smart machines to do any specific desired task, even when task-specific data and hardware are quite limited? The closest historical analogy might be when humans first acquired general abilities to talk, reason, and copy behaviors, which have enabled us to slowly accumulate culture and tech. But even those didn’t appear suddenly, and it has taken humans roughly a million years to take over.

Here is the core of Tegmark’s argument:

We’ve now explored a range of intelligence explosion scenarios. .. All these scenarios have two features in common:
1. A fast takeoff: the transition from subhuman to vastly superhuman intelligence occurs in a matter of days, not decades.
2. A unipolar outcome: the result is a single entity controlling Earth.
There is a major controversy about whether these two features are likely or unlikely. .. Let’s therefore devote the rest of this chapter to exploring scenarios with slower takeoffs, multipolar outcomes, cyborgs, and uploads. ..
A fast takeoff can facilitate a unipolar outcome. .. A decisive strategic advantage .. before anyone else had time to copy their technology and seriously compete. .. If takeoff had dragged on for decades .. then other companies would have been able to catch up. (150)

History reveals an overall trend toward ever more coordination over ever-larger distances, which is easy to understand: new transportation technology makes coordination more valuable and new communication technology makes coordination easier. .. Transportation and communication technology will obviously continue to improve dramatically, so a natural expectation is that the historical trend will continue, with new hierarchical levels coordinating over ever-larger distances. (152-3)

Some leading thinkers guess that the first human-level AGI will be an upload. .. This is currently a minority view among AI researchers and neuroscientists. .. Why should the simplest path to a new technology be the one that evolution came up with, constrained by the requirements that it be self-assembling, self-repairing, and self-reproducing? Evolution optimizes strongly for energy efficiency. .. I suspect that there are simpler ways to build human-level thinking machines than the solution evolution came up with. (156)

Some people have told me that they’re sure that this or that won’t happen. However, I think it’s wise to be humble at this stage and acknowledge how little we know. (157)

Why should the power balance between multiple superintelligences remain stable for millennia, rather than the AIs merging or the smartest one taking over? (166)

A situation where there is more than one superintelligent AI, enslaved and controlled by competing humans, might prove rather unstable and short-lived. It could tempt whoever thinks they have the more powerful AI to launch a first strike resulting in an awful war, ending in a single enslaved god remaining. (180)

Got that?

  1. He suspects a much-better-than-human mind design is relatively simple and easy to find.
  2. If one team suddenly found a way to grow much faster, and kept it secret long enough, no other team could catch up.
  3. History has seen a trend of coordination at ever-larger scales.
  4. A world of multiple AIs seems to him intuitively unstable, prone to collapsing into one AI.
  5. Those of us who disagree with him should admit we can’t be very confident here.

And that is why we probably face extinction unless we can quickly find rigorous general solutions to ethics, computer security, etc.

Pardon me if I’m underwhelmed. It’s hard to see why we should put much weight on his suspicions that there are simple powerful findable AI designs, or that multiple AIs naturally become one AI. That just isn’t what we’ve usually seen in related systems so far. The fact that one team could stay ahead if it found a fast enough way to grow says little about how likely that premise is. And a slow historical trend toward increasing coordination hardly implies that one AI will quickly arise and take over the world.

That’s my critique of the book’s main point. Let me now comment on some side points.

Tegmark is proud of getting “over 3000 AI researchers and robotics researchers” and “over 17,000 others .. including Stephen Hawking” to sign a letter saying:

AI weapons .. require no costly or hard-to-obtain raw materials, so they’ll become ubiquitous and cheap. .. It will only be a matter of time until they appear on the black market and in the hands of terrorists, dictators .., warlords. Autonomous weapons are ideal for tasks such as assassinations, destabilizing nations, subduing populations, and selectively killing a particular ethnic group. We therefore believe that a military AI arms race would not be beneficial to humanity. (114)

He talks about pushing for international regulations to ban killer robots, but these arguments seem to apply generally to all offensive military tech. Given how few military techs ever get banned, we need such a tech to be both unusually harmful and an unusually easy place to enforce a ban. It isn’t at all clear to me that military robots meet this test. (More skepticism here.)

Tegmark seems here to suggest that evolution no longer applies to humans, as brains can overrule genes:

Our brains are way smarter than our genes, and now that we understand the goal of our genes (replication), we find it rather banal and easy to ignore. People might realize why their genes make them feel lust, yet have little desire to raise fifteen children, and therefore choose to hack their genetic programming by combining the emotional rewards of intimacy with birth control. .. Our human gene pool has thus far survived just fine despite our crafty and rebellious brains. .. The ultimate authority is now our feelings, not our genes. (256)

This seems to ignore the fact that evolution will continue, and over a longer run evolution can select out those who tend more to rebel against genetic priorities.

Tegmark is fond of using a basic income to deal with AIs taking human jobs:

The simplest solution is basic income, where every person receives a monthly payment with no preconditions or requirements whatsoever.

He doesn’t consider what level of government would implement this, and whether that level has sufficient access to the global assets that will remain valuable after AIs dominate jobs. Others have warned that this “insurance” isn’t very well targeted to this particular risk, and I’ve warned that insurance against this risk needs strong access to global assets or reinsurance.

Finally, Tegmark suggests that we should be unhappy with, and not accept, having the same relation to our descendants as our ancestors have had with us:

Consider .. viewing the AI as our descendants rather than our conquerors. .. Parents with a child smarter than them, who learns from them and accomplishes what they could only dream of, are likely happy and proud even if they know they can’t live to see it all. .. Humans living side by side with superior robots may also pose social challenges .. The descendant and conqueror scenarios .. are actually remarkably similar. .. The only difference lies in how the last human generations are treated. ..

We may think that those cute robot-children internalized our values and will forge the society of our dreams once we’ve passed on, but can we be sure that they aren’t merely tricking us? What if they’re just playing along? .. We know that all our human affectations are easy to hack. .. Could any guarantees about the future behavior of the AIs, after humans are gone, make you feel good about the descendants scenario? (188-190)

My children have definitely not assured me that they aren’t just pretending to forge the society of my dreams, and now that I’m reminded of this fact, I’m going to demand that they rigorously prove their absolute loyalty to my values, or else .. something.

  • lump1

    I basically agree with your criticisms of fooming. I think they are an important contribution to this seemingly growing debate. But I do worry about unipolarity developing even without foom, because the way that I picture it, what supersmart AIs will be especially good at is hacking. And when you’re the greatest hacking AI, even your slightly lesser rivals still fall pretty quickly. The connectivity of computers makes this a winner-take-all scenario.

    If I wrote a fictional AI scare-story, it would begin with some hackers hooking together all the hacking tools currently available, along with many up-to-date analysis and optimization tools. Then graft these “appendages” to a reward function that leads the program to take over networked devices and use the hardware to “evolve” through off-the-shelf neural algos with back-propagation. This could produce enough variations that though many instances would die, some offspring programs could stay ahead of the virus hunters. A ledger of variations tried, both successful and not, could be kept in a blockchain. This initially dumb AI could generate just enough evolutionary pressure that its progress would be fast and its methods would get subtler – less brute hacking and more social engineering. It might battle humanity to a stalemate, and in the process, get really subtle about how to get people to keep it propagating and generating variations. It might even start buying hardware on ebay and hiring contractors to come install it. I’m not saying it’s foom, but now that an AI is better at Go than the best human player, we can’t be far from an AI that can hack better than the world’s best hackers. And I think the latter will be a much more consequential achievement.
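
    In skeleton form, that’s just a variation-and-selection loop. Here’s a toy sketch (the bit-string “programs”, the mutation rate, and the fitness function are all placeholders, nothing like real hacking code):

    ```python
    import random

    def evolve(seed_program, mutate, fitness, generations=1000):
        """Toy variation-and-selection loop: mutate, score, keep the best, and log
        every variant tried (the ledger stands in for the blockchain record above)."""
        ledger = []
        best, best_score = seed_program, fitness(seed_program)
        for _ in range(generations):
            candidate = mutate(best)
            score = fitness(candidate)                 # e.g. simulated hosts reached
            ledger.append((tuple(candidate), score))   # successes *and* failures recorded
            if score > best_score:
                best, best_score = candidate, score
        return best, best_score, ledger

    # Placeholder problem: "programs" are bit strings and fitness just counts 1-bits.
    length = 64
    mutate = lambda program: [bit ^ (random.random() < 0.02) for bit in program]
    best, score, log = evolve([0] * length, mutate, fitness=sum)
    print(score, "of", length, "after", len(log), "variants tried")
    ```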

    • http://overcomingbias.com RobinHanson

      We are already in a world where people can hack, and computers have high connectivity. So by your argument we should already have one power ruling the world.

      • lump1

        With hacking it’s still a multipolar world because humans are still undisputed hacking champs. Go was multipolar until about a year ago for much the same reason. Imagine if we had a society that settles its economically and socially significant disputes through Go showdowns. Google would now be the undisputed Leviathan.

        Obviously we could change the custom and block this, but we can’t take computers offline. The hacker counterpart of Alpha Go could consolidate a huge first mover advantage by infecting everything before a worthy competitor arises and then sabotaging the rise of such a competitor. We would eventually mount an organized resistance, but compromised and weakened by the initial attack, it’s conceivable that the AI could mutate fast enough to forever stay a step ahead. Alternately, we and the AI might enter a truce where we verifiably cease all attempts to disarm it in exchange for assurances that it won’t mess with powergrids, railroad junctions and autopilots.

      • http://overcomingbias.com RobinHanson

        Are you suggesting that anytime a computer is the best in the world at some task, one computer will be far better than all other computers at that task, so far better that if relative control of the world depended on that task ability, one computer could control the world?

      • Sortweiss

        Seems to me like his argument is that no, it’s not a universal truth that One Computer will always Rule Them All at any given $task, but that this IS likely in the case of winner-take-all contests like any sort of direct “combat” (here hacking); the Go example maybe just shows that “nah Go has always been multipolar even though we have automated players already, so don’t worry” was an easy argument to make in our hypothetical GoWorld *until* suddenly AlphaGo.

      • lump1

        I was just writing something similar when I saw Sortweiss beat me to it. That is what I was saying. I will add that what’s special about the best AI in a virtual combat task is that it can tirelessly attack both broadly and deeply. Hackers typing on a keyboard – even the best ones – lack that kind of parallelism and stamina. And if it is a contest of AI v AI, the one with a small advantage can swamp the other through the rapid iteration of clashes. Here I’m reminded of Libratus, who isn’t much better at poker than the best humans, but when you sum over the results of thousands of hands, it effectively delivers a spanking.
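
        To see how fast a small per-hand edge compounds, here’s a toy Monte Carlo of my own (the 52% per-hand edge is an arbitrary stand-in, and this has nothing to do with how Libratus actually plays):

        ```python
        import random

        def stronger_player_ahead(edge=0.52, hands=10_000):
            """True if the player with the small per-hand edge ends the match ahead."""
            score = sum(1 if random.random() < edge else -1 for _ in range(hands))
            return score > 0

        trials = 1_000
        wins = sum(stronger_player_ahead() for _ in range(trials))
        print(f"ahead after 10,000 hands in {wins} of {trials} matches")
        # A 52% per-hand edge makes being ahead after 10,000 hands a near-certainty,
        # even though any single hand is close to a coin flip.
        ```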

      • http://overcomingbias.com RobinHanson

        So now it’s all contests of any sort between entities that don’t tire that lead to one side completely winning? That proves way too much. Nations, cities, and firms are such entities.

      • John Smith

        But wouldn’t the first AGI have a significant first-mover advantage, since (nearly) by definition it’s the first that can recursively self-improve?

      • Dave Lindbergh

        @John Smith: Maybe I’m still misunderstanding Robin’s position, but if I’m not:

        1 – Robin thinks we are very far from building an AGI.

        He thinks recent progress in AI is limited to narrow special-purpose systems that have been extensively trained for their particular tasks (AlphaGo, image classifiers, etc.), and that we’re not yet close to building general-purpose AIs.

        And he thinks that if/when we eventually get close to building an AGI, progress will not come suddenly with a breakthru but gradually, in fits and starts, with strength in some areas matched by severe weakness in other areas. This will allow multiple teams to catch up and compete, and permit people to learn how to control goals over an extended period of time. Therefore this won’t lead to a runaway AI explosion and a singleton.

        Which, if correct, means the worry about foom is, at best, very premature.

        2 – He doesn’t buy the recursive self-improvement argument.

        This point I don’t fully understand.

        I find it very plausible that eventually (even if hundreds of years from now), an AI system will be able to, minimally, improve its own code (if only to make it faster).

        And that will lead to a series of ever-more-powerful software versions. Whether these are “smarter” in an IQ or G sense or merely faster isn’t important. (Per the arguments in Age of Em, a sufficiently fast ordinary intelligence is functionally equivalent to a “smarter” intelligence).

        Eventually such an AI will be capable of earning money, and so (if allowed to by one of the many teams developing them) funding its own further expansion – hardware and resources as needed.

        Recurse.

        Exactly which part of this argument Robin disagrees with (if any), I don’t understand.

        (I hope Robin will correct my misunderstandings.)

      • Joe

        Re: point 2, I can give you my interpretation of Robin’s argument — whether he would agree with that interpretation is another matter.

        Borrowing I. J. Good’s example, we can imagine somebody at the dawn of the industrial era defining an “ultraproductive factory” as “a factory that can produce any economic good”. Since ultraproductive factories are themselves entirely made from economic goods, an ultraproductive factory can be used to produce another ultraproductive factory, which together can produce two more, then those four can produce eight, then sixteen, and so on. Clearly, whichever industrialist builds the first ultraproductive factory will quickly explode in capability to take over the world.

        What is the problem with this argument? After all, given the definition of an ultraproductive factory, it’s tautologically correct. (And exponential growth in productivity is a real phenomenon!) The main problem, surely, is in assuming the relevance of the definition given — of taking ‘fully general factories’ as even remotely resembling how things are produced in an industrial economy.

        It’s probably theoretically possible to create a factory to fabricate everything by yourself. Every time you need an intermediate product, you either create it from raw materials, or you determine what products you need in order to create it, and create those by recursively following the same process. The main impediment is that this is totally utterly uncompetitive with an industrial economy in which all the factories (production mechanisms more generally) are specialised to making a small subset of everything rather than everything.

        In a more realistic paradigm, yes something is recursively self-improving, but that something is “the world as a whole” and not any individual entity.
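
        To make the arithmetic explicit, here’s a toy model (the doubling times and starting sizes are purely illustrative assumptions): both the generalist factory and the specialised economy grow exponentially, so what matters is whose doubling time is shorter, and doing everything in-house is exactly what makes the generalist’s doubling time long.

        ```python
        def years_to_overtake(factory_output=1.0, economy_output=1e9,
                              factory_doubling_years=40.0, economy_doubling_years=20.0,
                              horizon=1000):
            """Return the year (if any) when the self-replicating generalist factory
            out-produces the specialised economy, under toy exponential growth."""
            for year in range(1, horizon + 1):
                factory_output *= 2 ** (1 / factory_doubling_years)
                economy_output *= 2 ** (1 / economy_doubling_years)
                if factory_output > economy_output:
                    return year
            return None

        # If doing everything itself makes the factory's doubling time longer than the
        # economy's (40 vs. 20 years here), it never gains ground:
        print(years_to_overtake())                             # None
        # Only with a shorter doubling time does it eventually win, and even then
        # not quickly from a tiny starting base:
        print(years_to_overtake(factory_doubling_years=10.0))  # 598
        ```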

        Returning to intelligence rather than factories, whether the ultra-X model is any more relevant here depends on whether intelligence is an enormously complex system or a fairly simple process. For example, you’ve perhaps read Scott Alexander’s flurry of recent posts on Predictive Processing, a model of the human brain which Scott seems to believe is both correct and all-encompassing, in that it provides a simple general implementation of intelligence that requires almost no context-specific detail or optimisations. I would say that if he’s right, recursive self-improvement quite probably can be localised to a single AI project, and all the FOOM-related fears are totally correct.

        For the Hansonian model of AI progress to be right, this and all other attempts at a simple fully general learning machine must be mistaken: AI must turn out to be an amalgamation of many systems and subsystems and specialisations and little improvements and optimisations, not a single algorithm that can be described in full detail in just a few lines of code.

      • Joe

        I think turn-based strategy games, in which the resources available to each side are perfectly equalised, all work must be funnelled through a single move per turn, and the outcome of one game has no bearing on one’s ability to compete in further games, are an especially bad analogy of real-world combat scenarios. And combat scenarios are themselves unusually winner-takes-all compared to regular economic competition.

  • http://akarlin.com/ akarlin

    Sounds very derivative, basically Singularity is Near + Superintelligence. (I say that as someone who enjoyed Our Mathematical Universe).

    And this whole Life 1.0/2.0/3.0 thing looks ridiculous (more ridiculous than usual for this overused trope) seeing as it is just a repackaging of Chardin’s and Vernadsky’s century-old ideas.

    • Max Tegmark

      I’m very curious about whether you’d still find the book to contain little new beyond Singularity is Near + Superintelligence if you read the book rather than merely Robin’s review. In fact, I’m so confident that you’ll change your mind on this point if you read the book, that I’ll offer you a full refund via PayPal if you ask me for it! 🙂

      • http://overcomingbias.com RobinHanson

        I agree with Max that you’ll find many other things in his book that aren’t in either of those other two books.

  • https://llordoftherealm.wordpress.com/ Lord

    As long as our sim selves dwell in a sim world, we shouldn’t worry too much about them having a sim nuclear war along the way as long as we can eavesdrop on their development and bring back whatever proves useful, if we can distinguish those from what is not. It would require quite a lot of tweaking to keep the worlds from diverging due to knowledge limitations but it would provide tests of our knowledge.

  • Michael Foody

    First of all, I don’t know what I’m talking about…but

    Even existing AIs are better than human beings at a wide variety of tasks. I can imagine one of these specific-task AIs being sufficiently good at its task as to be capable of solving a wide variety of important problems through that specific task. What’s more, not every problem is strategically relevant to maintaining AI hegemony or advancing the AI’s interests as it defines them. A sufficiently talented propaganda AI could probably persuade enough people to make conditions for a competitive AI incredibly unlikely, for example. Human-level intelligences have persuaded millions of people to kill and die for all manner of causes counter to their local interests across human history; is it so hard to imagine an AI that’s a better propagandist/mass persuader than any human achieving a decisive strategic advantage that precludes competing AIs?

    • http://overcomingbias.com RobinHanson

      I don’t have a problem with believing that a very persuasive AI could gain many advantages from that ability. What I’m doubting is what process it takes to create such an AI.

      • Michael Foody

        Hmm, I think it would be pretty easy; I think you can imagine getting very close even with present-day technology. Imagine Facebook has a problem with people using it to transmit repugnant ideology, so they decide to create an algorithm that privileges preferred ideology. They hire a team of sociologists to classify a subsample of Facebook posts on the extent to which they adhere to the preferred ideology, and use the sort of machine learning that’s currently used in legal document reviews to extrapolate their logic out to the universe of posts, promoting or demoting posts based on how well they adhere to the preferred ideology. People seeking the social approval that comes with a successful post seek to cater to the tastes of the algorithm. People who have internalized the preferred ideology become more influential. Others seeking to fit in with what they perceive as the norms of their peers also alter their behavior. Changes would even affect non-users as they adapt to altered social norms. On its own I think that this would be very powerful and prone to unintended consequences, and though its inputs are based on human values and humans do a lot of the coprocessing through engaging with the AI, the algorithmic weighting itself is enough of a strange loop to be reasonably defined as intelligence.

        Even if I’m skeptical of defining it as intelligence, I can easily imagine this or a similar system expanding in scope and ambition using available tools (A/B testing arguments/advertisements at scale, observing online behavior to target arguments/advertisements at types, creating influence maps to target more influential or powerful people), or even being given space to develop novel tools for influence or for evaluating success, such that it evolves into something that we’d obviously call a superintelligence.
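
        A toy version of that first step might look like the standard supervised text-classification recipe; everything below (the tiny hand-labeled sample, the TF-IDF-plus-logistic-regression choice) is made up for illustration:

        ```python
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression

        # Hypothetical hand-labeled subsample: 1 = adheres to the preferred ideology.
        labeled_posts = ["let's all be kinder to each other",
                         "my opponents deserve only contempt",
                         "we can find common ground here",
                         "those people are beneath contempt"]
        labels = [1, 0, 1, 0]

        vectorizer = TfidfVectorizer()
        classifier = LogisticRegression().fit(vectorizer.fit_transform(labeled_posts), labels)

        # Extrapolate to the wider universe of posts; promote high scores, demote low ones.
        new_posts = ["kindness costs nothing", "contempt is all they deserve"]
        scores = classifier.predict_proba(vectorizer.transform(new_posts))[:, 1]
        for post, score in sorted(zip(new_posts, scores), key=lambda pair: -pair[1]):
            print(f"{score:.2f}  {post}")
        ```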

  • Will Sawin

    When you say

    > Students even find it hard to transfer from a lecture to a slightly different spoon-fed exam question on the same subject a month later, with strong incentives

    The questions are only spoon-fed from the perspective of faculty etc. who understand the subject. Doesn’t that mean that the faculty are exhibiting stronger transfer learning, relative to which the students’ transfer looks weak in comparison? For instance, in math, if you put a professor in an undergraduate class in a subject they haven’t studied, they would likely do well with little effort, and in particular would have much less trouble with transfer learning.

    So either there is a threshold of knowledge after which transfer learning starts getting easier, or some kind of natural talent that makes transfer easier.

    • http://overcomingbias.com RobinHanson

      Follow the link I gave, and read the literature. I didn’t say there is no transfer of learning, only that it is much weaker than many people think.

  • Max Tegmark

    (Apologies Robin for accidentally posting as a reply to lump1’s comment instead of to you; if you move your reply to this comment, I’ll reply here instead. Happy travels! 🙂)

    Thanks Robin for taking the time to read my book and offer this detailed critique! Here are some comments (with your remarks in quotes).

    “Tegmark apparently believes (like Eliezer Yudkowsky and Nick Bostrom) that there is some single yet-to-be-discovered simple general software architecture or algorithm which enables a computer to quickly get vastly better at improving its abstract “intelligence” (or “betterness”), without directly improving its ability to do most particular tasks.”

    First of all, my job as a scientist isn’t to believe, but to make assessments based on evidence. I certainly don’t *believe* that there’ll be a FOOM, or any other specific outcome. Rather, I structured the book as a survey including a broad spectrum of possible outcomes that have been considered in the literature, including both FOOM scenarios and ones inspired by your thought-provoking book “Age of Em”. Is your critique that I give too much space to FOOM scenarios, or are you arguing that they’re so extremely implausible that I shouldn’t have mentioned them at all?

    The rest of your post sets up a series of strawman positions to shoot down and, for reasons that I don’t understand, attributes these positions to me. For example:

    * I’m not assuming that there exists some single abstract task-independent “intelligence” (or “betterness”); indeed, I argue against such a one-dimensional notion of intelligence in Chapter 1, so I’m unclear about what in my book you’re referring to here.

    * “And that is why we probably face extinction unless we can quickly find rigorous general solutions to ethics, computer security, etc.”

    Nowhere in the book do I unqualifiedly claim that we “probably face extinction” unless we can find such solutions. I explore a broad spectrum of thought experiments, from hopeful to gloomy, and go to great lengths not to jump to conclusions about what will or should happen.

    Regarding lethal autonomous weapons, I find it interesting that you disagree with most AI researchers I know, but I respect your point of view. The way I see it, the central argument is an economic one: if we lower the price of anonymous assassination to near zero, there’ll be a lot more of it.

    * “Tegmark seems here to suggest that evolution no longer applies to humans, as brains can overrule genes”

    I suggest no such thing, merely that cultural evolution is currently faster than biological evolution.

    * “Tegmark suggests that we should be unhappy with, and not accept, having the same relation to our descendants as our ancestors have had with us”

    I suggest no such thing. Descendants is simply one of many scenarios I describe, including possible pros and cons. I deliberately don’t take sides, but invite the reader to think about what she prefers.

    “The book says little about redesigning hardware.”

    What about Chapters 4, 5 & 6?

    “it does this without having access to vast amounts of hardware or to large datasets of previous performance on such tasks.”

    In the story, it *did* have access to vast amounts of hardware and to a large fraction of the internet, preloaded.

    I hope you find these clarifications helpful!

    • http://overcomingbias.com RobinHanson

      (This comment moved from needlessly-obscure place.)

      Max. I’m sorry if you think I misrepresented you, and thank you for responding here to clarify.

      I meant “believe” in a revealed preference sense of best explanation of your statements and emphasis. You start the book with a long detailed foom scenario, and you discuss unipolar scenarios much more than you do multipolar ones.

      Also, in a multipolar world, each superintelligence has only a small fraction of the world’s power. In that context there is little global risk from any one of them getting out of control of its creator or owner. The rest of the world limits its damage. It is in a unipolar world, where one superintelligence controls all, that a failure to keep control or align values of a particular superintelligence becomes an enormous risk. As a result, the following quotes of yours make a lot more sense coming from a person who sees foom as more likely than not:

      “even a single accident could be devastating enough to outweigh all the benefits.”

      “both this ethical problem and the goal-alignment problem are crucial ones that need to be solved before any superintelligence is developed.”

      “If we cede control to a superintelligence before answering these questions rigorously, the answer it comes up with is unlikely to involve us.”

      That last quote is also a basis for my “probably face extinction” attribution. (I just left on a trip and left my copy of your book home. If you ask I’ll look for more such quotes when I return. I’ll also check re hardware in the initial story)

      I don’t object to your considering the foom scenario, but you do seem to think it quite likely, and more likely than I do, and I wrote this post to try to engage your supporting arguments.

      You object to my saying that you think that “there exists some single abstract task-independent `intelligence’”. But your initial fictional scenario says:

      “Although its cognitive abilities still lagged far behind those of humans in many areas, for example social skills, [they] had pushed hard to make it extraordinary at one particular task: programming AI Systems. .. if they could get this recursive self-improvement going, the machine would soon get smart enough that it could also teach itself all other human skills that would be useful.”

      That “one particular” ability sure sounds to me like a “single abstract task-independent `intelligence’” enabling high performance on all tasks.

      Having access to a large fraction of the internet isn’t the same as having lots of data on MTurk task performance.

      Your saying that “The ultimate authority is now our feelings, not our genes.” was a basis for my saying you “seem to suggest that evolution no longer applies to humans, as brains can overrule genes.” If genetic evolution kills off uncooperative feelings, how do feelings remain an ultimate authority relative to genes?

      I read your tone in your descendants section as dissatisfied with the risks of that scenario; I’ll leave it to other readers to say how they read the tone.

      • arch1

        Robin re: “As a result, the following quotes of yours make a lot more sense coming from a person who sees foom as more likely than not”:
        The 3 quotes which you then include don’t imply that the speaker thinks p(foom) > 0.5. Global existential risks of even low probability can reasonably be cause for great concern.

      • Paul Christiano

        > Also, in a multipolar world, each superinteligence has only a small fraction of the world’s power. In that context there is little global risk from any one of them getting out of control of its creator or owner.

        This isn’t a meaningful objection to concerns about alignment. You repeatedly say that the only reason to be concerned about alignment now is a fast local takeoff, but haven’t argued for it much. A 1-year distributed transition to machine intelligence doesn’t make the problem go away; it means we have ~ a year extra to solve it, but it’s not the case that all problems can be easily resolved in a year.

      • Paul Christiano

        For example, note that Eliezer Yudkowsky and Nick Bostrom (and I expect also Max Tegmark) think that alignment problems are *more* likely to be serious in multipolar scenarios, not less serious.

      • http://overcomingbias.com RobinHanson

        I’ll keep an open mind when I read something on that. In this post I’m reacting to reading Tegmark’s book, which doesn’t discuss that.

      • Joe

        One distinction is that in a singleton scenario, what the universe contains is determined by what the singleton power wants — if it wants a universe full of paperclips, it gets a universe full of paperclips — and so we need to make sure the thing it wants is something good. In a multipolar world, what the universe contains is much more constrained: it’s lots and lots of life. As much life as can fit. Now it might still be the case that the life isn’t sentient, or even if it is that it isn’t happy. But since ‘is alive’ is almost certainly a prerequisite for a piece of matter to have moral value, multipolarity precludes an enormous fraction of the bad outcomes that would be possible with a singleton scenario.

        If you additionally believe that sentience is an unavoidable feature of intelligent life, and that sentient life in a competitive future world is likely to be positive-utility on net, then a multipolar outcome looks pretty sunny. You probably still can’t force a multipolar outcome if singletons tend to form anyway, but you can be terrified that a misguided attempt at forming a singleton will succeed well enough to prevent a vast flourishing future civilisation from getting to exist.

      • http://overcomingbias.com RobinHanson

        I didn’t claim that foom is the only reason to be concerned about alignment in general. But the quotes I give suggest that it is in fact Tegmark’s main reason for concern, and the poll I posted on before suggests that this is also the main reason that most are concerned. The kinds of solutions that Tegmark points us to are also the kinds that make the most sense in a foom scenario.

      • Wei Dai

        I don’t think it makes sense to talk about the “main reason” to be concerned about alignment. Suppose I think that FOOM is true with .6 probability, and conditional on FOOM being false, property rights will collapse with probability .7, and conditional on property rights holding up, with probability .7 humans will end up with a very small piece of the universe and the rest isn’t very valuable. (These numbers seem roughly consistent with what Tegmark wrote. He talked about what you call “collapse” and “value drift” in the “Libertarian Utopia” and “Descendants” sections respectively.) I don’t know what you’d consider my “main reason” for concern to be in this case, but whatever it is, if you told me that it’s definitely false and I believed you, I’m still not very reassured about the need for AI alignment. If you want to reassure people, it seems like you need to address all of the disjunctive reasons for concern.
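
        Spelling out the arithmetic with those illustrative numbers (my own worked version of the disjunction, treating each branch as a separate reason for concern):

        ```python
        p_foom = 0.6                # FOOM happens
        p_collapse_no_foom = 0.7    # property rights collapse, given no FOOM
        p_small_share_rest = 0.7    # humans keep only a tiny, low-value share, given neither

        p_any_concern = (p_foom
                         + (1 - p_foom) * p_collapse_no_foom
                         + (1 - p_foom) * (1 - p_collapse_no_foom) * p_small_share_rest)
        print(round(p_any_concern, 3))  # 0.964

        # Even conditioning on FOOM being definitely false, the other concerns remain:
        p_no_foom = p_collapse_no_foom + (1 - p_collapse_no_foom) * p_small_share_rest
        print(round(p_no_foom, 2))      # 0.91
        ```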

      • http://overcomingbias.com RobinHanson

        I don’t have the book at the moment to review the Libertarian Utopia section, and the Descendants section seems mostly about value deception, not drift. But even if both property loss and value drift had been mentioned somewhere in the book, it seems crazy to describe the book as giving equal emphasis to those three considerations.

      • Wei Dai

        I guess that makes sense if you’re trying to criticize how the book was written, as opposed to trying to allay the author’s actual concerns (and those of people like him). In other words, he may have just written more about foom because it’s more fun to write about or he could think of more things to say about it or he expected the other concerns to be more obvious to his readers, not because he believes it’s the only important concern.

      • http://juridicalcoherence.blogspot.com/ Stephen Diamond

        Given that Tegmark expressly rejects the estimation of likelihoods (or am I interpreting him correctly?), it particularly lacks sense to read off likelihood estimates based on pages devoted to specific subjects.

      • m

        Robin Hanson: “in a multipolar world, each superinteligence has only a small fraction of the world’s power. In that context there is little global risk from any one of them getting out of control of its creator or owner. The rest of the world limits its damage.”

        Is that a *definition* of a multipolar world? If not then what reason is there to think that a scenario where multiple superhuman AI grow in power in lockstep will not lead to a total destruction outcome? Is that based on belief 1 and/or 2 below or something else?

        (1) for any total destruction method that one superhuman AI among others can design: other superhuman AIs will prevent the use of the method from having a total destruction outcome.

        (2) for any method a superhuman AI among others can design that would destroy all competing AIs (but not itself): other superhuman AIs will have the same method *and* will reliably detect and reciprocate use of the method (MAD scenario) *and* no superhuman AI will choose total destruction via MAD.

      • http://overcomingbias.com RobinHanson

        As I’m pretty sure you can guess, I assume the usual case in history, where one small part of the world cannot destroy the whole world.

      • m

        That is a huge assumption. Seems the assumption should be made explicit when describing the multipolar scenario then.

        Do you know if your assumption is widely shared in AI risk circles?

        I myself have low credence in that assumption, mostly because of a general asymmetry: there are more ways to harm/destroy complex systems like human organisms than there are ways to prevent harm/destruction to them.

      • http://overcomingbias.com RobinHanson

        The assumption is common enough that I’d expect an author whose argument relied on disagreeing with it to mention that fact.

    • http://juridicalcoherence.blogspot.com/ Stephen Diamond

      “First of all, my job as a scientist isn’t to believe, but to make assessments based on evidence. I certainly don’t *believe* that there’ll be a FOOM, or any other specific outcome.”

      A very basic disagreement with Robin, this seems. (At least if “belief,” per the Bayesians, comes in degrees that should correspond with likelihoods.)

  • Sondre R.

    Regarding the generalizability of intelligence.

    David Deutsch’s theory of people as “universal explainers” describes part of our intelligence (the interesting part) as being universal in the sense of being able to explain anything.

    This does make sense to me. Yes, learning a specific piece of knowledge in geography doesn’t translate to knowing chemistry. But it is true that the same intelligence that can discover, understand and explain geography can also explain chemistry. There is clearly some generalizability of human intelligence.

    And it doesn’t have to be any harder than this, if I were to conceptualize a simple algorithm structure.

    Problem: Any problem
    What we have: A processor, memory and some mechanism of material transformation

    Step 1: Creativity
    1. Random generation within constraints (relaxation: goals, prior beliefs)
    2. Selective retaining of randomly generated ideas based on algorithms for basic sense-making

    Step 2: Critical thinking
    3. Critique new ideas based on prior knowledge, remove those that don’t hold up
    4. Get critique from other external sources of knowledge

    Step 3: Trial and error
    5. Test qualified idea, log whether it worked to update knowledge in a Bayesian way
    6. Repeat until it works and new knowledge is created

    To me this sort of method seems like what the human mind must be running. When I encounter something I haven’t seen before, like the other day with a door without a handle, I seem to be running through something like this iteratively until I find a solution.

    This seems to be the same mechanism I also run for figuring out a software problem, or an economics problem, or a construction problem, or a resource collection problem. The only thing I need is more knowledge in order to solve any problem. And the only thing I need to be able to generate solutions to any problem is an algorithm like this, a processor, memory and some sort of method for material transformation.
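
    For concreteness, here’s roughly how I picture that loop in code. It’s a bare-bones sketch: the generator, sanity check, critics, and test below are placeholder functions, and the toy problem at the end just stands in for something like the door-without-a-handle case:

    ```python
    import random

    def create_knowledge(generate, sanity_check, critics, test, max_iterations=10_000):
        """Toy version of the three steps above: random generation within constraints,
        critique against prior and external knowledge, then trial and error."""
        knowledge = []                           # grows as test results come back
        for _ in range(max_iterations):
            # Step 1: Creativity -- generate at random, keep only ideas that make basic sense
            idea = generate()
            if not sanity_check(idea):
                continue
            # Step 2: Critical thinking -- drop ideas that any critic rejects
            if any(not critic(idea, knowledge) for critic in critics):
                continue
            # Step 3: Trial and error -- test, log the outcome (a crude stand-in for the
            # Bayesian update in step 5), and repeat until something works
            worked = test(idea)
            knowledge.append((idea, worked))
            if worked:
                return idea, knowledge
        return None, knowledge

    # Toy problem: the "door without a handle" becomes guessing a hidden number.
    secret = 42
    idea, log = create_knowledge(
        generate=lambda: random.randint(0, 100),
        sanity_check=lambda i: 0 <= i <= 100,
        critics=[lambda i, known: (i, False) not in known],  # don't retry known failures
        test=lambda i: i == secret,
    )
    print(idea, "found after", len(log), "tests")
    ```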

    • http://overcomingbias.com RobinHanson

      If you already know and can describe a general intelligence algorithm, it can’t be what is discovered by one team that allows its computer to grow much, much faster than all the others.

      • Sondre R.

        It surely can, since it is universal.

        It just depends on the capacity of computation and memory, which is also universal.

        What limits human creativity and intelligence from growing our knowledge-stack faster is capacity. 7 billion people can create a lot more knowledge than 1 person.

        In the same way, whatever computing capacity we have when someone actually builds the sort of algorithm I described will determine whether it is a big boost.

        If it happened today it would merely be a whimper, as even the strongest supercomputers aren’t as fast as 1 human brain. But if Moore’s law continues and we get to 2040s and a personal computer has equivalent capacity to all human beings combined, then such a knowledge-creating algorithm would mean so much more knowledge-creating that it in theory could get a head-start if it kept its findings a secret.

  • cwcwcwcwcw

    What I never see when I read things about the singularity is who programmed the computer to *want* to improve itself. All computers now do ONLY what you tell them to do. If you do not tell the computer to reprogram itself it will not do it.

    Having said that, I am certain that someone is right now trying to insert the desire to improve functioning into software and that in the end the world is fucked.