Search Results for: foom

Why Not Wait On AI Risk?

Years ago, when the AI risk conversation was just starting, I was a relative skeptic, but I was part of the conversation. Since then, the conversation has become much larger, but I no longer seem to be part of it; it has been years since others in this conversation engaged me on it.

Clearly most who write on this do not sit close to my views, though I may sit closer to most who’ve considered getting into this topic, but instead found better things to do. (Far more resources are available to support advocates than skeptics.) So yes, I may be missing something that they all get. Furthermore, I’ve admittedly only read a small fraction of the huge amount written in this area since then. Even so, I feel I should periodically try again to explain my reasoning, and ask others to please help show me what I’m missing.

The future AI scenario that treats “AI” most like prior wide tech categories (e.g., “energy” or “transport”) goes as follows. AI systems are available from many competing suppliers at similar prices, and their similar abilities increase gradually over time. Abilities don’t increase faster than customers can usefully apply them. Problems are mostly dealt with as they appear, instead of anticipated far in advance. Such systems slowly displace humans on specific tasks, and are on average roughly as task specialized as humans are now. AI firms distinguish themselves via the different tasks their systems do.

The places and groups who adopt such systems first are those flexible and rich enough to afford them, and having other complementary capital. Those who invest in AI capital on average gain from their investments. Those who invested in displaced capital may lose, though over the last two decades workers at more automated jobs have seen no average effect on their wages or numbers. AI today makes only a rather minor contribution to our economy (<5%), and it has quite a long way to go before it can make a large contribution. We today have only vague ideas of what AIs that made a much larger contribution would look like.

Today most of the ways that humans help and harm each other are via our relations. Such as: customer-supplier, employer-employee, citizen-politician, defendant-plaintiff, friend-friend, parent-child, lover-lover, victim-criminal-police-prosecutor-judge, army-army, slave-owner, and competitors. So as AIs replace humans in these roles, the main ways that AIs help and hurt humans are likely to also be via these roles.

Our usual story is that such hurt is limited by competition. For example, each army is limited by all the other armies that might oppose it. And your employer and landlord are limited in exploiting you by your option to switch to other employers and landlords. So unless AI makes such competition much less effective at limiting harms, it is hard to see how AI makes role-mediated harms worse. Sure, smart AIs might be smarter than humans, but they will have other AI competitors, and humans will have AI advisors. Humans don’t seem much worse off over the last few centuries, even though firms and governments far more intelligent than individual humans have taken over many roles.

AI risk folks are especially concerned with losing control over AIs. But consider, for example, an AI hired by a taxi firm to do its scheduling. If such an AI stopped scheduling passengers to be picked up where they waited and delivered to where they wanted to go, the firm would notice quickly, and could then fire and replace this AI. But what if an AI who ran such a firm became unresponsive to its investors? Or if an AI who ran an army became unresponsive to its oversight government? In both cases, while such investors or governments might be able to cut off some outside supplies of resources, the AI might do substantial damage before such cutoffs bled it dry.

However, our world today is well acquainted with the prospect of “coups” wherein firm or army management becomes unresponsive to its relevant owners. Not only do our usual methods usually seem sufficient to the task, we don’t see much of an externality re these problems. You try to keep your firm under control, and I try to keep mine, but I’m not especially threatened by your losing control of yours. We care a bit more about others losing control of their cars, planes, or nuclear power plants, as those might hurt bystanders. But we care much less once such others show us sufficient liability, and liability insurance, to cover our losses in these cases.

I don’t see why I should be much more worried about your losing control of your firm, or army, to an AI than to a human or group of humans. And liability insurance also seems a sufficient answer to your possibly losing control of an AI driving your car or plane. Furthermore, I don’t see why it’s worth putting much effort into planning how to control AIs far in advance of seeing much detail about how AIs actually do concrete tasks where loss of control matters. Knowing such detail has usually been the key to controlling past systems, and money invested now, instead of spent on analysis now, gives us far more money to spend on analysis later.

All of the above has been based on assuming that AI will be similar to past techs in how it diffuses and advances. Some say that AI might be different, just because, hey, anything might be different. Others, like my ex-co-blogger Eliezer Yudkowsky, and Nick Bostrom in his book Superintelligence, say more about why they expect advances at the scope of AGI to be far more lumpy than we’ve seen for most techs.

Yudkowsky paints a “foom” picture of a world full of familiar weak stupid slowly improving computers, until suddenly and unexpectedly a single super-smart un-controlled AGI with very powerful general abilities appears and is able to decisively overwhelm all other powers on Earth. Alternatively, he claims (quite implausibly I think) that all AGIs naturally coordinate to merge into a single system to defeat competition-based checks and balances.

These folks seem to envision a few key discrete breakthrough insights that allow the first team that finds them to suddenly catapult their AI into abilities far beyond all other then-current systems. These would be big breakthroughs relative to the broad category of “mental tasks”, and thus even bigger than if we found big breakthroughs relative to the less broad tech categories of “energy”, “transport”, or “shelter”. Yes of course change is often lumpy if we look at small tech scopes, but lumpy local changes aggregate into smoother change over wider scopes.

As I’ve previously explained at length, that seems to me to postulate a quite unusual lumpiness relative to the history we’ve seen for innovation in general, and more particularly for tools, computers, AI, and even machine learning. And this seems to postulate much more of a lumpy conceptual essence to “betterness” than I find plausible. Recent machine learning systems today seem relatively close to each other in their abilities, are gradually improving, and none seem remotely inclined to mount a coup.

I don’t mind groups with small relative budgets exploring scenarios with proportionally small chances, but I lament such a large fraction of those willing to take the long term future seriously using this as their default AI scenario. And while I get why people like Yudkowsky focus on scenarios in which they fervently believe, I am honestly puzzled why so many AI risk experts seem to repudiate his extreme scenarios, and yet still see AI risk as a terribly important project to pursue right now. If AI isn’t unusually lumpy, then why are early efforts at AI control design especially valuable?

So far I’ve mentioned two widely expressed AI concerns. First, AIs may hurt human workers by displacing them, and second, AIs may start coups wherein they wrest control of some resources from their owners. A third widely expressed concern is that the world today may be stable, and contain value, only due to somewhat random and fragile configurations of culture, habits, beliefs, attitudes, institutions, values, etc. If so, our world may break if this stuff drifts out of a safe and stable range for such configurations. AI might be or facilitate such a change, and by helping to accelerate change, AI might accelerate the rate of configuration drift.

Similar concerns have often been expressed about allowing too many foreigners to immigrate into a society, or allowing the next youthful generation too much freedom to question and change inherited traditions. Or allowing many other specific transformative techs, like genetic engineering, fusion energy, social media, or space. Or other big social changes, like gay marriage.

Many have deep and reasonable fears regarding big long-term changes. And some seek to design AI so that it won’t allow excessive change. But this issue seems to me much more about change in general than about AI in particular. People focused on these concerns should be looking to stop or greatly limit and slow change in general, and not focus so much on AI. Big change can also happen without AI.

So what am I missing? Why would AI advances be so vastly more lumpy than prior tech advances as to justify very early control efforts? Or if not, why are AI risk efforts a priority now?

Will Design Escape Selection?

In the past, many people and orgs have had plans and designs, many of which made noticeable differences to the details of history. But regarding most of history, our best explanations of overall trends have been in terms of competition and selection, including between organisms, species, cultures, nations, empires, towns, firms, and political factions.

However, when it comes to the future, especially hopeful futures, people tend to think more in terms of design than selection. For example, H.G. Wells was willing to rely on selection to predict a future dystopia in The Time Machine, but his utopia in Things to Come was the result of conscious planning replacing prior destructive competition. Hopeful futurists have long painted pictures of shiny designed techs, planned cities, and wise cooperative institutions of charity and governance.

Today, competition and selection continue on in many forms, including political competition for the control of governance institutions. But instead of seeing governance, law, and regulation as driven largely by competition between units of governance (e.g., parties, cities, or nations), many now prefer to see them in design terms: good people coordinating to choose how we want to live together, and to limit competition in many ways. They see competition between units of governance as largely passé, and getting more so as we establish stronger global communities and governance.

My future analysis efforts have relied mostly on competition and selection. Such as in Age of Em, post-em AI, Burning the Cosmic Commons, and Grabby Aliens. And in my predictions of long views and abstract values. Their competitive elements, and what that competition produces, are often described by others as dystopian. And the most common long-term futurist vision I come across these days is of a “singleton” artificial general intelligence (A.G.I.) for whom competition and selection become irrelevant. In that vision (of which I am skeptical), there is only one A.G.I., which has no internal conflicts, grows in power and wisdom via internal reflection and redesign, and then becomes all powerful and immortal, changing the universe to match its value vision.

Many recent historical trends (e.g., slavery, democracy, religion, fertility, leisure, war, travel, art, promiscuity) can be explained in terms of rising wealth inducing a reversion to forager values and attitudes. And I see these design-oriented attitudes toward governance and the future as part of this pro-forager trend. Foragers didn’t overtly compete with each other, but instead made important decisions by consensus, and largely by appeal to community-wide altruistic goals. The farming world forced humans to more embrace competition, and become more like our pre-human ancestors, but we were never that comfortable with it.

The designs that foragers created, however, were too small to reveal the key obstacle to this vision of civilization-wide collective design to over-rule competition: rot (see 1 2 3 4). Not only is it quite hard in practice to coordinate to overturn the natural outcomes of competition and selection, the sorts of complex structures that we are tempted to use to achieve that purpose consistently rot, and decay with time. If humanity succeeds in creating world governance strong enough to manage competition, those governance structures are likely to prevent interstellar colonization, as that strongly threatens their ability to prevent competition. And such structures would slowly rot over time, eventually dragging civilization down with them.

If competition and selection manage to continue, our descendants may become grabby aliens, and join the other gods at the end of time. In that case one of the biggest unanswered questions is: what will be the key units of future selection? How will those units manage to coordinate, to the extent that they do, while still avoiding the rotting of their coordination mechanisms? And how can we now best promote the rise of the best versions of such competing units?

Why Age of Em Will Happen

In some technology competitions, winners dominate strongly. For example, while gravel may cover a lot of roads if we count by surface area, if we weigh by vehicle miles traveled then asphalt strongly dominates as a road material. Also, while some buildings are cooled via fans and very thick walls, the vast majority of buildings in rich and hot places use air-conditioning. In addition, current versions of software systems also tend to dominate over older versions. (E.g., Windows 10 over Windows 8.)

However, in many other technology competitions, older technologies remain widely used over long periods. Cities were invented ten thousand years ago, yet today only about half of the population lives in them. Cars, trains, boats, and planes have taken over much transportation, yet we still do plenty of walking. Steel has replaced wood in many structures, yet wood is still widely used. Fur, wool, and cotton aren’t used as often as they once were, but they are still quite common as clothing materials. E-books are now quite popular, but paper book sales are still growing.

Whether or not an old tech still retains wide areas of substantial use depends on the average advantage of the new tech, relative to the variation of that advantage across the environments where these techs are used, and the variation within each tech category. All else equal, the wider the range of environments, and the more diverse is each tech category, the longer that old tech should remain in wide use.
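
As a toy illustration of this claim (my own sketch, not from the post; the numbers are made up), an old tech keeps a niche in every environment where the new tech’s advantage happens to come out negative, so what matters is the new tech’s average advantage relative to how much that advantage varies across environments:

```python
# Minimal toy simulation of old-tech retention, assuming the new tech's
# advantage over the old tech varies across environments.
import numpy as np

rng = np.random.default_rng(0)
n_envs = 100_000  # hypothetical number of distinct task environments

for mean_adv, spread in [(1.0, 0.5), (1.0, 1.0), (1.0, 2.0), (1.0, 4.0)]:
    advantage = rng.normal(mean_adv, spread, n_envs)  # new-tech advantage in each environment
    old_share = np.mean(advantage < 0)                # environments where the old tech still wins
    print(f"mean advantage {mean_adv}, spread {spread}: "
          f"old tech retained in {old_share:.1%} of environments")
```

Holding the average advantage fixed, the old tech’s retained share rises from roughly 2% to roughly 40% of environments as the spread grows, which is the sense in which wider and more diverse environments keep old techs in use longer.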

For example, compare the set of techs that start with the letter A (like asphalt) to the set that start with the letter G (like gravel). As these are relatively arbitrary sets that do not “cut nature at its joints”, there is wide diversity within each category, and each set is all applied to a wide range of environments. This makes it quite unlikely that one of these sets will strongly dominate the other.

Note that techs that tend to dominate strongly, like asphalt, air-conditioning, and new software versions, more often appear as a lumpy change, e.g., all at once, rather than via a slow accumulation of many changes. That is, they more often result from one or a few key innovations, and have some simple essential commonality. In contrast, techs that have more internal variety and structure tend more to result from the accumulation of more smaller innovations.

Now consider the competition between humans and computers for mental work. Today human brains earn more than half of world income, far more than the costs of computer hardware and software. But over time, artificial hardware and software have been improving, and slowly commanding larger fractions. Eventually this could become a majority. And a key question is then: how quickly might computers come to dominate overwhelmingly, doing virtually all mental work?

On the one hand, the ranges here are truly enormous. We are talking about all mental work, which covers a very wide range of environments. And not only do humans vary widely in abilities and inclinations, but computer systems seem to encompass an even wider range of designs and approaches. And many of these are quite complex systems. These facts together suggest that the older tech of human brains could last quite a long time (relative of course to relevant timescales) after computers came to do the majority of tasks (weighted by income), and that the change over that period could be relatively gradual.

For an analogy, consider the space of all possible non-mental work. While machines have surely been displacing humans for a long time in this area, we still do many important tasks “by hand”, and overall change has been pretty steady for a long time period. This change looked nothing like a single “general” machine taking over all the non-mental tasks all at once.

On the other hand, human minds are today stuck in old bio hardware that isn’t improving much, while artificial computer hardware has long been improving rapidly. Both these states, of hardware being stuck and improving fast, have been relatively uniform within each category and across environments. As a result, this hardware advantage might plausibly overwhelm software variety to make humans quickly lose most everywhere.

However, eventually brain emulations (i.e. “ems”) should be possible, after which artificial software would no longer have a hardware advantage over brain software; they would both have access to the same hardware. (As ems are an all-or-nothing tech that quite closely substitutes for humans and yet can have a huge hardware advantage, ems should displace most all humans over a short period.) At that point, the broad variety of mental task environments, and of approaches to both artificial and em software, suggests that ems may well stay competitive on many job tasks, and that this status might last a long time, with change being gradual.

Note also that as ems should soon become much cheaper than humans, the introduction of ems should initially cause a big reversion, wherein ems take back many of the mental job tasks that humans had recently lost to computers.

In January I posted a theoretical account that adds to this expectation. It explains why we should expect brain software to be a marvel of integration and abstraction, relative to the stronger reliance on modularity that we see in artificial software, a reliance that allows those systems to be smaller and faster built, but also causes them to rot faster. This account suggests that for a long time it would take unrealistically large investments for artificial software to learn to be as good as brain software on the tasks where brains excel.

A contrary view often expressed is that at some point someone will “invent” AGI (= Artificial General Intelligence). Not that society will eventually have broadly capable and thus general systems as a result of the world economy slowly collecting many specific tools and abilities over a long time. But that instead a particular research team somewhere will discover one or a few key insights that allow that team to quickly create a system that can do most all mental tasks much better than all the other systems, both human and artificial, in the world at that moment. This insight might quickly spread to other teams, or it might be hoarded to give this team great relative power.

Yes, under this sort of scenario it becomes more plausible that artificial software will either quickly displace humans on most all jobs, or do the same to ems if they exist at the time. But it is this scenario that I have repeatedly argued is pretty crazy. (Not impossible, but crazy enough that only a small minority should assume or explore it.) While the lumpiness of innovation that we’ve seen so far in computer science has been modest and not out of line with most other research fields, this crazy view postulates an enormously lumpy innovation, far out of line with anything we’ve seen in a long while. We have no good reason to believe that such a thing is at all likely.

If we presume that no one team will ever invent AGI, it becomes far more plausible that there will still be plenty of job tasks for ems to do, whenever ems show up. Even if working ems only collect 10% of world income soon after ems appear, the scenario I laid out in my book Age of Em is still pretty relevant. That scenario is actually pretty robust to such variations. As a result of thinking about these considerations, I’m now much more confident that the Age of Em will happen.

In Age of Em, I said:

Conditional on my key assumptions, I expect at least 30 percent of future situations to be usefully informed by my analysis. Unconditionally, I expect at least 5 percent.

I now estimate an unconditional 80% chance of it being a useful guide, and so will happily take bets based on a 50-50 chance estimate. My claim is something like:

Within the first D econ doublings after ems are as cheap as the median human worker, there will be a period where >X% of world income is paid for em work. And during that period Age of Em will be a useful guide to that world.

Note that this analysis suggests that while the arrival of ems might cause a relatively sudden and disruptive transition, the improvement of other artificial software would likely be more gradual. While overall rates of growth and change should increase as a larger fraction of the means of production comes to be made in factories, the risk of a sudden AI advance, relative to that overall rate of change, is low. Those concerned about risks caused by AI changes can more reasonably wait until we see clearer signs of problems.

Agency Failure AI Apocalypse?

Years ago my ex-co-blogger Eliezer Yudkowsky and I argued here on this blog about his AI risk fear that an AI so small, dumb, and weak that few had ever heard of it might, without warning, suddenly “foom”, i.e., innovate very fast, and take over the world after one weekend. I mostly argued that we have a huge literature on economic growth at odds with this. Historically, the vast majority of innovation has been small, incremental, and spread across many industries and locations. Yes, humans mostly displaced other pre-human species, as such species can’t share innovations well. But since then sharing and complementing of innovations has allowed most of the world to gain from even the biggest lumpiest innovations ever seen. Eliezer said a deep conceptual analysis allowed him to see that this time is different.

Since then there’s been a vast increase in folks concerned about AI risk, many focused on scenarios like Yudkowsky’s. (But almost no interest in my critique.) In recent years I’ve heard many say they are now less worried about foom, but have new worries just as serious. Though I’ve found it hard to understand what worries could justify big efforts now, compared to later when we should know far more about powerful AI details. (E.g., worrying about cars, TV, or nukes in the year 1000 would have been way too early.)

Enter Paul Christiano (see also Vox summary). Paul says:

The stereotyped image of AI catastrophe is a powerful, malicious AI system that takes its creators by surprise and quickly achieves a decisive advantage over the rest of humanity. I think this is probably not what failure will look like, and I want to try to paint a more realistic picture. …

If I want to convince Bob to vote for Alice, I can experiment with many different persuasion strategies … Or I can build good predictive models of Bob’s behavior … These are powerful techniques for achieving any goal that can be easily measured over short time periods. But if I want to help Bob figure out whether he should vote for Alice—whether voting for Alice would ultimately help create the kind of society he wants—that can’t be done by trial and error. To solve such tasks we need to understand what we are doing and why it will yield good outcomes. …

It’s already much easier to pursue easy-to-measure goals, but machine learning will widen the gap by letting us try a huge number of possible strategies and search over massive spaces of possible actions. … Eventually our society’s trajectory will be determined by powerful optimization with easily-measurable goals rather than by human intentions about the future. …over time [our] proxies will come apart:

Corporations will deliver value to consumers as measured by profit. Eventually this mostly means manipulating consumers, capturing regulators, extortion and theft. Investors … instead of actually having an impact they will be surrounded by advisors who manipulate them into thinking they’ve had an impact. Law enforcement will drive down complaints and increase … a false sense of security, … As this world goes off the rails, there may not be any discrete point where consensus recognizes that things have gone off the rails. … human control over levers of power gradually becomes less and less effective; we ultimately lose any real ability to influence our society’s trajectory. …

Patterns that want to seek and expand their own influence—organisms, corrupt bureaucrats, companies obsessed with growth. … will tend to increase their own influence and so can come to dominate the behavior of large complex systems unless there is competition or a successful effort to suppress them. … a wide variety of goals could lead to influence-seeking behavior, … an influence-seeker would be aggressively gaming whatever standard you applied …

If influence-seeking patterns do appear and become entrenched, it can ultimately lead to a rapid phase transition … where humans totally lose control. … For example, an automated corporation may just take the money and run; a law enforcement system may abruptly start seizing resources and trying to defend itself from attempted decommission. … Eventually we reach the point where we could not recover from a correlated automation failure. (more)

While I told Yudkowsky his fear doesn’t fit with our large literature on economic growth, I’ll tell Christiano his fear doesn’t fit with our large (mostly economic) literature on agency failures (see 1 2 3 4 5).

An agent is someone you pay to assist you. You must always pay to get an agent who consumes real resources. But agents can earn extra “agency rents” when you and other possible agents can’t see everything that they know and do. And even if an agent doesn’t earn more rents, a more difficult agency relation can cause “agency failure”, wherein you get less of what you want from your agent.

Now like any agent, an AI who costs real resources must be paid. And depending on the market and property setup this could let AIs save, accumulate capital, and eventually collectively control most capital. This is a well-known AI concern, that AIs who are more useful than humans might earn more income, and thus become richer and more influential than humans. But this isn’t Christiano’s fear.

It is easy to believe that agent rents and failures generally scale roughly with the overall importance and magnitude of activities. That is, when we do twice as much, and get roughly twice as much value out of it, we also lose about twice as much potential via agency failures, relative to a perfect agency relation, and the agents gain about twice as much in agency rents. So it is plausible to think that this also happens with AIs as they become more capable; we get more but then so do they, and more potential is lost.

Christiano instead fears that as AIs get more capable, the AIs will gain so much more agency rents, and we will suffer so much more due to agency failures, that we will actually become worse off as a result. And not just a bit worse off; we apparently get apocalypse-level worse off! This sort of agency apocalypse is not only a far larger problem than we’d expect via simple scaling, it is also not supported anywhere I know of in the large academic literature on agency problems.

This literature has found many factors that influence the difficulty of agency relations. Agency tends to be harder when more relevant agent info and actions are hidden from both principals and other agents, when info about outcomes gets noisier, when there is more noise in the mapping between effort and outcomes, when agents and principals are more impatient and risk averse, when agents are more unique, when principals can threaten more extreme outcomes, and when agents can more easily coordinate.

But this literature has not found that smarter agents are more problematic, all else equal. In fact, the economics literature that models agency problems typically assumes perfectly rational and thus infinitely smart agents, who reason exactly correctly in every possible situation. This typically results in limited and modest agency rents and failures.
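
To see what “limited and modest” typically looks like in such models, here is a minimal textbook-style hidden-effort sketch (my own, with made-up numbers; it is not drawn from any particular paper): a risk-neutral agent with limited liability is paid a bonus only on success, and the smallest bonus that motivates effort leaves the agent a rent that depends on the observability gap, not on how smart the agent is.

```python
# Toy hidden-effort (moral hazard) example with a risk-neutral, limited-liability agent.
p_high, p_low = 0.9, 0.5   # success probability with high vs low effort (assumed numbers)
effort_cost = 4.0          # agent's cost of choosing high effort
output = 100.0             # principal's value of a success

# Smallest success bonus that makes high effort worthwhile for the agent:
#   p_high * bonus - effort_cost >= p_low * bonus
bonus = effort_cost / (p_high - p_low)

agent_rent = p_high * bonus - effort_cost   # what the agent keeps above its cost
principal_net = p_high * (output - bonus)   # expected value kept by the principal

print(f"bonus = {bonus:.1f}, agent rent = {agent_rent:.1f}, principal net = {principal_net:.1f}")
# Here the rent (5.0) is small next to the surplus the principal keeps (81.0),
# and nothing in the formula gets worse as the agent gets smarter.
```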

For concreteness, imagine a twelve year old rich kid, perhaps a king or queen, seeking agents to help manage their wealth or kingdom. It is far from obvious that this child is on average worse off when they choose a smarter more capable agent, or when the overall pool of agents from which they can choose becomes smarter and more capable. And it’s even less obvious that the kid becomes maximally worse off as their agents get maximally smart and capable. In fact, I suspect the opposite.

Of course it remains possible that there is something special about the human-AI agency relation that can justify Christiano’s claims. But surely the burden of “proof” (really argument) should lie on those who say this case is radically different from most found in our large and robust agency literatures. (Google Scholar lists 234K papers with keyword “principal-agent”.)

And even if AI agency problems turn out to be unusually severe, that still doesn’t justify trying to solve them so far in advance of knowing about their details.

How Lumpy AI Services?

Long ago people like Marx and Engels predicted that the familiar capitalist economy would naturally lead to the immiseration of workers, huge wealth inequality, and a strong concentration of firms. Each industry would be dominated by a main monopolist, and these monsters would merge into a few big firms that basically run, and ruin, everything. (This is somewhat analogous to common expectations that military conflicts naturally result in one empire ruling the world.)

Many intellectuals and ordinary people found such views quite plausible then, and still do; these are the concerns most often voiced to justify redistribution and regulation. Wealth inequality is said to be bad for social and political health, and big firms are said to be bad for the economy, workers, and consumers, especially if they are not loyal to our nation, or if they coordinate behind the scenes.

Note that many people seem much less concerned about an economy full of small firms populated by people of nearly equal wealth. Actions seem more visible in such a world, and better constrained by competition. With a few big privately-coordinating firms, in contrast, who knows what they could get up to, and they seem to have so many possible ways to screw us. Many people either want these big firms broken up, or heavily constrained by presumed-friendly regulators.

In the area of AI risk, many express great concern that the world may be taken over by a few big powerful AGI (artificial general intelligence) agents with opaque beliefs and values, who might arise suddenly via a fast local “foom” self-improvement process centered on one initially small system. I’ve argued in the past that such sudden local foom seems unlikely because innovation is rarely that lumpy.

In a new book-length technical report, Reframing Superintelligence: Comprehensive AI Services as General Intelligence, Eric Drexler makes a somewhat similar anti-lumpiness argument. But he talks about task lumpiness, not innovation lumpiness. Powerful AI is safer if it is broken into many specific services, often supplied by separate firms. The task that each service achieves has a narrow enough scope that there’s little risk of it taking over the world and killing everyone in order to achieve that task. In particular, the service of being competent at a task is separate from the service of learning how to become competent at that task. In Drexler’s words: Continue reading "How Lumpy AI Services?" »

Age of Em Paperback

Today is the official U.S. release date for the paperback version of my first book The Age of Em: Work, Love, and Life when Robots Rule the Earth. (U.K. version came out a month ago.) Here is the new preface:

I picked this book topic so it could draw me in, and I would finish. And that worked: I developed an obsession that lasted for years. But once I delivered the “final” version to my publisher on its assigned date, I found that my obsession continued. So I collected a long file of notes on possible additions. And when the time came that a paperback edition was possible, I grabbed my chance. As with the hardback edition, I had many ideas for changes that might make my dense semi-encyclopedia easier for readers to enjoy. But my core obsession again won out: to show that detailed analysis of future scenarios is possible, by showing just how many reasonable conclusions one can draw about this scenario.

Also, as this book did better than I had a right to expect, I wondered: will this be my best book ever? If so, why not make it the best it can be? The result is the book you now hold. It has over 42% more citations, and 18% more words, but it is only a bit easier to read. And now I must wonder: can my obsession stop now, pretty please?

Many are disappointed that I do not more directly declare if I love or hate the em world. But I fear that such a declaration gives an excuse to dismiss all this; critics could say I bias my analysis in order to get my desired value conclusions. I’ve given over 100 talks on this book, and never once has my audience failed to engage value issues. I remain confident that such issues will not be neglected, even if I remain quiet.

These are the only new sections in the paperback: Anthropomorphize, Motivation, Slavery, Foom, After Ems. (I previewed two of them here & here.)  I’ll make these two claims for my book:

  1. There’s at least a 5% chance that my analysis will usefully inform the real future, i.e., that something like brain emulations are actually the first kind of human-level machine intelligence, and my analysis is mostly right on what happens then. If it is worth having twenty books on the future, it is worth having a book with a good analysis of a 5% scenario.
  2. I know of no other analysis of a substantially-different-from-today future scenario that is remotely as thorough as Age of Em. I like to quip, “Age of Em is like science fiction, except there is no plot, no characters, and it all makes sense.” If you often enjoy science fiction but are frustrated that it rarely makes sense on closer examination, then you want more books like Age of Em. The success or not of Age of Em may influence how many future authors try to write such books.

How Deviant Recent AI Progress Lumpiness?

I seem to disagree with most people working on artificial intelligence (AI) risk. While with them I expect rapid change once AI is powerful enough to replace most all human workers, I expect this change to be spread across the world, not concentrated in one main localized AI system. The efforts of AI risk folks to design AI systems whose values won’t drift might stop global AI value drift if there is just one main AI system. But doing so in a world of many AI systems at similar abilities levels requires strong global governance of AI systems, which is a tall order anytime soon. Their continued focus on preventing single system drift suggests that they expect a single main AI system.

The main reason that I understand to expect relatively local AI progress is if AI progress is unusually lumpy, i.e., arriving in unusually few large packages rather than in the usual many smaller packages. If one AI team finds a big lump, it might jump way ahead of the other teams.

However, we have a vast literature on the lumpiness of research and innovation more generally, which clearly says that usually most of the value in innovation is found in many small innovations. We have also so far seen this in computer science (CS) and AI, even though there have been historical examples where much value was found in particular big innovations, such as nuclear weapons or the origin of humans.

Apparently many people associated with AI risk, including the star machine learning (ML) researchers that they often idolize, find it intuitively plausible that AI and ML progress is exceptionally lumpy. Such researchers often say, “My project is ‘huge’, and will soon do it all!” A decade ago my ex-co-blogger Eliezer Yudkowsky and I argued here on this blog about our differing estimates of AI progress lumpiness. He recently offered Alpha Go Zero as evidence of AI lumpiness:

I emphasize how all the mighty human edifice of Go knowledge … was entirely discarded by AlphaGo Zero with a subsequent performance improvement. … Sheer speed of capability gain should also be highlighted here. … you don’t even need self-improvement to get things that look like FOOM. … the situation with AlphaGo Zero looks nothing like the Hansonian hypothesis and a heck of a lot more like the Yudkowskian one.

I replied that, just as seeing an unusually large terror attack like 9-11 shouldn’t much change your estimate of the overall distribution of terror attacks, nor should seeing one big earthquake change your estimate of the overall distribution of earthquakes, seeing one big AI research gain like AlphaGo Zero shouldn’t much change your estimate of the overall distribution of AI progress. (Seeing two big lumps in a row, however, would be stronger evidence.) In his recent podcast with Sam Harris, Eliezer said:

Y: I have claimed recently on facebook that now that we have seen Alpha Zero, Alpha Zero seems like strong evidence against Hanson’s thesis for how these things necessarily go very slow because they have to duplicate all the work done by human civilization and that’s hard. …

H: What’s the best version of his argument, and then why is he wrong?

Y: Nothing can prepare you for Robin Hanson! Ha ha ha. Well, the argument that Robin Hanson has given is that these systems are still immature and narrow, and things will change when they get general. And my reply has been something like, okay, what changes your mind short of the world actually ending. If your theory is wrong do we get to find out about that at all before the world does.

(Sam didn’t raise the subject in his recent podcast with me.)

In this post, let me give another example (beyond two big lumps in a row) of what could change my mind. I offer a clear observable indicator, for which data should be available now: deviant citation lumpiness in recent ML research. One standard measure of research impact is citations; bigger lumpier developments gain more citations than smaller ones. And it turns out that the lumpiness of citations is remarkably constant across research fields! See this March 3 paper in Science:

The citation distributions of papers published in the same discipline and year lie on the same curve for most disciplines, if the raw number of citations c of each paper is divided by the average number of citations c0 over all papers in that discipline and year. The dashed line is a lognormal fit. …

The probability of citing a paper grows with the number of citations that it has already collected. Such a model can be augmented with … decreasing the citation probability with the age of the paper, and a fitness parameter, unique to each paper, capturing the appeal of the work to the scientific community. Only a tiny fraction of papers deviate from the pattern described by such a model.

It seems to me quite reasonable to expect that fields where real research progress is lumpier would also display a lumpier distribution of citations. So if CS, AI, or ML research is much lumpier than in other areas, we should expect to see that in citation data. Even if your hypothesis is that only ML research is lumpier, and only in the last 5 years, we should still have enough citation data to see that. My expectation, of course, is that recent ML citation lumpiness is not much bigger than in most research fields through history.
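
Here is a minimal sketch of the kind of comparison I have in mind, assuming one had a table of per-paper citation counts; the column names, and the choice of a Gini coefficient as the lumpiness summary, are my own assumptions rather than anything from the Science paper:

```python
# Rescale each paper's citations by its field-year average (as in the quoted
# paper), then summarize how unequal ("lumpy") the rescaled counts are per field.
import numpy as np
import pandas as pd

def lumpiness_by_field(papers: pd.DataFrame) -> pd.Series:
    """papers must have columns: 'field', 'year', 'citations' (hypothetical schema)."""
    papers = papers.copy()
    cell_mean = papers.groupby(["field", "year"])["citations"].transform("mean")
    papers["rescaled"] = papers["citations"] / cell_mean

    def gini(x: np.ndarray) -> float:
        x = np.sort(np.asarray(x, dtype=float))
        n = len(x)
        cum = np.cumsum(x)
        return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

    # A much higher Gini for CS/AI/ML than for other fields would support
    # the "AI progress is unusually lumpy" view; similar values would not.
    return papers.groupby("field")["rescaled"].apply(lambda s: gini(s.values))
```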

Added 24Mar: You might save the hypothesis that research areas vary greatly in lumpiness by postulating that the number of citations of each research advance goes as the rank of the “size” of that advance, relative to its research area. The distribution of ranks is always the same, after all. But this would be a surprising outcome, and hence seems unlikely; I’d want to see clear evidence that the distribution of lumpiness of advances varies greatly across fields.

Added 27Mar: More directly relevant might be data on distributions of patent value and citations. Do these distributions vary by topic? Are CS/AI/ML distributed more unequally?

On Value Drift

The outcomes within any space-time region can be seen as resulting from 1) preferences of various actors able to influence the universe in that region, 2) absolute and relative power and influence of those actors, and 3) constraints imposed by the universe. Changes in outcomes across regions result from changes in these factors.

While you might mostly approve of changes resulting from changing constraints, you might worry more about changes due to changing values and influence. That is, you likely prefer to see more influence by values closer to yours. Unfortunately, the consistent historical trend has been for values to drift over time, increasing the distance between random future and current values. As this trend looks like a random walk, we see no obvious limit to how far values can drift. So if the value you place on the values of others falls rapidly enough with the distance between values, you should expect long term future values to be very wrong.
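
As a toy picture of that random-walk claim (my own illustration, in arbitrary units), the expected distance of drifting values from today’s values grows without bound, roughly as the square root of elapsed time:

```python
# Simulate many independent value-drift histories as simple random walks.
import numpy as np

rng = np.random.default_rng(1)
n_histories, n_steps = 10_000, 400            # hypothetical counts
steps = rng.normal(0.0, 1.0, (n_histories, n_steps))
positions = np.cumsum(steps, axis=1)          # drifted "value location" over time

for t in (25, 100, 400):
    mean_dist = np.abs(positions[:, t - 1]).mean()
    print(f"after {t} steps: mean |drift| = {mean_dist:.2f}, "
          f"theory sqrt(2t/pi) = {np.sqrt(2 * t / np.pi):.2f}")
```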

What influences value change?
Inertia – The more existing values are tied to important entrenched systems, the less they change.
Growth – On average, over time civilization collects more total influence over most everything.
Competition – If some values consistently win key competitive contests, those values become more common.
Influence Drift – Many processes that change the world produce random drift in agent influence.
Internal Drift – Some creatures, e.g., humans, have values that drift internally in complex ways.
Culture Drift – Some creatures, e.g., humans, have values that change together in complex ways.
Context – Many of the above processes depend on other factors, such as technology, wealth, a stable sun, etc.

For many of the above processes, rates of change are roughly proportional to overall social rates of change. As these rates of change have increased over time, we should expect faster future change. Thus you should expect values to drift faster in the future than they did in the past, leading faster to wrong values. Also, people are living longer now than they did in the past. So while past people didn’t live long enough to see changes big enough to greatly bother them, future people may live to see much more change.

Most increases in the rates of change have been concentrated in a few sudden large jumps (associated with the culture, farmer, and industry transitions). As a result, you should expect that rates of change may soon increase greatly. Value drift may continue at past rates until it suddenly goes much faster.

Perhaps you discount the future rapidly, or perhaps the value you place on other values falls slowly with value distance. In these cases value drift may not disturb you much. Otherwise, the situation described above may seem pretty dire. Even if previous generations had to accept the near inevitability of value drift, you might not accept it now. You may be willing to reach for difficult and dangerous changes that could remake the whole situation. Such as perhaps a world government. Personally I see that move as too hard and dangerous for now, but I could understand if you disagree.

The people today who seem most concerned about value drift also seem to be especially concerned about humans or ems being replaced by other forms of artificial intelligence. Many such people are also concerned about a “foom” scenario of a large and sudden influence drift: one initially small computer system suddenly becomes able to grow far faster than the rest of the world put together, allowing it to quickly take over the world.

To me, foom seems unlikely: it posits an innovation that is extremely lumpy compared to historical experience, and in addition posits an unusually high difficulty of copying or complementing this innovation. Historically, innovation value has been distributed with a long thin tail: most realized value comes from many small innovations, but we sometimes see lumpier innovations. (Alpha Zero seems only weak evidence on the distribution of AI lumpiness.) The past history of growth rate increases suggests that within a few centuries we may see something, perhaps a very lumpy innovation, that causes a growth rate jump comparable in size to the largest jumps we’ve ever seen, such as at the origins of life, culture, farming, and industry. However, as over history the ease of copying and complementing such innovations has been increasing, it seems unlikely that copying and complementing will suddenly get much harder.

While foom seems unlikely, it does seem likely that within a few centuries we will develop machines that can outcompete biological humans for most all jobs. (Such machines might also outcompete ems for jobs, though that outcome is much less clear.) The ability to make such machines seems by itself sufficient to cause a growth rate increase comparable to the other largest historical jumps. Thus the next big jump in growth rates need not be associated with a very lumpy innovation. And in the most natural such scenarios, copying and complementing remain relatively easy.

However, while I expect machines that outcompete humans for jobs, I don’t see how that greatly increases the problem of value drift. Human cultural plasticity already ensures that humans are capable of expressing a very wide range of values. I see no obvious limits there. Genetic engineering will allow more changes to humans. Ems inherit human plasticity, and may add even more via direct brain modifications.

In principle, non-em-based artificial intelligence is capable of expressing the entire space of possible values. But in practice, in the shorter run, such AIs will take on social roles near humans, and roles that humans once occupied. This should force AIs to express pretty human-like values. As Steven Pinker says:

Artificial intelligence is like any other technology. It is developed incrementally, designed to satisfy multiple conditions, tested before it is implemented, and constantly tweaked for efficacy and safety.

If Pinker is right, the main AI risk mediated by AI values comes from AI value drift that happens after humans (or ems) no longer exercise such detailed frequent oversight.

It may be possible to create competitive AIs with protected values, i.e., so that parts where values are coded are small, modular, redundantly stored, and insulated from changes to the rest of the system. If so, such AIs may suffer much less from internal drift and cultural drift. Even so, the values of AIs with protected values should still drift due to influence drift and competition.

Thus I don’t see why people concerned with value drift should be especially focused on AI. Yes, AI may accompany faster change, and faster change can make value drift worse for people with intermediate discount rates. (Though it seems to me that altruistic discount rates should scale with actual rates of change, not with arbitrary external clocks.)

Yes, AI offers more prospects for protected values, and perhaps also for creating a world/universe government capable of preventing influence drift and competition. But in these cases if you are concerned about value drift, your real concerns are about rates of change and world government, not AI per se. Even the foom scenario just temporarily increases the rate of influence drift.

Your real problem is that you want long term stability in a universe that more naturally changes. Someday we may be able to coordinate to overrule the universe on this. But I doubt we are close enough to even consider that today. To quote a famous prayer:

God, grant me the serenity to accept the things I cannot change,
Courage to change the things I can,
And wisdom to know the difference.

For now value drift seems one of those possibly lamentable facts of life that we cannot change.

Reply to Christiano on AI Risk

Paul Christiano was one of those who encouraged me to respond to non-foom AI risk concerns. Here I respond to two of his posts that he directed me to. The first one says we should worry about the following scenario:

Imagine using [reinforcement learning] to implement a decentralized autonomous organization (DAO) which maximizes its profit. .. to outcompete human organizations at a wide range of tasks — producing and selling cheaper widgets, but also influencing government policy, extorting/manipulating other actors, and so on.

The shareholders of such a DAO may be able to capture the value it creates as long as they are able to retain effective control over its computing hardware / reward signal. Similarly, as long as such DAOs are weak enough to be effectively governed by existing laws and institutions, they are likely to benefit humanity even if they reinvest all of their profits.

But as AI improves, these DAOs would become much more powerful than their human owners or law enforcement. And we have no ready way to use a prosaic AGI to actually represent the shareholder’s interests, or to govern a world dominated by superhuman DAOs. In general, we have no way to use RL to actually interpret and implement human wishes, rather than to optimize some concrete and easily-calculated reward signal. I feel pessimistic about human prospects in such a world. (more)

In a typical non-foom world, if one DAO has advanced abilities, then most other organizations, including government and the law, have similar abilities. So such DAOs shouldn’t find it much easier to evade contracts or regulation than do organizations today. Thus humans can be okay if law and government still respect human property rights or political representation. Sure, it might be hard to trust such a DAO to manage your charity, if you don’t trust it to judge who is in most need. But you might trust it more to give you financial returns on your financial investments in it.

Paul Christiano’s second post suggests that the arrival of AI will forever lock in the distribution of patient values at that time:

The distribution of wealth in the world 1000 years ago appears to have had a relatively small effect—or more precisely an unpredictable effect, whose expected value was small ex ante—on the world of today. I think there is a good chance that AI will fundamentally change this dynamic, and that the distribution of resources shortly after the arrival of human-level AI may have very long-lasting consequences. ..

Whichever values were most influential at one time would remain most influential (in expectation) across all future times. .. The great majority of resources are held by extremely patient values. .. The development of machine intelligence may move the world much closer to this naïve model. .. [Because] the values of machine intelligences can (probably, eventually) be directly determined by their owners or predecessors. .. it may simply be possible to design a machine intelligence who exactly shares their predecessor’s values and who can serve as a manager. .. the arrival of machine intelligence may lead to a substantial crystallization of influence .. an event with long-lasting consequences. (more)

That is, Christiano says future AI won’t have problems preserving its values over time, nor need it pay agency costs to manage subsystems. Relatedly, Christiano elsewhere claims that future AI systems won’t have problems with design entrenchment:

[Total output] over the next 100 years greatly exceeds total output over all of history. I agree that coordination is hard, but even spending a small fraction of current effort on exploring novel redesigns would be enough to quickly catch up with stuff designed in the past.

A related claim, which Christiano supports to some degree, is that future AIs will be smart enough to avoid suffering from coordination failures. They may even use “acausal trade” to coordinate when physical interaction of any sort is impossible!

In our world, more competent social and technical systems tend to be larger and more complex, and such systems tend to suffer more (in % cost terms) from issues of design entrenchment, coordination failures, agency costs, and preserving values over time. In larger complex systems, it becomes harder to isolate small parts that encode “values”; a great many diverse parts end up influencing what such systems do in any given situation.

Yet Christiano expects the opposite for future AI; why? I fear his expectations result more from far view idealizations than from observed trends in real systems. In general, we see things far away in less detail, and draw inferences about them more from top level features and analogies than from internal detail. Yet even though we know less about such things, we are more confident in our inferences! The claims above seem to follow from the simple abstract description that future AI is “very smart”, and thus better in every imaginable way. This is reminiscent of medieval analysis that drew so many conclusions about God (including his existence) from the “fact” that he is “perfect.”

But even if values will lock in when AI arrives, and then stay locked, that still doesn’t justify great efforts to study AI control today, at least relative to the other options of improving our control mechanisms in general, or saving resources now to spend later, either on studying AI control problems when we know more about AI, or just to buy influence over the future when that comes up for sale.

An Outside View of AI Control

I’ve written much on my skepticism of local AI foom (= intelligence explosion). Recently I said that foom offers the main justification I understand for AI risk efforts now, as well as being the main choice of my Twitter followers in a survey. It was the main argument offered by Eliezer Yudkowsky in our debates here at this blog, by Nick Bostrom in his book Superintelligence, and by Max Tegmark in his recent book Life 3.0 (though he denied so in his reply here).

However, some privately complained to me that I haven’t addressed those with non-foom-based AI concerns. So in this post I’ll consider AI control in the context of a prototypical non-em non-foom mostly-peaceful outside-view AI scenario. In a future post, I’ll try to connect this to specific posts by others on AI risk.

An AI scenario is one where software does most all jobs; humans may work for fun, but they add little value. In a non-em scenario, ems are never feasible. While foom scenarios are driven by AI innovations that are very lumpy in time and organization, in non-foom scenarios innovation lumpiness is distributed more like it is in our world. In a mostly-peaceful scenario, peaceful technologies of production matter much more than do technologies of war and theft. And as an outside view guesses that future events are like similar past events, I’ll relate future AI control problems to similar past problems. Continue reading "An Outside View of AI Control" »
