Tag Archives: Future

Reply to Christiano on AI Risk

Paul Christiano was one of those who encouraged me to respond to non-foom AI risk concerns. Here I respond to two of his posts he directed me to. The first one says we should worry about the following scenario:

Imagine using [reinforcement learning] to implement a decentralized autonomous organization (DAO) which maximizes its profit. .. to outcompete human organizations at a wide range of tasks — producing and selling cheaper widgets, but also influencing government policy, extorting/manipulating other actors, and so on.

The shareholders of such a DAO may be able to capture the value it creates as long as they are able to retain effective control over its computing hardware / reward signal. Similarly, as long as such DAOs are weak enough to be effectively governed by existing laws and institutions, they are likely to benefit humanity even if they reinvest all of their profits.

But as AI improves, these DAOs would become much more powerful than their human owners or law enforcement. And we have no ready way to use a prosaic AGI to actually represent the shareholder’s interests, or to govern a world dominated by superhuman DAOs. In general, we have no way to use RL to actually interpret and implement human wishes, rather than to optimize some concrete and easily-calculated reward signal. I feel pessimistic about human prospects in such a world. (more)

In a typical non-foom world, if one DAO has advanced abilities, then most other organizations, including government and the law, have similar abilities. So such DAOs shouldn’t find it much easier to evade contracts or regulation than organizations do today. Thus humans can be okay if law and government still respect human property rights or political representation. Sure, it might be hard to trust such a DAO to manage your charity, if you don’t trust it to judge who is in most need. But you might well trust it to give you financial returns on your financial investments in it.

Paul Christiano’s second post suggests that the arrival of AI will forever lock in the distribution of patient values at that time:

The distribution of wealth in the world 1000 years ago appears to have had a relatively small effect—or more precisely an unpredictable effect, whose expected value was small ex ante—on the world of today. I think there is a good chance that AI will fundamentally change this dynamic, and that the distribution of resources shortly after the arrival of human-level AI may have very long-lasting consequences. ..

Whichever values were most influential at one time would remain most influential (in expectation) across all future times. .. The great majority of resources are held by extremely patient values. .. The development of machine intelligence may move the world much closer to this naïve model. .. [Because] the values of machine intelligences can (probably, eventually) be directly determined by their owners or predecessors. .. it may simply be possible to design a machine intelligence who exactly shares their predecessor’s values and who can serve as a manager. .. the arrival of machine intelligence may lead to a substantial crystallization of influence .. an event with long-lasting consequences. (more)

That is, Christiano says future AI won’t have problems preserving its values over time, nor need it pay agency costs to manage subsystems. Relatedly, Christiano elsewhere claims that future AI systems won’t have problems with design entrenchment:

[Total output] over the next 100 years greatly exceeds total output over all of history. I agree that coordination is hard, but even spending a small fraction of current effort on exploring novel redesigns would be enough to quickly catch up with stuff designed in the past.

A related claim, which Christiano supports to some degree, is that future AIs are smart enough to avoid suffering from coordination failures. They may even use “acausal trade” to coordinate when physical interaction of any sort is impossible!

In our world, more competent social and technical systems tend to be larger and more complex, and such systems tend to suffer more (in % cost terms) from issues of design entrenchment, coordination failures, agency costs, and preserving values over time. In larger complex systems, it becomes harder to isolate small parts that encode “values”; a great many diverse parts end up influencing what such systems do in any given situation.

Yet Christiano expects the opposite for future AI; why? I fear his expectations result more from far view idealizations than from observed trends in real systems. In general, we see things far away in less detail, and draw inferences about them more from top level features and analogies than from internal detail. Yet even though we know less about such things, we are more confident in our inferences! The claims above seem to follow from the simple abstract description that future AI is “very smart”, and thus better in every imaginable way. This is reminiscent of medieval analysis that drew so many conclusions about God (including his existence) from the “fact” that he is “perfect.”

But even if values will lock in when AI arrives, and then stay locked, that still doesn’t justify great efforts to study AI control today, at least relative to the other options of improving our control mechanisms in general, or saving resources now to spend later, either on studying AI control problems when we know more about AI, or just to buy influence over the future when that comes up for sale.


An Outside View of AI Control

I’ve written much on my skepticism of local AI foom (= intelligence explosion). Recently I said that foom offers the main justification I understand for AI risk efforts now, as well as being the main choice of my Twitter followers in a survey. It was the main argument offered by Eliezer Yudkowsky in our debates here at this blog, by Nick Bostrom in his book Superintelligence, and by Max Tegmark in his recent book Life 3.0 (though he denied so in his reply here).

However, some privately complained to me that I haven’t addressed those with non-foom-based AI concerns. So in this post I’ll consider AI control in the context of a prototypical non-em non-foom mostly-peaceful outside-view AI scenario. In a future post, I’ll try to connect this to specific posts by others on AI risk.

An AI scenario is one where software does most all jobs; humans may work for fun, but they add little value. In a non-em scenario, ems are never feasible. While foom scenarios are driven by AI innovations that are very lumpy in time and organization, in non-foom scenarios innovation lumpiness is distributed more like it is in our world today. In a mostly-peaceful scenario, peaceful technologies of production matter much more than do technologies of war and theft. And as an outside view guesses that future events are like similar past events, I’ll relate future AI control problems to similar past problems.


Humans Cells In Multicellular Future Minds?

In general, adaptive systems vary along an axis from general to specific. A more general system works better (either directly or after further adaptation) in a wider range of environments, and also with a wider range of other adapting systems. It does this in part via having more useful modularity and abstraction. In contrast, a more specific system adapts to a narrower range of specific environments and other subsystems.

Systems that we humans consciously design tend to be more general, i.e., less context dependent, relative to the “organic” systems that they often replace. For example, compare grid-like city street plans to locally evolved city streets, national retail outlets to locally arising stores and restaurants, traditional to permaculture farms, hotel rooms to private homes, big formal firms to small informal teams, uniforms to individually-chosen clothes, and refactored to un-refactored software. The first entity in each pair tends to more easily scale and to match more environments, while the second in each pair tends to be adapted in more detail to particular local conditions.


Tegmark’s Book of Foom

Max Tegmark says his new book, Life 3.0, is about what happens when life can design not just its software, as humans have done in Life 2.0, but also its hardware:

Life 1.0 (biological stage) evolves its hardware and software
Life 2.0 (cultural stage) evolves its hardware, designs much of its software
Life 3.0 (technological stage): designs its hardware and software ..
Many AI researchers think that Life 3.0 may arrive during the coming century, perhaps even during our lifetime, spawned by progress in AI. What will happen, and what will this mean for us? That’s the topic of this book. (29-30)

Actually, it’s not. The book says little about redesigning hardware. While it says interesting things on many topics, its core is on a future “singularity” where AI systems quickly redesign their own software. (A scenario sometimes called “foom”.)

The book starts out with a 19 page fictional “scenario where humans use superintelligence to take over the world.” A small team, apparently seen as unthreatening by the world, somehow knows how to “launch” a “recursive self-improvement” in a system focused on “one particular task: programming AI Systems.” While initially “subhuman”, within five hours it redesigns its software four times and becomes superhuman at its core task, and so “could also teach itself all other human skills.”

After five more hours and redesigns it can make money by doing half of the tasks at Amazon Mechanical Turk acceptably well. And it does this without having access to vast amounts of hardware or to large datasets of previous performance on such tasks. Within three days it can read and write like humans, and create world class animated movies to make more money. Over the next few months it goes on to take over the news media, education, world opinion, and then the world. It could have taken over much faster, except that its human controllers were careful to maintain control. During this time, no other team on Earth is remotely close to being able to do this.



Can Human-Like Software Win?

Many, perhaps most, think it obvious that computer-like systems will eventually be more productive than human-like systems in most all jobs. So they focus on how humans might maintain control, even after this transition. But this eventuality is less obvious than it seems, depending on what exactly one means by “human-like” or “computer-like” systems. Let me explain.

Today the software that sits in human brains is stuck in human brain hardware, while the other kinds of software that we write (or train) sit in the artificial hardware that we make. And this artificial hardware has been improving far more rapidly than has human brain hardware. Partly as a result, systems of artificial software and hardware have been improving rapidly compared to human brain systems.

But eventually we will find a way to transfer the software from human brains into artificial hardware. Ems are one way to do this, as a relatively direct port. But other transfer mechanisms may be developed.

Once human brain software is in the same sort of artificial computing hardware as all the other software, then the relative productivity of different software categories comes down to a question of quality: which categories of software tend to be more productive on which tasks?

Of course there will be many different variations available within each category, to match to different problems. And the overall productivity of each category will depend both on previous efforts to develop and improve software in that category, and also on previous investments in other systems to match and complement that software. For example, familiar artificial software will gain because we have spent longer working to match it to familiar artificial hardware, while human software will gain from being well matched to complex existing social systems, such as language, firms, law, and government.

People give many arguments for why they expect human-like software to mostly lose this future competition, even when it has access to the same hardware. For example, they say that other software could lack human biases and also scale better, have more reliable memory, communicate better over wider scopes, be easier to understand, have easier meta-control and self-modification, and be based more directly on formal abstract theories of learning, decision, computation, and organization.

Now consider two informal polls I recently gave my Twitter followers.

Surprisingly, at least to me, the main reason that people expect human-like software to lose is that they mostly expect whole new categories of software to appear, categories quite different from both the software in the human brain and also all the many kinds of software with which we are now familiar. If it comes down to a contest between human-like and familiar software categories, only a quarter of them expect human-like to lose big.

The reason I find this surprising is that all of the reasons that I’ve seen given for why human-like software could be at a disadvantage seem to apply just as well to familiar categories of software. In addition, a new category must start with the disadvantages of having less previous investment in that category and in matching other systems to it. That is, none of these are reasons to expect imagined new categories of software to beat familiar artificial software, and yet people offer them as reasons to think whole new much more powerful categories will appear and win.

I conclude that people don’t mostly use specific reasons to conclude that human-like software will lose, once it can be moved to artificial hardware. Instead they just have a general belief that the space of possible software is huge and contains many new categories to discover. This just seems to be the generic belief that competition and innovation will eventually produce a lot of change. It’s not that human-like software has any overall competitive disadvantage compared to concrete known competitors; it is at least as likely to have winning descendants as any such competitors. It’s just that our descendants are likely to change a lot as they evolve over time. Which seems to me a very different story than the humans-are-sure-to-lose story we usually hear.


A Post-Em-Era Hint

A few months ago I noticed a pattern across the past eras of foragers, farmers, and industry: each era has a major cycle (ice ages, empires rising & falling, the business cycle) with a period of about one third of that era’s doubling time. So I tentatively suggested that an em future might also have a major cycle of roughly one third of its doubling time. If that economic doubling time is about a month, the em major cycle period might be about a week.

Now I report another pattern, to be treated similarly. In roughly the middle of each past era, a pair of major innovations in calculating and communicating appeared, and gradually went from barely existing to having big social impacts.

  • Forager: At unknown periods during the roughly two million year forager era, humanoids evolved reasoning and language. That is, we became able to think about and say many complex things to each other, including our reasons for and against claims.
  • Farmer: While the farming era lasted roughly 7 to 10 millennia, the first known writing was 5 millennia ago, and the first known math textbooks 4 millennia ago. About 2.5 millennia ago writing became widespread enough to induce major religious changes worldwide.
  • Industry: While the industry era has lasted roughly 16 to 24 decades, depending on how you count, the telegraph was developed 18 decades ago, and the wholesale switch from mechanical to digital electronic communication happened 4 to 6 decades ago. The idea of the computer was described 20 decades ago, the first digital computer was made 7 decades ago, and computers became widespread roughly 3 decades ago.

Note that innovations in calculation and communication were not independent, but instead intertwined with and enabled each other. Note also that these innovations did not change the growth rate of the world economy at the time; each era continued doubling at the same rate as before. But these innovations still seem essential to enabling the following era. It is hard to imagine farming before language and reasoning, industry before math and writing, or ems before digital computers and communication.

This pattern weakly suggests that another pair of key innovations in calculation and communication may appear and then grow in importance across a wide middle of the em era. This era may only last a year or two in objective time, though typical ems may experience millennia during this time.

This innovation pair would be interdependent, not change the growth rate, and perhaps enable a new era to follow. I can think of two plausible candidates:

  1. Ems might discover a better language for expressing and manipulating something like brain states. This could help ems to share their thoughts and use auxiliary hardware to help calculate useful thoughts.
  2. Ems might develop analogues to combinatorial prediction markets, and thus better share beliefs and aggregate information on a wide range of topics.

(Or maybe the innovation produces some combination of these.) Again, these are crude speculations based on a weak inference from a rough pattern in only three data points. But even so, they give us a vague hint about what an age after ems might look like. And such hints are actually pretty hard to find.
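For the second candidate above, the core price-setting mechanism of a simple (non-combinatorial) prediction market maker can be sketched with a logarithmic market scoring rule. This is only an illustrative sketch of the basic mechanism, with made-up parameter values; a real combinatorial market would need far more machinery to handle combinations of events.

```python
import math

def lmsr_cost(quantities, b=100.0):
    """LMSR cost function: C(q) = b * log(sum_i exp(q_i / b))."""
    return b * math.log(sum(math.exp(q / b) for q in quantities))

def lmsr_prices(quantities, b=100.0):
    """Market probabilities implied by the outstanding shares of each outcome."""
    weights = [math.exp(q / b) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]

def trade_cost(quantities, outcome, shares, b=100.0):
    """Cost to buy `shares` of `outcome`, given current outstanding quantities."""
    new_q = list(quantities)
    new_q[outcome] += shares
    return lmsr_cost(new_q, b) - lmsr_cost(quantities, b)

# With no shares sold yet, a two-outcome market starts at 50/50.
q = [0.0, 0.0]
print(lmsr_prices(q))
# Buying shares of outcome 0 costs money and pushes its price up.
print(trade_cost(q, 0, shares=10))
```

A trader who believes outcome 0 is more likely than its current price buys until the price matches her belief; this is how such markets aggregate information, and why better analogues of them might help ems better share beliefs.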


A Tangled Task Future

Imagine that you want to untangle a pile of cables. It wasn’t tangled on purpose; tangling just resulted naturally from how these cables were used. You’d probably look for the least tangled cable in the least tangled part of the pile, and start to work there. In this post I will argue that, in a nutshell, this is how we are slowly automating our world of work: we are un- and re-tangling it.

This has many implications, including for the long-term future of human-like creatures in a competitive world. But first we have a bit of explaining to do.


Mormon Transhumanists

A standard trope of science fiction has religious groups using violence to stop a new technology. Perhaps because of this, many are surprised by the existence of religious transhumanists. Saturday I gave a keynote talk on Age of Em at the Mormon Transhumanist Association (MTA) annual conference, and had a chance to study such folks in more detail. And I should say right off the top that this MTA audience, compared to other audiences, had notably fewer morality or religious related objections to my em scenario.

I’m not surprised by the existence of religious tech futurists. Overall, the major world religions have been quite successful in adapting to the many social changes since most of them first appeared many millennia ago. Also, the main predictor of interest in tech futurism and science fiction is an interest in science and technology, and religious folks are not underrepresented there. Even so, you might ask what your favorite theories of religion predict about how MTA folk would differ from other transhumanists.

The most obvious difference I saw is that MTA does community very well, with good organization, little shirking, and lots of polite, respectful, and friendly interaction. This makes sense. Mormons in general have strong community norms, and one of the main functions of religion is to build strong communities. Mormonism is a relatively high commitment religion, and those tend to promote stronger bonds.

Though I did not anticipate it, a predictable consequence of this is that MTA is more of a transhuman take on Mormonism than a Mormon take on transhumanism. On reflection, this reveals an interesting way that long-lived groups with dogmas retain and co-opt smart intellectuals. Let me explain.

One standard sales technique is to try to get your mark to spend lots of time considering your product. This is a reason why salespeople often seem so slow and chatty. The more time you spend considering their product, the longer that you will estimate it will take to consider other products, and the more likely you are to quit searching and take their product.

Similarly, religions often expose children to a mass of details, as in religious stories. Smart children can be especially engaged by these details because they like to show off their ability to remember and understand detail. Later on, such people can show off their ability to interpret these details in many ways, and to identify awkward and conflicting elements.

Even if the conflicts they find are so severe as to reasonably call into question the entire thing, by that time such people have invested so much in learning details of their religion that they’d lose a lot of ability to show off if they just left and never talked about it again. Some become vocally against their old religion, which lets them keep talking and showing off about it. But even in opposition, they are still then mostly defined by that religion.

I didn’t meet any MTA members who took Mormon claims of miraculous historical events literally. They seemed well informed on science and tech, and willing to apply typical engineering and science standards to such things. Even so, MTA folks are so focused on their own Mormon world that they tend to be less interested in asking how Mormons could anticipate and prepare for future changes, and more interested in how future/sci/tech themes could reframe and interpret key Mormon theological debates and claims. In practice their strong desire to remain Mormons in good standing means that they mostly accept practical church authority, including the many ways that the church hides the awkward and conflicting elements of its religious stories and dogma.

For example, MTA folks exploring a “new god argument” seek scenarios wherein we might live in a simulation that resonate with Mormon claims of a universe full of life and gods. While these folks aren’t indifferent to the relative plausibility of hypotheses, this sort of exercise is quite different from just asking what sort of simulations would be most likely if we in fact did live in a simulation.

I’ve said that we today live in an unprecedented dreamtime of unadaptive behavior, a dream from which some will eventually awake. Religious folks in general tend to be better positioned to awake sooner, as they have stronger communities, more self-control, and higher fertility. But even if the trope applies far more in fiction than in reality, it remains possible that Mormon religious orthodoxy could interfere with Mormons adapting to the future.

MTA could help to deal with such problems by becoming trusted guides to the future for other Mormons. To fill that role, they would of course need to show enough interest in Mormon theology to convince the others that they are good Mormons. But they would also need to pay more attention to just studying the future regardless of its relevance to Mormon theology. Look at what is possible, what is likely, and the consequences of various actions. For their sakes, I hope that they can make this adjustment.

By the way, we can talk similarly about libertarians who focus on criticizing government regulation and redistribution. The more one studies the details of government actions, showing off by knowing more such detail, the more one’s thinking becomes defined by government, even if one mostly criticizes such actions. To avoid this outcome, focus more on thinking about what non-government organizations should do and how. It isn’t enough to say “without government, the market will do it.” Become part of a market that does things.


Better Babblers

You can think of knowing how to write as knowing how to correlate words. Given no words, what first word should you write? Then given one word, what second word best correlates with that? Then given two words, what third word best fits with those two? And so on. Thus your knowledge of how to write can be broken into what you know at these different correlation orders: one word, two words, three words, and so on. Each time you pick a new word you can combine knowledge at these different orders, by weighing all their different recommendations for your next word.

This correlation order approach can also be applied at different scales. For example, given some classification of your first sentence, what kind of second sentence should follow? Given a classification of your first chapter, what kind of second chapter should follow? Many other kinds of knowledge can be similarly broken down into correlation orders, at different scales. We can do this for music, paintings, interior decoration, computer programs, math theorems, and so on.

Given a huge database, such as of writings, it is easy to get good at very low orders; you can just use the correlation frequencies found in your dataset. After that, simple statistical models applied to this database can give you good estimates for correlations to use at somewhat higher orders. And if you have enough data (roughly ten million examples per category I’m told) then recently popular machine learning techniques can improve your estimates at a next set of higher orders.
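The lowest-order case can be sketched as a tiny Markov-chain “babbler” that only knows which word tends to follow which, using correlation frequencies counted from a corpus. The corpus and parameters here are made up for illustration; real systems use vastly larger datasets and higher orders.

```python
import random
from collections import defaultdict, Counter

def build_model(text, order=2):
    """Count how often each word follows each (order-1)-word context."""
    words = text.split()
    model = defaultdict(Counter)
    for i in range(len(words) - order + 1):
        context = tuple(words[i:i + order - 1])
        model[context][words[i + order - 1]] += 1
    return model

def babble(model, length=10, seed=0):
    """Generate text by repeatedly sampling a next word, weighted by frequency."""
    rng = random.Random(seed)
    context = rng.choice(list(model.keys()))
    out = list(context)
    for _ in range(length - len(out)):
        counter = model.get(tuple(out[-len(context):]))
        if not counter:  # dead end: this context never continues in the corpus
            break
        words, weights = zip(*counter.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

corpus = ("the weather is nice today and the weather is mild "
          "how is your mother and how is your week")
model = build_model(corpus, order=2)
print(babble(model, length=8))
```

Every word pair the babbler emits occurs somewhere in its corpus, so locally it always sounds plausible; what it lacks is any deep structure tying distant parts of the text together.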

There are some cases where this is enough; either you can get enormous amounts of data, or learning low order correlations well is enough to solve your problem. These cases include many games with well defined rules, many physical tasks where exact simulations are feasible, and some kinds of language translation. But there are still many other cases where this is far from enough to achieve human level proficiency. In these cases an important part of what we know can be described as very high order correlations produced by “deep” knowledge structures that aren’t well reduced to low order correlations.

After eighteen years of being a professor, I’ve graded many student essays. And while I usually try to teach a deep structure of concepts, what the median student actually learns seems to mostly be a set of low order correlations. They know what words to use, which words tend to go together, which combinations tend to have positive associations, and so on. But if you ask an exam question where the deep structure answer differs from the answer you’d guess from low order correlations, most students give the wrong answer.

Simple correlations also seem sufficient to capture most polite conversation talk, such as the weather is nice, how is your mother’s illness, and damn that other political party. Simple correlations are also most of what I see in inspirational TED talks, and when public intellectuals and talk show guests pontificate on topics they really don’t understand, such as quantum mechanics, consciousness, postmodernism, or the need always for more regulation everywhere. After all, media entertainers don’t need to understand deep structures any better than do their audiences.

Let me call styles of talking (or music, etc.) that rely mostly on low order correlations “babbling”. Babbling isn’t meaningless, but to ignorant audiences it often appears to be based on a deeper understanding than is actually the case. When done well, babbling can be entertaining, comforting, titillating, or exciting. It just isn’t usually a good place to learn deep insight.

As we slowly get better at statistics and machine learning, our machines will slowly get better at babbling. The famous Eliza chatbot went surprisingly far using very low order correlations, and today chatbots best fool us into thinking they are human when they stick to babbling style conversations. So what does a world of better babblers look like?

First, machines will better mimic low quality student essays, so schools will have to try harder to keep such students from using artificial babblers.

Second, the better machines get at babbling, the more humans will try to distinguish themselves from machines via non-babbling conversational styles. So expect less use of simple easy-to-understand-and-predict speech in casual polite conversation, inspirational speeches, and public intellectual talk.

One option is to put a higher premium on talk that actually makes deep sense, in terms of deep concepts that experts understand. That would be nice for those of us who have always emphasized such things. But alas there are other options.

A second option is to put a higher premium on developing very distinctive styles of talking. This would be like how typical popular songs from two centuries ago could be sung and enjoyed by most anyone, compared to how popular music today is matched in great detail to the particular features of particular artists. Imagine most all future speakers having as distinct a personal talking style.

A third option is more indirect, ironic, and insider style talk, such as we tend to see on Twitter today. People using words and phrases and cultural references in ways that only folks very near in cultural space can clearly accept as within recent local fashion. Artificial babblers might not have enough data to track changing fashions in such narrow groups.

Bottom line: the more kinds of conversation styles that simple machines can manage, the more humans will try to avoid talking in those styles, at least when not talking to machines.


The Robot Protocol

Talking with a professor of robotics, I noticed a nice approachable question at the intersection of social science, computer science, and futurism.

Someday robots will mix with humans in public, walking our streets, parks, hospitals, and stores, driving our streets, swimming our waterways, and perhaps flying our skies. Such public robots may vary enormously in their mental and physical capacities, but if they are to mix smoothly with humans in public, then we will probably expect them to maintain a minimal set of common social capacities. Such as responding sensibly to “Who are you?” and “Get out of my way.” And the rest of us would have a new modified set of social norms for dealing with public robots via these capacities.

Together these common robot capacities and matching human social norms would become a “robot protocol.” Once ordinary people and robots makers have adapted to it, this protocol would be a standard persisting across space and time, and relatively hard to change. A standard that diverse robots could also use when interacting with each other in public.

Because it would be a wide and persistent standard, the robot protocol can’t be matched in much detail to the specific local costs of implementing various robot capacities. Instead, it could at best be matched to broad overall trends in such costs. To allow robots to walk among us, we’d try to be forgiving and only expect robots to have capacities that we especially value, and that are relatively cheap to implement in a wide range of contexts.

(Of course this general robot protocol isn’t the only thing that would coordinate robot and human interactions. There’d also be many other more context-dependent protocols.)

One simple option would be to expect each public robot to be “tethered” via fast robust communication to a person on call who can rapidly respond to all queries that the robot can’t handle itself. But it isn’t clear how sufficient this approach will be for many possible queries.

Robots would probably be expected to find and comply with any publicly posted rules for interacting in particular spaces, such as the rules we often post for humans on signs. Perhaps we will simplify such rules for robots. In addition, here are some things that people sometimes say to each other in public where we might perhaps want robots to have analogous capacities:

Who are you? What are you doing here? Why are you following me? Please don’t record me. I’m serving you with this legal warrant. Stop, this is the police! You are not allowed to be here; leave. Non-authorized personnel must evacuate this area immediately. Get out of my way. You are hurting me. Why are you calling attention to me? Can you help me? Can you take our picture? Where is the nearest bathroom? Where is a nearby recharging station? (I may add more here.)
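One crude way to picture such a protocol is as a fixed dispatch table from standard public queries to required responses, with anything unhandled escalated to the tethered human operator mentioned above. Everything here is invented for illustration: the class, the query strings, and the responses are hypothetical, not a proposed standard.

```python
# A purely hypothetical sketch of a minimal "robot protocol": a fixed set of
# standard public queries, each mapped to a handler, with a human-operator
# fallback. A real protocol would need authentication, legal nuance, and
# context-dependent behavior far beyond this.

class PublicRobot:
    def __init__(self, robot_id, operator_contact):
        self.robot_id = robot_id
        self.operator_contact = operator_contact
        # Dispatch table: standard query -> handler method.
        self.handlers = {
            "who are you?": self.identify,
            "get out of my way": self.yield_way,
            "please don't record me": self.stop_recording,
        }

    def respond(self, query):
        handler = self.handlers.get(query.strip().lower(), self.escalate)
        return handler()

    def identify(self):
        return f"I am robot {self.robot_id}, operated by {self.operator_contact}."

    def yield_way(self):
        return "Yielding right of way."

    def stop_recording(self):
        return "Recording disabled in your vicinity."

    def escalate(self):
        # The "tethered" fallback: route anything unhandled to a person on call.
        return f"Forwarding your query to {self.operator_contact}."

bot = PublicRobot("R-42", "ACME Robotics support")
print(bot.respond("Who are you?"))
```

The value of standardizing would come less from any one handler than from the fact that every public robot, however different internally, answers the same small query set in predictable ways.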

It seems feasible to start now to think about the design of such a robot protocol. Of course in the end a robot protocol might be just a social convention without the force of law, and it may result more from decentralized evolution than centralized design. Even so, we may now know enough about human social preferences and the broad outlines of the costs of robot capacities to start to usefully think about this problem.
