On Value Drift

The outcomes within any space-time region can be seen as resulting from 1) preferences of various actors able to influence the universe in that region, 2) absolute and relative power and influence of those actors, and 3) constraints imposed by the universe. Changes in outcomes across regions result from changes in these factors.

While you might mostly approve of changes resulting from changing constraints, you might worry more about changes due to changing values and influence. That is, you likely prefer to see more influence by values closer to yours. Unfortunately, the consistent historical trend has been for values to drift over time, increasing the distance between random future and current values. As this trend looks like a random walk, we see no obvious limit to how far values can drift. So if the value you place on the values of others falls rapidly enough with the distance between values, you should expect long term future values to be very wrong.
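
As a toy sketch of this random-walk picture (with made-up parameters, only to illustrate the point, not a real model): treat values as a point in a ten-dimensional space, let each generation add a small random change, and let the value you place on a set of values fall off exponentially with its distance from yours.

```python
import numpy as np

rng = np.random.default_rng(0)

DIMS = 10            # dimensions of the (metaphorical) value space
STEP = 0.05          # size of the random value change per generation
LAMBDA = 1.0         # how fast your valuation falls off with value distance
CHECKPOINTS = (50, 100, 200, 400, 800)

values = np.zeros(DIMS)   # start at "your" values
for gen in range(1, max(CHECKPOINTS) + 1):
    values += rng.normal(0.0, STEP, DIMS)   # random drift each generation
    if gen in CHECKPOINTS:
        distance = np.linalg.norm(values)        # how far values have drifted
        valuation = np.exp(-distance / LAMBDA)   # value you place on them
        print(f"gen {gen:3d}: distance {distance:5.2f}, valuation {valuation:.3f}")

# Expected distance grows roughly as STEP * sqrt(DIMS * gen), with no upper
# bound, so if valuation falls fast enough with distance it heads toward zero.
```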

What influences value change?
Inertia – The more existing values are tied to important entrenched systems, the less they change.
Growth – On average, over time civilization collects more total influence over most everything.
Competition – If some values consistently win key competitive contests, those values become more common.
Influence Drift – Many processes that change the world produce random drift in agent influence.
Internal Drift – Some creatures, e.g., humans, have values that drift internally in complex ways.
Culture Drift – Some creatures, e.g., humans, have values that change together in complex ways.
Context – Many of the above processes depend on other factors, such as technology, wealth, a stable sun, etc.

For many of the above processes, rates of value change are roughly proportional to overall social rates of change. As those rates of change have increased over time, we should expect faster future change. Thus you should expect values to drift faster in the future than they did in the past, leading more quickly to wrong values. Also, people are living longer now than they did in the past. So even if past people didn't live long enough to see changes big enough to greatly bother them, future people may live to see much more change.

Most increases in the rates of change have been concentrated in a few sudden large jumps (associated with the culture, farmer, and industry transitions). As a result, you should expect that rates of change may soon increase greatly. Value drift may continue at past rates until it suddenly goes much faster.

Perhaps you discount the future rapidly, or perhaps the value you place on other values falls slowly with value distance. In these cases value drift may not disturb you much. Otherwise, the situation described above may seem pretty dire. Even if previous generations had to accept the near inevitability of value drift, you might not accept it now. You may be willing to reach for difficult and dangerous changes that could remake the whole situation, such as perhaps a world government. Personally I see that move as too hard and dangerous for now, but I could understand if you disagree.

The people today who seem most concerned about value drift also seem to be especially concerned about humans or ems being replaced by other forms of artificial intelligence. Many such people are also concerned about a “foom” scenario of a large and sudden influence drift: one initially small computer system suddenly becomes able to grow far faster than the rest of the world put together, allowing it to quickly take over the world.

To me, foom seems unlikely: it posits an innovation that is extremely lumpy compared to historical experience, and in addition posits an unusually high difficulty of copying or complementing this innovation. Historically, innovation value has been distributed with a long thin tail: most realized value comes from many small innovations, but we sometimes see lumpier innovations. (Alpha Zero seems only weak evidence on the distribution of AI lumpiness.) The past history of growth rate increases suggests that within a few centuries we may see something, perhaps a very lumpy innovation, that causes a growth rate jump comparable in size to the largest jumps we've ever seen, such as at the origins of life, culture, farming, and industry. However, as the ease of copying and complementing such innovations has been increasing over history, it seems unlikely that copying and complementing will suddenly get much harder.
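
As a toy illustration of what such a long thin tail looks like (with made-up lognormal parameters, not an estimate of actual innovation values): when innovation values are spread this way, the lumpiest one percent of innovations contribute only a modest share of total value, even though occasional lumps do show up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: innovation values drawn from a lognormal distribution
# (made-up parameters, chosen only to give a long thin tail).
N = 100_000
values = rng.lognormal(mean=0.0, sigma=1.0, size=N)

values.sort()                          # ascending
top_1pct = values[int(0.99 * N):]      # the lumpiest 1% of innovations
print(f"largest single innovation: {values[-1]:.1f} "
      f"(median innovation: {np.median(values):.2f})")
print(f"share of total value from the top 1%: {top_1pct.sum() / values.sum():.1%}")
# With these parameters the top 1% holds only about a tenth of total value:
# occasional lumps exist, but most realized value still comes from the many
# small innovations.
```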

While foom seems unlikely, it does seem likely that within a few centuries we will develop machines that can outcompete biological humans for most all jobs. (Such machines might also outcompete ems for jobs, though that outcome is much less clear.) The ability to make such machines seems by itself sufficient to cause a growth rate increase comparable to the other largest historical jumps. Thus the next big jump in growth rates need not be associated with a very lumpy innovation. And in the most natural such scenarios, copying and complementing remain relatively easy.

However, while I expect machines that outcompete humans for jobs, I don't see how that greatly increases the problem of value drift. Human cultural plasticity already ensures that humans are capable of expressing a very wide range of values. I see no obvious limits there. Genetic engineering will allow more changes to humans. Ems inherit human plasticity, and may add even more via direct brain modifications.

In principle, non-em-based artificial intelligence is capable of expressing the entire space of possible values. But in practice, in the shorter run, such AIs will take on social roles near humans, and roles that humans once occupied. This should force AIs to express pretty human-like values. As Steven Pinker says:

Artificial intelligence is like any other technology. It is developed incrementally, designed to satisfy multiple conditions, tested before it is implemented, and constantly tweaked for efficacy and safety.

If Pinker is right, the main AI risk mediated by AI values comes from AI value drift that happens after humans (or ems) no longer exercise such detailed frequent oversight.

It may be possible to create competitive AIs with protected values, i.e., built so that the parts where values are encoded are small, modular, redundantly stored, and insulated from changes to the rest of the system. If so, such AIs may suffer much less from internal drift and cultural drift. Even so, the values of AIs with protected values should still drift due to influence drift and competition.

Thus I don’t see why people concerned with value drift should be especially focused on AI. Yes, AI may accompany faster change, and faster change can make value drift worse for people with intermediate discount rates. (Though it seems to me that altruistic discount rates should scale with actual rates of change, not with arbitrary external clocks.)
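
As a toy sketch of that parenthetical point (with made-up numbers): compare discounting per calendar year to discounting per unit of accumulated change, in a scenario where the rate of change suddenly jumps tenfold.

```python
import numpy as np

# Toy comparison (made-up numbers): discount by calendar time vs. by
# accumulated change, when the rate of change jumps tenfold at year 100.
years = np.arange(300)
rate_of_change = np.where(years < 100, 1.0, 10.0)   # "change" per year
cumulative_change = np.cumsum(rate_of_change)

RHO = 0.02   # discount rate per year, or per unit of change

clock_weight = np.exp(-RHO * years)                # ticks with the calendar
change_weight = np.exp(-RHO * cumulative_change)   # ticks with actual change

for y in (50, 150, 250):
    print(f"year {y}: clock-discount weight {clock_weight[y]:.3f}, "
          f"change-discount weight {change_weight[y]:.2e}")
# A change-indexed discount devalues the post-speedup eras far more sharply
# than a fixed calendar-time discount does.
```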

Yes, AI offers more prospects for protected values, and perhaps also for creating a world/universe government capable of preventing influence drift and competition. But in these cases if you are concerned about value drift, your real concerns are about rates of change and world government, not AI per se. Even the foom scenario just temporarily increases the rate of influence drift.

Your real problem is that you want long term stability in a universe that more naturally changes. Someday we may be able to coordinate to overrule the universe on this. But I doubt we are close enough to even consider that today. To quote a famous prayer:

God, grant me the serenity to accept the things I cannot change,
Courage to change the things I can,
And wisdom to know the difference.

For now value drift seems one of those possibly lamentable facts of life that we cannot change.

  • I think we are concerned with value drift “for the wrong reasons”. If key values change for random internal and external reasons it is worse than if they change because of complex deliberations. If values change because a particular device fooms rather than some collective cultural process, we get concerned. These two examples hint that we do not like arbitrary value drift: it has to happen for some moral-like reason embedding it in the sphere of (joint) autonomous and authentic decisions.

    One can argue that many cultural changes we have seen in the past don’t count as this – reducing sexism was hardly a clear decision but rather the outcome of a lot of arguments, social forces, behavioural copying, and other stuff. Maybe we will see an erosion in (say) politeness for the same complex reasons and we would regard it as a change not grounded in the sphere of autonomy and authenticity (but that might just be status quo bias talking).

    There might be more or less valid ways of changing values. But I suspect the validity of cultural and value change is also something that is strongly value-dependent. Current values think winning the marketplace of ideas is valid but imposition of values by authority isn’t, but this has not always been held and might drift. So a key question is meta-value drift: are there important attractors or stable states for the value drift management values?

    • Yes we might approve of future value changes if we choose them now. But most value change in the past is not the sort of which previous generations would have approved.

  • Cambias

    You’re making the huge assumption that values drift infinitely and/or randomly: that there are no consistent “values attractors” or vectors within that (metaphorical) space which constrain what values are possible, or eliminate outliers.

    I would argue that there are strong “values attractors” — wealth, happiness, safety, group cohesion, psychological well-being, and doubtless others. These remain quite consistent. “Don’t murder people” is one. “Don’t take other people’s stuff” is another. “Do unto others as you would have them do unto you” sums up most of them nicely.

    In short: morality isn’t arbitrary.

    This both drives concern about value drift — because “police the boundaries of acceptable values” is a long-standing value itself — and suggests that there are sound practical reasons to think about AI morality.

    • I listed both random and systematic causes of change. The examples you give of systematic factors fall under the competition factor I listed. Competition will also constrain AI values, so that isn’t a reason to treat them differently.

      • lump1

        But you did say that value drift is a random walk – something you probably shouldn’t claim given what you discussed below that.

      • arch1

        However one buckets them, there is broad longstanding consensus (though obviously not full adherence in practice) on the kinds of value attractors Cambias lists.

        Unfortunately, it is far from clear that this consensus will survive an AI transition

        In the face of this risk, which is qualitatively different from anything in previous transitions, a fatalistic or laissez faire attitude is not appropriate.

      • That’s the main argument I hear: AI is “qualitatively different,” without much of an explanation of how and why that’s relevant.

      • arch1

        Because w/ the AI transition (unlike w/ previous ones), it’s far from clear that the critically important consensus mentioned above will survive.

      • arch1

        ..and perhaps the most important reason for *that* is that after the AI transition, the entities calling the shots may no longer be evolved humans.

      • Michael Wengler

        The values chosen by competition should be strongly dependent on the nature of the creature. Even our closest relatives the chimps seem to have somewhat different values than us. AI is quite different than us in how it reproduces, so it probably doesn’t even look like a single species, and certainly not like us. Our values have us treat other species even more poorly than we treat each other. What if AI optimizes to treat us the way we treat wolves or rats?

  • Paul

    a. People concerned with AI value drift are very often concerned with em plasticity, locked in world government, and genetic engineering value drift. Quantitatively those look like (much) smaller sources of expected drift (in part because AI looks quite likely to come first), which is why people don’t talk about them nearly as much. So this may just be your usual disagreement with futurists over AI timelines.

    b. From behind any reasonable veil of ignorance, “more random value drift amongst human values” benefits me about as much as it hurts me. Much less so for drift from human values to AI values. Similarly, we have strong decision-theoretic reasons to fight less about shifting influence amongst existing actors than about the introduction of new actors that most object to.

    c. I disagree with your implicit claims about past rates of value drift. I mostly care about the conclusions people would reach after an extensive period of deliberation (/ a long iterated chain of choosing how to influence the future), rather than the stance they would currently take on object level moral values. I think that’s both what matters empirically to determining outcomes, and what captures my actual preferences. I don’t believe these values are changing much over time, but they would change radically in the AI scenario.

    d. Put another way, I’m basically happy to endorse continued value drift of the kind we’ve had over the last 10k years. Your post rests crucially on the undesirability of these changes, but that’s a contentious claim that you don’t argue for. I believe that most people now would say “Historical value changes have been good, and so while I might be offended by the details of future changes I’m happy to accept wherever they are going.” As quick examples: I think that religious people like me in the past would be happy that we’ve basically figured out whether God exists and stopped believing in the possible worlds where it doesn’t. I don’t think that my on-reflection answer to how much to respect ancestors or non-human animals is materially different from the on-reflection answer of people like me in previous generations.

    e. (see arguments #1 – #4 here: https://rationalaltruist.com/2013/06/13/against-moral-advocacy/, which apply just as well to value drift).

    • a. I accept that believing AI comes early can make one focus more on AI based scenarios. b. As most human value drift is mediated by new humans, I don’t see why new humans with new values are less of a problem than new AIs with new values. c,d. I just don’t see the basis for your claims that real down deep human values have hardly changed but that AI values naturally change greatly. What is the evidence that you could persuade a 10Kyr ancestor to approve most of our value choices, and yet could not similarly persuade an AI?

      • Paul

        b. I benefit from yesteryear’s value drift. I agree value drift to new humans is worse than influence drift amongst existing humans.

        c, d. For example, this would happen if humans didn’t endorse the dependence of their values on cultural contingencies. People don’t go quite that far, but many go a long ways in that direction. No similar mechanisms prevent AI value drift.

        I’m not sure what evidence you’d find convincing here. Survey answers to the question “If future societies come to different views than modern societies about moral questions, how happy are you to let them make important decisions based on their views / how much would you prefer it if they deferred to our current views?”? I think most people would endorse future views in this question, and even more strongly for individual view evolution. I think most people would say something similar about AI, but in that case there are empirical arguments you can make about AI that will (typically) change their minds, and most will agree (after argument) that whether we should defer does depend on how we build the AI.

      • arch1

        “As most human value drift is mediated by new humans, I don’t see why new humans with new values are less of a problem than new AIs with new values”

        The nature of new humans is much more like previous humans, *and* much less malleable, than that of AIs. Critical values of the kind listed by Cambias are thus at much higher risk of going by the wayside in an AI transition, than in business-as-usual (or even previous-transition) handoffs among human generations.

      • Again with the bald claims, lacking supporting evidence.

      • arch1

        Robin, are you questioning that an entity’s critical values are tied to its nature, or that human nature is less malleable than AIs’ natures will be?

        As to the overall question of the risk of the AI transition to our critical values, the burden of proof should be on those claiming that the risk is manageable and thus not deserving of extraordinary focus.

      • Yes I am questioning the relative degree of value plasticity of humans vs. AI. I didn’t say value drift isn’t an important issue.

      • arch1

        The AI transition will be a time of great upheaval in any case.

        Times of great upheaval test value systems, even ones tethered to an underlying stable evolved nature.

        In the AI transition, that tether will be much weaker; unlike human values, AI values can change with the next SW update or parameter revision. And it is not clear what would compensate for the weakening of the tether.

        So the likelihood of AI values changing (thus the risk to our most critical values) is higher in the AI transition than in previous human experience.

  • lump1

    Marx was basically right in saying that technology broadly determines ideology, and I think this is at least a coarsely predictive theory. In his time, Bentham was a complete oddball ethically. Today the industrialized world has adopted his views on women, slaves, animals and homosexuality. I claim that this could have been predicted by someone who became informed about the “means of production” of the next two centuries. Values do drift, but they respond to material realities in a lawful and predictable way.

    Another stabilizing influence on values has to do with human nature – something that Marx underestimated. We just have a certain moral psychology, and it changes at the speed of evolution, which is orders of magnitude slower than the speed of culture. Hanson’s “Age of Em” does a pretty plausible Marxist extrapolation of future values, expecting the ems to match our moral psychology. That’s a defensible assumption. Social science can’t just scramble all the variables at once. All the same, moral psychology might start to change much faster than evolution-speed in an em-world. It’s just code, and code can be overwritten. There will be strong motives for doing so, just like there would now be a market for pills that block feelings of guilt, shame, regret, alienation and humiliation.

    I suspect that many people are not troubled by normal value drift because of its lawfulness and predictability. Those same people might understandably freak out about what values could do if overwriting moral psychology became an option. The once-predictable process would become completely unmoored.

    • As future material realities are hard to foresee, so are value changes that respond to them. I don’t see why that makes people more comfortable with such value changes.

      • lump1

        Funny, this topic was addressed in a book by a guy named Robin Hanson, who felt pretty comfortable with using social science to predict that technological realities will cause values of ems to drift toward those of human farmers. I don’t think that he saw the future as some hazy question mark. If anything, he was downright preachy about how we can and should use the insights from the social sciences to extrapolate credible predictions about how we will live, love and value in the future. Anyway, that guy was right. Some material realities of the future will surely surprise us, but our moral psychology is hard-wired enough to produce values that responsible academics can responsibly speculate about.

  • Joe

    It seems to me that the claim that humans would converge on a set of true values given enough information and time to reflect on it makes (at least in its explicit argument) the unjustified assumption that values depend only on information. That is, the argument is that people hear more and better arguments that make them change their minds, and eventually after hearing all the arguments they converge on the truth – in this case the things they truly value, in their heart of hearts.

    The problem is that values depend on all sorts of factors, not just info. So while the above argument is just about plausible as a defense of values chosen in light of more info, I don’t see how you can claim any other relevant factor as producing truer values. Are true human values produced by rich humans, or poor? High status humans, or low status? Angry humans, sad humans, calm humans? Working or relaxing humans? And, well, why?

    • I think you are disagreeing with someone else, not me.

      • Joe

        Yes, sorry, to be clear this was meant as a response to Paul’s comments below, his claim c) here in particular. Though it applies to similar arguments others have made before, e.g. Yudkowsky’s CEV idea.

  • Charlene Cobleigh Soreff

    “As this trend looks like a random walk, we see no obvious limit to how far values can drift.”

    But a random walk in a space of what dimensionality?
    Drifts towards the left or the right eventually reverse themselves –
    but in a high dimensional space, one never stumbles home again…

  • Boris Borcic

    Human nature also constrains values. The most elementary relate to the prohibition of murder and a duty to not let people starve to death — which are actually related, since you can’t expect someone who is refused the basics needed to survive to respect any social contract.

    If there possibly existed exceptions through history, I think it’s pretty clear that the above are spontaneous default values…

    …OTOH, the talk of “space-time regions” reminds me of a complex of ideas I’d developed around the end of the cold war and the first Gulf war. A matter of associating the speeches of leaders about “lesser evil” with concrete measures of aggregate evil. You need “space-time regions” to specify the set of events whose evil you’d aggregate.

    The measures of evil I had in mind were (1) the aggregate of casualties, regardless of side, and (2) a function of the kill ratio, given the consideration that when the kill ratio diverges, the situation morphs from one of war (with an ingredient of legitimate self-defense) to one of genocide (with no ingredient of legitimate self-defense).

    • Obviously humans long ago escaped any constraint not to murder or let people starve to death. Naturally rare events are not much constrained by “default values”.