October 14, 2008

Ends Don't Justify Means (Among Humans)

Followup to: Why Does Power Corrupt?

"If the ends don't justify the means, what does?"
        -- variously attributed

"I think of myself as running on hostile hardware."
        -- Justin Corwin

Yesterday I talked about how humans may have evolved a structure of political revolution, beginning by believing themselves morally superior to the corrupt current power structure, but ending by being corrupted by power themselves - not by any plan in their own minds, but by the echo of ancestors who did the same and thereby reproduced.

This fits the template:

In some cases, human beings have evolved in such fashion as to think that they are doing X for prosocial reason Y, but when human beings actually do X, other adaptations execute to promote self-benefiting consequence Z.

From this proposition, I now move on to my main point, a question considerably outside the realm of classical Bayesian decision theory:

"What if I'm running on corrupted hardware?"

Continue reading "Ends Don't Justify Means (Among Humans)" »

October 13, 2008

Why Does Power Corrupt?

Followup to: Evolutionary Psychology

"Power tends to corrupt, and absolute power corrupts absolutely.  Great men are almost always bad men."
        -- Lord Acton

Call it a just-so story if you must, but as soon as I was introduced to the notion of evolutionary psychology (~1995), it seemed obvious to me why human beings are corrupted by power.  I didn't then know that hunter-gatherer bands tend to be more egalitarian than agricultural tribes - much less likely to have a central tribal-chief boss-figure - and so I thought of it this way:

Humans (particularly human males) have evolved to exploit power and status when they obtain it, for the obvious reason:  If you use your power to take many wives and favor your children with a larger share of the meat, then you will leave more offspring, ceteris paribus.  But you're not going to have much luck becoming tribal chief if you just go around saying, "Put me in charge so that I can take more wives and favor my children."  You could lie about your reasons, but human beings are not perfect deceivers.

So one strategy that an evolution could follow, would be to create a vehicle that reliably tended to start believing that the old power-structure was corrupt, and that the good of the whole tribe required their overthrow...

Continue reading "Why Does Power Corrupt?" »

September 23, 2008

That Tiny Note of Discord

Followup to: The Sheer Folly of Callow Youth

When we last left Eliezer1997, he believed that any superintelligence would automatically do what was "right", and indeed would understand that better than we could; even though, he modestly confessed, he did not understand the ultimate nature of morality.  Or rather, after some debate had passed, Eliezer1997 had evolved an elaborate argument, which he fondly claimed to be "formal", that we could always condition upon the belief that life has meaning; and so cases where superintelligences did not feel compelled to do anything in particular, would fall out of consideration.  (The flaw being the unconsidered and unjustified equation of "universally compelling argument" with "right".)

So far, the young Eliezer is well on the way toward joining the "smart people who are stupid because they're skilled at defending beliefs they arrived at for unskilled reasons".  All his dedication to "rationality" has not saved him from this mistake, and you might be tempted to conclude that it is useless to strive for rationality.

But while many people dig holes for themselves, not everyone succeeds in clawing their way back out.

And from this I learn my lesson:  That it all began -

- with a small, small question; a single discordant note; one tiny lonely thought...

Continue reading "That Tiny Note of Discord" »

September 18, 2008

The Sheer Folly of Callow Youth

Followup to: My Childhood Death Spiral, My Best and Worst Mistake, A Prodigy of Refutation

"There speaks the sheer folly of callow youth; the rashness of an ignorance so abysmal as to be possible only to one of your ephemeral race..."
        -- Gharlane of Eddore

Once upon a time, years ago, I propounded a mysterious answer to a mysterious question - as I've hinted on several occasions.  The mysterious question to which I propounded a mysterious answer was not, however, consciousness - or rather, not only consciousness.  No, the more embarrassing error was that I took a mysterious view of morality.

I held off on discussing that until now, after the series on metaethics, because I wanted it to be clear that Eliezer1997 had gotten it wrong.

When we last left off, Eliezer1997, not satisfied with arguing in an intuitive sense that superintelligence would be moral, was setting out to argue inescapably that creating superintelligence was the right thing to do.

Well (said Eliezer1997) let's begin by asking the question:  Does life have, in fact, any meaning?

Continue reading "The Sheer Folly of Callow Youth" »

September 04, 2008

The Truly Iterated Prisoner's Dilemma

Followup to: The True Prisoner's Dilemma

For everyone who thought that the rational choice in yesterday's True Prisoner's Dilemma was to defect, a follow-up dilemma:

Suppose that the dilemma was not one-shot, but was rather to be repeated exactly 100 times, where for each round, the payoff matrix looks like this:

                  Humans: C                                            Humans: D
Paperclipper: C   (2 million human lives saved, 2 paperclips gained)   (+3 million lives, +0 paperclips)
Paperclipper: D   (+0 lives, +3 paperclips)                            (+1 million lives, +1 paperclip)

As most of you probably know, the king of the classical iterated Prisoner's Dilemma is Tit for Tat, which cooperates on the first round, and on succeeding rounds does whatever its opponent did last time.  But what most of you may not realize, is that, if you know when the iteration will stop, Tit for Tat is - according to classical game theory - irrational.

Why?  Consider the 100th round.  On the 100th round, there will be no future iterations, no chance to retaliate against the other player for defection.  Both of you know this, so the game reduces to the one-shot Prisoner's Dilemma.  Since you are both classical game theorists, you both defect.

Now consider the 99th round.  Both of you know that you will both defect in the 100th round, regardless of what either of you do in the 99th round.  So you both know that your future payoff doesn't depend on your current action, only your current payoff.  You are both classical game theorists.  So you both defect.

Now consider the 98th round...

With humanity and the Paperclipper facing 100 rounds of the iterated Prisoner's Dilemma, do you really truly think that the rational thing for both parties to do, is steadily defect against each other for the next 100 rounds?
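The backward-induction argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `STAGE`, `best_reply`, and `backward_induction` are hypothetical, the matrix is read with the humans' move as the column and the Paperclipper's move as the row, and each player is assumed to count only its own payoff (lives for the humans, paperclips for the Paperclipper). The sketch also builds in the classical premise that play in later rounds does not depend on what happens now.

```python
# Per-round payoffs from the matrix above, keyed by (humans' move, Paperclipper's move).
# Each value is (millions of human lives saved, paperclips gained).
STAGE = {
    ("C", "C"): (2, 2),
    ("D", "C"): (3, 0),
    ("C", "D"): (0, 3),
    ("D", "D"): (1, 1),
}
ROUNDS = 100


def best_reply(player, other_move, continuation):
    """Best move for `player` (0 = humans, 1 = Paperclipper) against `other_move`.

    `continuation` is the payoff from all later rounds. Under the classical
    assumption it is the same whichever move is chosen now.
    """
    def total(move):
        key = (move, other_move) if player == 0 else (other_move, move)
        return STAGE[key][player] + continuation[player]

    return max(("C", "D"), key=total)


def backward_induction(rounds=ROUNDS):
    """Solve from the last round back to the first; return (plan, total payoffs)."""
    continuation = (0, 0)  # nothing comes after the final round
    plan = []
    for _ in range(rounds):
        moves = []
        for player in (0, 1):
            # Collect the best reply to each possible opposing move.
            replies = {best_reply(player, other, continuation) for other in ("C", "D")}
            if len(replies) != 1:
                raise ValueError("no dominant move in this round")
            moves.append(replies.pop())
        humans, clipper = moves
        lives, clips = STAGE[(humans, clipper)]
        continuation = (continuation[0] + lives, continuation[1] + clips)
        plan.append((humans, clipper))
    plan.reverse()  # rounds were solved last-to-first
    return plan, continuation


plan, totals = backward_induction()
cooperative = tuple(ROUNDS * x for x in STAGE[("C", "C")])
print(set(plan), totals, cooperative)
# prints: {('D', 'D')} (100, 100) (200, 200)
```

Under these assumptions the classical answer is 100 rounds of mutual defection, for 100 million lives and 100 paperclips, against the 200 million lives and 200 paperclips that 100 rounds of mutual cooperation would have yielded.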

September 03, 2008

The True Prisoner's Dilemma

It occurred to me one day that the standard visualization of the Prisoner's Dilemma is fake.

The core of the Prisoner's Dilemma is this symmetric payoff matrix:

        1: C      1: D
2: C    (3, 3)    (5, 0)
2: D    (0, 5)    (2, 2)

Player 1 and Player 2 can each choose C or D.  1 and 2's utility for the final outcome is given by the first and second number in the pair.  For reasons that will become apparent, "C" stands for "cooperate" and "D" stands for "defect".

Observe that a player in this game (regarding themselves as the first player) has this preference ordering over outcomes:  (D, C) > (C, C) > (D, D) > (C, D).

D, it would seem, dominates C:  If the other player chooses C, you prefer (D, C) to (C, C); and if the other player chooses D, you prefer (D, D) to (C, D).  So you wisely choose D, and as the payoff table is symmetric, the other player likewise chooses D.

If only you'd both been less wise!  You both prefer (C, C) to (D, D).  That is, you both prefer mutual cooperation to mutual defection.
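The dominance argument can be checked mechanically. The following is a minimal Python sketch, not anything from the original post: the names `PAYOFFS` and `strictly_dominates` are hypothetical, and the matrix is read with Player 1's move as the column and Player 2's move as the row, which is the reading that matches the preference ordering given above.

```python
# Payoffs from the matrix above, keyed by (player 1's move, player 2's move).
# Each value is (player 1's utility, player 2's utility).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
    ("D", "D"): (2, 2),
}


def strictly_dominates(a, b, player):
    """True if move `a` beats move `b` for `player` (0 or 1) against every opposing move."""
    for other in ("C", "D"):
        if player == 0:
            u_a = PAYOFFS[(a, other)][0]
            u_b = PAYOFFS[(b, other)][0]
        else:
            u_a = PAYOFFS[(other, a)][1]
            u_b = PAYOFFS[(other, b)][1]
        if u_a <= u_b:
            return False
    return True


# D dominates C for both players...
d_dominates = all(strictly_dominates("D", "C", p) for p in (0, 1))
# ...yet both players prefer (C, C) to (D, D).
both_prefer_cc = all(PAYOFFS[("C", "C")][p] > PAYOFFS[("D", "D")][p] for p in (0, 1))
print(d_dominates, both_prefer_cc)
# prints: True True
```

Both checks come out true at once, which is the whole dilemma: each player's individually dominant move leads to an outcome that both rank below mutual cooperation.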

The Prisoner's Dilemma is one of the great foundational issues in decision theory, and enormous volumes of material have been written about it.  Which makes it an audacious assertion of mine, that the usual way of visualizing the Prisoner's Dilemma has a severe flaw, at least if you happen to be human.

Continue reading "The True Prisoner's Dilemma" »

August 29, 2008

Moral False Consensus

Social butterflies know less than they think about their friends' ethics:

[Researchers] asked groups of workers and business students about ethical dilemmas ... [seeking] evidence of "false consensus bias" -- that is, the tendency of people to project their values and behaviors onto others.  As the size of [individual social] networks grew, so did the extent at which individuals overestimated how many others would agree with them.  Why? People discuss "safe subjects in the workplace -- sports, kids, current events," the researchers wrote. So "little of the insights that people gain from social ties may apply" to moral dilemmas.

August 24, 2008

Magical Categories

Followup to: Anthropomorphic Optimism, Superexponential Conceptspace, The Hidden Complexity of Wishes, Unnatural Categories

'We can design intelligent machines so their primary, innate emotion is unconditional love for all humans.  First we can build relatively simple machines that learn to recognize happiness and unhappiness in human facial expressions, human voices and human body language.  Then we can hard-wire the result of this learning as the innate emotional values of more complex intelligent machines, positively reinforced when we are happy and negatively reinforced when we are unhappy.'
        -- Bill Hibbard (2001), Super-intelligent machines.

That was published in a peer-reviewed journal, and the author later wrote a whole book about it, so this is not a strawman position I'm discussing here.

So... um... what could possibly go wrong...

When I mentioned (sec. 6) that Hibbard's AI ends up tiling the galaxy with tiny molecular smiley-faces, Hibbard wrote an indignant reply saying:

'When it is feasible to build a super-intelligence, it will be feasible to build hard-wired recognition of "human facial expressions, human voices and human body language" (to use the words of mine that you quote) that exceed the recognition accuracy of current humans such as you and me, and will certainly not be fooled by "tiny molecular pictures of smiley-faces." You should not assume such a poor implementation of my idea that it cannot make discriminations that are trivial to current humans.'

Continue reading "Magical Categories" »

August 23, 2008

Unnatural Categories

Followup to: Disguised Queries, Superexponential Conceptspace

If a tree falls in the forest, and no one hears it, does it make a sound?

"Tell me why you want to know," says the rationalist, "and I'll tell you the answer."  If you want to know whether your seismograph, located nearby, will register an acoustic wave, then the experimental prediction is "Yes"; so, for seismographic purposes, the tree should be considered to make a sound.  If instead you're asking some question about firing patterns in a human auditory cortex - for whatever reason - then the answer is that no such patterns will be changed when the tree falls.

What is a poison?  Hemlock is a "poison"; so is cyanide; so is viper venom.  Carrots, water, and oxygen are "not poison".  But what determines this classification?  You would be hard pressed, just by looking at hemlock and cyanide and carrots and water, to tell what sort of difference is at work.  You would have to administer the substances to a human - preferably one signed up for cryonics - and see which ones proved fatal.  (And at that, the definition is still subtler than it appears: a ton of carrots, dropped on someone's head, will also prove fatal. You're really asking about fatality from metabolic disruption, after administering doses small enough to avoid mechanical damage and blockage, at room temperature, at low velocity.)

Where poison-ness is concerned, you are not classifying via a strictly local property of the substance.  You are asking about the consequence when a dose of that substance is applied to a human metabolism.  The local difference between a human who gasps and keels over, versus a human alive and healthy, is more compactly discriminated, than any local difference between poison and non-poison.

Continue reading "Unnatural Categories" »

August 22, 2008

Mirrors and Paintings

Followup to: Sorting Pebbles Into Correct Heaps, Invisible Frameworks

Background: There's a proposal for Friendly AI called "Coherent Extrapolated Volition" which I don't really want to divert the discussion to, right now.  Among many other things, CEV involves pointing an AI at humans and saying (in effect) "See that?  That's where you find the base content for self-renormalizing morality."

Hal Finney commented on the Pebblesorter parable:

I wonder what the Pebblesorter AI would do if successfully programmed to implement [CEV]...  Would the AI pebblesort?  Or would it figure that if the Pebblesorters got smarter, they would see that pebblesorting was pointless and arbitrary?  Would the AI therefore adopt our own parochial morality, forbidding murder, theft and sexual intercourse among too-young people?  Would that be the CEV of Pebblesorters?

I imagine we would all like to think so, but it smacks of parochialism, of objective morality.  I can't help thinking that Pebblesorter CEV would have to include some aspect of sorting pebbles.  Doesn't that suggest that CEV can malfunction pretty badly?

I'm giving this question its own post, for that it touches on similar questions I once pondered - dilemmas that forced my current metaethics as the resolution.

Yes indeed:  A CEV-type AI, taking Pebblesorters as its focus, would wipe out the Pebblesorters and sort the universe into prime-numbered heaps.

This is not the right thing to do.

That is not a bug.

Continue reading "Mirrors and Paintings" »