Category Archives: Bayesian

Measuring Optimization Power

Previously in series: Aiming at the Target

Yesterday I spoke of how "When I think you’re a powerful intelligence, and I think I know something about your preferences, then I’ll predict that you’ll steer reality into regions that are higher in your preference ordering."

You can quantify this, at least in theory.  Suppose you have (A) the agent or optimization process’s preference ordering, and (B) a measure of the space of outcomes – which, for discrete outcomes in a finite space of possibilities, could just consist of counting them.  Then you can quantify how small a target is being hit, within how large a greater region.

Then we count the total number of states with equal or greater rank in the preference ordering to the outcome achieved, or integrate over the measure of states with equal or greater rank.  Dividing this by the total size of the space gives you the relative smallness of the target – did you hit an outcome that was one in a million?  One in a trillion?

Actually, most optimization processes produce "surprises" that are exponentially more improbable than this – you’d need to try far more than a trillion random reorderings of the letters in a book, to produce a play of quality equalling or exceeding Shakespeare.  So we take the log base two of the reciprocal of the target’s relative size (that is, of the probability of hitting it by chance), and that gives us optimization power in bits.
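
As a minimal sketch of this calculation, assume a finite outcome space in which every outcome carries equal measure and the preference ordering is given as a numeric rank (higher is better). The function name `optimization_power_bits` and the toy space of 1024 outcomes are illustrative assumptions, not anything from the post itself.

```python
import math

def optimization_power_bits(ranks, achieved_rank):
    """Return -log2 of the fraction of outcomes ranked at least as high as the one achieved.

    ranks: one preference rank per equally weighted outcome (higher is better).
    achieved_rank: the rank of the outcome that was actually produced.
    """
    at_least_as_good = sum(1 for r in ranks if r >= achieved_rank)
    if at_least_as_good == 0:
        raise ValueError("no outcome in the space has a rank this high")
    # log2(total / count) equals log2 of the reciprocal of the target's relative size.
    return math.log2(len(ranks) / at_least_as_good)

# Hypothetical toy space: 1024 outcomes with distinct ranks 0..1023.
ranks = list(range(1024))
print(optimization_power_bits(ranks, 1023))  # one outcome in 1024 -> 10.0 bits
print(optimization_power_bits(ranks, 512))   # top half of the space -> 1.0 bit
```

With a non-uniform measure, the count would be replaced by a sum of the measure over the same set of outcomes.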

This figure – roughly, the improbability of an "equally preferred" outcome being produced by a random selection from the space (or measure on the space) – forms the foundation of my Bayesian view of intelligence, or to be precise, optimization power.  It has many subtleties:

Continue reading "Measuring Optimization Power" »

Aiming at the Target

Previously in series: Belief in Intelligence

Previously, I spoke of that very strange epistemic position one can occupy, wherein you don’t know exactly where Kasparov will move on the chessboard, and yet your state of knowledge about the game is very different than if you faced a random move-generator with the same subjective probability distribution – in particular, you expect Kasparov to win.  I have beliefs about where Kasparov wants to steer the future, and beliefs about his power to do so.

Well, and how do I describe this knowledge, exactly?

In the case of chess, there’s a simple function that classifies chess positions into wins for black, wins for white, and drawn games.  If I know which side Kasparov is playing, I know the class of chess positions Kasparov is aiming for.  (If I don’t know which side Kasparov is playing, I can’t predict whether black or white will win – which is not the same as confidently predicting a drawn game.)

More generally, I can describe motivations using a preference ordering. When I consider two potential outcomes, X and Y, I can say that I prefer X to Y; prefer Y to X; or find myself indifferent between them. I would write these relations as X > Y; X < Y; and X ~ Y.

Suppose that you have the ordering A < B ~ C < D ~ E. Then you like B more than A, and C more than A.  {B, C}, belonging to the same class, seem equally desirable to you; you are indifferent between which of {B, C} you receive, though you would rather have either than A, and you would rather have something from the class {D, E} than {B, C}.
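
One way to make this ordering concrete, as a sketch only: give each outcome a numeric rank, with equal ranks standing for indifference. The `rank` table and the `compare` helper are hypothetical names chosen for illustration.

```python
# Hypothetical encoding of A < B ~ C < D ~ E: equal rank means indifference.
rank = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 2}

def compare(x, y):
    """Return '>', '<', or '~' for the preference relation between outcomes x and y."""
    if rank[x] > rank[y]:
        return ">"
    if rank[x] < rank[y]:
        return "<"
    return "~"

print(compare("B", "A"))  # >
print(compare("B", "C"))  # ~
print(compare("C", "D"))  # <
```

Only the order of the ranks matters here; any increasing relabeling of the numbers describes the same preferences.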

When I think you’re a powerful intelligence, and I think I know something about your preferences, then I’ll predict that you’ll steer reality into regions that are higher in your preference ordering.

Continue reading "Aiming at the Target" »

Expected Creative Surprises

Imagine that I’m playing chess against a smarter opponent.  If I could predict exactly where my opponent would move on each turn, I would automatically be at least as good a chess player as my opponent.  I could just ask myself where my opponent would move, if they were in my shoes; and then make the same move myself.  (In fact, to predict my opponent’s exact moves, I would need to be superhuman – I would need to predict my opponent’s exact mental processes, including their limitations and their errors.  It would become a problem of psychology, rather than chess.)

So predicting an exact move is not possible, but neither is it true that I have no information about my opponent’s moves.

Personally, I am a very weak chess player – I play an average of maybe two games per year.  But even if I’m playing against former world champion Garry Kasparov, there are certain things I can predict about his next move.  When the game starts, I can guess that the move P-K4 is more likely than P-KN4.  I can guess that if Kasparov has a move which would allow me to checkmate him on my next move, Kasparov will not make that move.

Much less reliably, I can guess that Kasparov will not make a move that exposes his queen to my capture – but here, I could be greatly surprised; there could be a rationale for a queen sacrifice which I have not seen.

And finally, of course, I can guess that Kasparov will win the game…

Continue reading "Expected Creative Surprises" »

My Bayesian Enlightenment

Followup to: The Magnitude of His Own Folly

I remember (dimly, as human memories go) the first time I self-identified as a "Bayesian".  Someone had just asked a malformed version of an old probability puzzle, saying:

If I meet a mathematician on the street, and she says, "I have two children, and at least one of them is a boy," what is the probability that they are both boys?

In the correct version of this story, the mathematician says "I have two children", and you ask, "Is at least one a boy?", and she answers "Yes".  Then the probability is 1/3 that they are both boys.

But in the malformed version of the story – as I pointed out – one would common-sensically reason:

If the mathematician has one boy and one girl, then my prior probability for her saying ‘at least one of them is a boy’ is 1/2 and my prior probability for her saying ‘at least one of them is a girl’ is 1/2.  There’s no reason to believe, a priori, that the mathematician will only mention a girl if there is no possible alternative.

So I pointed this out, and worked the answer using Bayes’s Rule, arriving at a probability of 1/2 that the children were both boys.  I’m not sure whether or not I knew, at this point, that Bayes’s rule was called that, but it’s what I used.
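
The two answers can be reproduced with a short calculation. This is a sketch under stated assumptions: the four birth orders BB, BG, GB, GG are equally likely a priori, a mother of two boys always says "at least one of them is a boy", and a mother of one boy and one girl says it with probability 1/2, as in the reasoning above. The function name `posterior_both_boys` is illustrative.

```python
from fractions import Fraction

def posterior_both_boys(likelihood):
    """Apply Bayes's Rule with a uniform prior over the four family types.

    likelihood: probability of hearing "at least one is a boy" for each family type.
    """
    prior = Fraction(1, 4)
    joint = {family: prior * p for family, p in likelihood.items()}
    return joint["BB"] / sum(joint.values())

# Malformed version: the mathematician volunteers the statement herself.
volunteered = {"BB": Fraction(1), "BG": Fraction(1, 2), "GB": Fraction(1, 2), "GG": Fraction(0)}
# Correct version: she answers "Yes" to the question "Is at least one a boy?".
asked = {"BB": Fraction(1), "BG": Fraction(1), "GB": Fraction(1), "GG": Fraction(0)}

print(posterior_both_boys(volunteered))  # 1/2
print(posterior_both_boys(asked))        # 1/3
```

The only difference between the two runs is the likelihood assigned to the mixed families, which is exactly the probability at issue in the exchange that follows.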

And lo, someone said to me, "Well, what you just gave is the Bayesian answer, but in orthodox statistics the answer is 1/3.  We just exclude the possibilities that are ruled out, and count the ones that are left, without trying to guess the probability that the mathematician will say this or that, since we have no way of really knowing that probability – it’s too subjective."

I responded – note that this was completely spontaneous – "What on Earth do you mean?  You can’t avoid assigning a probability to the mathematician making one statement or another.  You’re just assuming the probability is 1, and that’s unjustified."

To which the one replied, "Yes, that’s what the Bayesians say.  But frequentists don’t believe that."

And I said, astounded: "How can there possibly be such a thing as non-Bayesian statistics?"

Continue reading "My Bayesian Enlightenment" »

Horrible LHC Inconsistency

Followup to: When (Not) To Use Probabilities, How Many LHC Failures Is Too Many?

While trying to answer my own question on "How Many LHC Failures Is Too Many?" I realized that I’m horrendously inconsistent with respect to my stated beliefs about disaster risks from the Large Hadron Collider.

First, I thought that stating a "one-in-a-million" probability for the Large Hadron Collider destroying the world was too high, in the sense that I would much rather run the Large Hadron Collider than press a button with a known 1/1,000,000 probability of destroying the world.

But if you asked me whether I could make one million statements of authority equal to "The Large Hadron Collider will not destroy the world", and be wrong, on average, around once, then I would have to say no.

Unknown pointed out that this turns me into a money pump.  Given a portfolio of a million existential risks to which I had assigned a "less than one in a million probability", I would rather press the button on the fixed-probability device than run a random risk from this portfolio; but would rather take any particular risk in this portfolio than press the button.

Then, I considered the question of how many mysterious failures at the LHC it would take to make me question whether it might destroy the world/universe somehow, and what this revealed about my prior probability.

If the failure had a known 50% probability of occurring from natural causes, like a quantum coin or some such… then I suspect that if I actually saw that coin come up heads 20 times in a row, I would feel a strong impulse to bet on it coming up heads the next time around.  (And that’s taking into account my uncertainty about whether the anthropic principle really works that way.)

Even having noticed this triple inconsistency, I’m not sure in which direction to resolve it!

(But I still maintain my resolve that the LHC is not worth expending political capital, financial capital, or our time to shut down; compared with using the same capital to worry about superhuman intelligence or nanotechnology.)

Optimization

"However many ways there may be of being alive, it is certain that there are vastly more ways of being dead."
        — Richard Dawkins

In the coming days, I expect to be asked:  "Ah, but what do you mean by ‘intelligence’?"  By way of untangling some of my dependency network for future posts, I here summarize some of my notions of "optimization".

Consider a car; say, a Toyota Corolla.  The Corolla is made up of some number of atoms; say, on the rough order of 10^29.  If you consider all possible ways to arrange 10^29 atoms, only an infinitesimally tiny fraction of possible configurations would qualify as a car; if you picked one random configuration per Planck interval, many ages of the universe would pass before you hit on a wheeled wagon, let alone an internal combustion engine.

Even restricting our attention to running vehicles, there is an astronomically huge design space of possible vehicles that could be composed of the same atoms as the Corolla, and most of them, from the perspective of a human user, won’t work quite as well.  We could take the parts in the Corolla’s air conditioner, and mix them up in thousands of possible configurations; nearly all these configurations would result in a vehicle lower in our preference ordering, still recognizable as a car but lacking a working air conditioner.

So there are many more configurations corresponding to nonvehicles, or vehicles lower in our preference ranking, than vehicles ranked greater than or equal to the Corolla.

Similarly with the problem of planning, which also involves hitting tiny targets in a huge search space.  Consider the number of possible legal chess moves versus the number of winning moves.

Which suggests one theoretical way to measure optimization – to quantify the power of a mind or mindlike process:

Continue reading "Optimization" »

The Truly Iterated Prisoner’s Dilemma

Followup to: The True Prisoner’s Dilemma

For everyone who thought that the rational choice in yesterday’s True Prisoner’s Dilemma was to defect, a follow-up dilemma:

Suppose that the dilemma was not one-shot, but was rather to be repeated exactly 100 times, where for each round, the payoff matrix looks like this:

                   Humans: C                                             Humans: D
Paperclipper: C    (2 million human lives saved, 2 paperclips gained)    (+3 million lives, +0 paperclips)
Paperclipper: D    (+0 lives, +3 paperclips)                             (+1 million lives, +1 paperclip)

As most of you probably know, the king of the classical iterated Prisoner’s Dilemma is Tit for Tat, which cooperates on the first round, and on succeeding rounds does whatever its opponent did last time.  But what most of you may not realize, is that, if you know when the iteration will stop, Tit for Tat is – according to classical game theory – irrational.

Why?  Consider the 100th round.  On the 100th round, there will be no future iterations, no chance to retaliate against the other player for defection.  Both of you know this, so the game reduces to the one-shot Prisoner’s Dilemma.  Since you are both classical game theorists, you both defect.

Now consider the 99th round.  Both of you know that you will both defect in the 100th round, regardless of what either of you does in the 99th round.  So you both know that your future payoff doesn’t depend on your current action, only your current payoff.  You are both classical game theorists.  So you both defect.

Now consider the 98th round…
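
The backward-induction argument can be sketched in code. The payoffs are read off the matrix above as (millions of lives saved, paperclips gained); the function names `stage_equilibrium` and `backward_induction` are illustrative assumptions, and the sketch covers only the classical argument being described, not a claim about what either party should actually do.

```python
# Stage payoffs keyed by (human move, paperclipper move):
# (millions of human lives saved, paperclips gained).
PAYOFF = {
    ("C", "C"): (2, 2),
    ("D", "C"): (3, 0),
    ("C", "D"): (0, 3),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")

def stage_equilibrium():
    """List the move pairs where neither side gains by deviating within one round."""
    equilibria = []
    for h in MOVES:
        for p in MOVES:
            human_ok = all(PAYOFF[(h, p)][0] >= PAYOFF[(alt, p)][0] for alt in MOVES)
            clip_ok = all(PAYOFF[(h, p)][1] >= PAYOFF[(h, alt)][1] for alt in MOVES)
            if human_ok and clip_ok:
                equilibria.append((h, p))
    return equilibria

def backward_induction(rounds):
    """Solve from the last round to the first.

    Predicted future play does not depend on the current move, so the
    continuation value is a constant added to every cell of the matrix,
    and each round reduces to the one-shot game.
    """
    plan = []
    totals = (0, 0)
    for round_number in range(rounds, 0, -1):
        (h, p), = stage_equilibrium()  # the one-shot game has exactly one equilibrium
        lives, clips = PAYOFF[(h, p)]
        totals = (totals[0] + lives, totals[1] + clips)
        plan.append((round_number, h, p))
    return list(reversed(plan)), totals

plan, totals = backward_induction(100)
print(plan[0], plan[-1])  # (1, 'D', 'D') (100, 'D', 'D')
print(totals)             # (100, 100)
```

Under these payoffs, 100 rounds of mutual defection yield 100 million lives and 100 paperclips, against 200 million lives and 200 paperclips for 100 rounds of mutual cooperation.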

With humanity and the Paperclipper facing 100 rounds of the iterated Prisoner’s Dilemma, do you really truly think that the rational thing for both parties to do, is steadily defect against each other for the next 100 rounds?

The True Prisoner’s Dilemma

It occurred to me one day that the standard visualization of the Prisoner’s Dilemma is fake.

The core of the Prisoner’s Dilemma is this symmetric payoff matrix:

         1: C      1: D
2: C    (3, 3)    (5, 0)
2: D    (0, 5)    (2, 2)

Player 1 and Player 2 can each choose C or D.  1 and 2’s utility for the final outcome is given by the first and second number in the pair.  For reasons that will become apparent, "C" stands for "cooperate" and "D" stands for "defect".

Observe that a player in this game (regarding themselves as the first player) has this preference ordering over outcomes:  (D, C) > (C, C) > (D, D) > (C, D).

D, it would seem, dominates C:  If the other player chooses C, you prefer (D, C) to (C, C); and if the other player chooses D, you prefer (D, D) to (C, D).  So you wisely choose D, and as the payoff table is symmetric, the other player likewise chooses D.

If only you’d both been less wise!  You both prefer (C, C) to (D, D).  That is, you both prefer mutual cooperation to mutual defection.
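
The dominance argument and its unhappy conclusion can both be checked mechanically. In this sketch, the utilities are taken from the matrix above and keyed by (my move, other player's move); the names `MY_UTILITY` and `strictly_dominates` are illustrative.

```python
# My utility for each (my move, other player's move), read off the payoff matrix.
MY_UTILITY = {("D", "C"): 5, ("C", "C"): 3, ("D", "D"): 2, ("C", "D"): 0}

def strictly_dominates(a, b):
    """True if move a gives me strictly more than move b whatever the other player does."""
    return all(MY_UTILITY[(a, other)] > MY_UTILITY[(b, other)] for other in ("C", "D"))

print(strictly_dominates("D", "C"))                       # True
print(MY_UTILITY[("C", "C")] > MY_UTILITY[("D", "D")])    # True
```

Both lines print True: D beats C against either move, and yet each player does better under mutual cooperation than under mutual defection.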

The Prisoner’s Dilemma is one of the great foundational issues in decision theory, and enormous volumes of material have been written about it.  Which makes it an audacious assertion of mine, that the usual way of visualizing the Prisoner’s Dilemma has a severe flaw, at least if you happen to be human.

Continue reading "The True Prisoner’s Dilemma" »

Can Counterfactuals Be True?

Followup to: Probability is Subjectively Objective

The classic explanation of counterfactuals begins with this distinction:

  1. If Lee Harvey Oswald didn’t shoot John F. Kennedy, then someone else did.
  2. If Lee Harvey Oswald hadn’t shot John F. Kennedy, someone else would have.

In ordinary usage we would agree with the first statement, but not the second (I hope).

If, somehow, we learn the definite fact that Oswald did not shoot Kennedy, then someone else must have done so, since Kennedy was in fact shot.

But if we went back in time and removed Oswald, while leaving everything else the same, then – unless you believe there was a conspiracy – there’s no particular reason to believe Kennedy would be shot:

We start by imagining the same historical situation that existed in 1963 – by a further act of imagination, we remove Oswald from our vision – we run forward the laws that we think govern the world – visualize Kennedy parading through in his limousine – and find that, in our imagination, no one shoots Kennedy.

It’s an interesting question whether counterfactuals can be true or false.  We never get to experience them directly.

Continue reading "Can Counterfactuals Be True?" »

When (Not) To Use Probabilities

Followup to: Should We Ban Physics?

It may come as a surprise to some readers of this blog, that I do not always advocate using probabilities.

Or rather, I don’t always advocate that human beings, trying to solve their problems, should try to make up verbal probabilities, and then apply the laws of probability theory or decision theory to whatever number they just made up, and then use the result as their final belief or decision.

The laws of probability are laws, not suggestions, but often the true Law is too difficult for us humans to compute.  If P != NP and the universe has no source of exponential computing power, then there are evidential updates too difficult for even a superintelligence to compute – even though the probabilities would be quite well-defined, if we could afford to calculate them.

So sometimes you don’t apply probability theory.  Especially if you’re human, and your brain has evolved with all sorts of useful algorithms for uncertain reasoning, that don’t involve verbal probability assignments.

Not sure where a flying ball will land?  I don’t advise trying to formulate a probability distribution over its landing spots, performing deliberate Bayesian updates on your glances at the ball, and calculating the expected utility of all possible strings of motor instructions to your muscles.

Continue reading "When (Not) To Use Probabilities" »
