November 26, 2008

Beliefs Require Reasons, or: Is the Pope Catholic? Should he be?

In the early days of this blog, I would pick fierce arguments with Robin about the no-disagreement hypothesis.  Lately, however, reflection on things like public reason has brought me toward agreement with Robin, or at least moderated my disagreement.  To see why, it's perhaps useful to take a look at the newspapers...

the pope said the book “explained with great clarity” that “an interreligious dialogue in the strict sense of the word is not possible.” In theological terms, added the pope, “a true dialogue is not possible without putting one’s faith in parentheses.”

What are we to make of a statement like this?

Continue reading "Beliefs Require Reasons, or: Is the Pope Catholic? Should he be?" »

November 11, 2008

Worse Than Random

Previously in series: Lawful Uncertainty

You may have noticed a certain trend in recent posts:  I've been arguing that randomness hath no power, that there is no beauty in entropy, nor yet strength from noise.

If one were to formalize the argument, it would probably run something like this: that if you define optimization as previously suggested, then sheer randomness will generate something that seems to have 12 bits of optimization, only by trying 4096 times; or 100 bits of optimization, only by trying 10^30 times.
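The arithmetic behind those figures can be checked with a minimal sketch. The function name `tries_for_bits` is hypothetical, and the sketch assumes that "n bits of optimization" means hitting a target that occupies one part in 2^n of the outcome space, so that blind sampling needs on the order of 2^n tries.

```python
def tries_for_bits(bits: int) -> int:
    """Size of the outcome space, relative to the target, for `bits` bits of
    optimization: a target that is one part in 2**bits of the space.
    (Hypothetical helper, for illustration only.)"""
    return 2 ** bits

print(tries_for_bits(12))            # 4096
print(f"{tries_for_bits(100):.3e}")  # about 1.27e30, i.e. on the order of 10^30
```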

This may not sound like a profound insight, since it is true by definition.  But consider - how many comic books talk about "mutation" as if it were a source of power?  Mutation is random.  It's the selection part, not the mutation part, that explains the trends of evolution.

Or you may have heard people talking about "emergence" as if it could explain complex, functional orders.  People will say that the function of an ant colony emerges - as if, starting from ants that had been selected only to function as solitary individuals, the ants got together in a group for the first time and the ant colony popped right out.  But ant colonies have been selected on as colonies by evolution.  Optimization didn't just magically happen when the ants came together.

And you may have heard that certain algorithms in Artificial Intelligence work better when we inject randomness into them.

Is that even possible?  How can you extract useful work from entropy?

But it is possible in theory, since you can have things that are anti-optimized.  Say, the average state has utility -10, but the current state has an unusually low utility of -100.  So in this case, a random jump has an expected benefit.  If you happen to be standing in the middle of a lava pit, running around at random is better than staying in the same place.  (Not best, but better.)

A given AI algorithm can do better when randomness is injected, provided that some step of the unrandomized algorithm is doing worse than random.
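As a rough illustration of the lava-pit point, here is a small sketch with a made-up state space. The utilities in the list are hypothetical, chosen only so that their average is -10; the function name is also an assumption of this example.

```python
def expected_gain_from_random_jump(utilities, current):
    """Average utility of a uniformly random state, minus the utility of the
    current state. Positive means a random jump helps on average."""
    return sum(utilities) / len(utilities) - current

# Hypothetical toy state space whose average utility is -10.
utilities = [-100, -20, -10, 0, 80]

# Standing in the "lava pit" (utility -100): a random jump is expected to help.
print(expected_gain_from_random_jump(utilities, -100))  # 90.0

# Already in the best state (utility 80): a random jump is expected to hurt.
print(expected_gain_from_random_jump(utilities, 80))    # -90.0
```

The random jump only pays off in the first case because the current state is worse than the average state, which is the condition the paragraph above describes.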

Continue reading "Worse Than Random" »

November 10, 2008

Lawful Uncertainty

Previously in series: Lawful Creativity

From Robyn Dawes, Rational Choice in an Uncertain World:

"Many psychological experiments were conducted in the late 1950s and early 1960s in which subjects were asked to predict the outcome of an event that had a random component but yet had base-rate predictability - for example, subjects were asked to predict whether the next card the experimenter turned over would be red or blue in a context in which 70% of the cards were blue, but in which the sequence of red and blue cards was totally random.

In such a situation, the strategy that will yield the highest proportion of success is to predict the more common event.  For example, if 70% of the cards are blue, then predicting blue on every trial yields a 70% success rate.

What subjects tended to do instead, however, was match probabilities - that is, predict the more probable event with the relative frequency with which it occurred.  For example, subjects tended to predict 70% of the time that the blue card would occur and 30% of the time that the red card would occur.  Such a strategy yields a 58% success rate, because the subjects are correct 70% of the time when the blue card occurs (which happens with probability .70) and 30% of the time when the red card occurs (which happens with probability .30); .70 * .70 + .30 * .30 = .58.

In fact, subjects predict the more frequent event with a slightly higher probability than that with which it occurs, but do not come close to predicting its occurrence 100% of the time, even when they are paid for the accuracy of their predictions...  For example, subjects who were paid a nickel for each correct prediction over a thousand trials... predicted [the more common event] 76% of the time."

(Dawes cites:  Tversky, A. and Edwards, W.  1966.  Information versus reward in binary choice.  Journal of Experimental Psychology, 71, 680-683.)
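The two success rates in the quoted passage can be reproduced with a short sketch. The function name `success_rate` is hypothetical, and the sketch assumes each guess is made independently of the card that actually turns up.

```python
def success_rate(p_blue: float, p_predict_blue: float) -> float:
    """Probability of a correct guess when blue occurs with probability p_blue
    and the subject independently predicts blue with probability p_predict_blue."""
    return p_blue * p_predict_blue + (1 - p_blue) * (1 - p_predict_blue)

always_blue = success_rate(0.70, 1.00)  # always predict the more common event
matching = success_rate(0.70, 0.70)     # probability matching

print(round(always_blue, 2))  # 0.7
print(round(matching, 2))     # 0.58
```

Any mixture that predicts red some of the time gives up accuracy relative to always predicting blue, since every red guess is right only 30% of the time.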

Do not think that this experiment is about a minor flaw in gambling strategies.  It compactly illustrates the most important idea in all of rationality.

Continue reading "Lawful Uncertainty" »

November 07, 2008

Recognizing Intelligence

Previously in series: Building Something Smarter

Humans in Funny Suits inveighed against the problem of "aliens" on TV shows and movies who think and act like 21st-century middle-class Westerners, even if they have tentacles or exoskeletons.  If you were going to seriously ask what real aliens might be like, you would try to make fewer assumptions - a difficult task when the assumptions are invisible.

I previously spoke of how you don't have to start out by assuming any particular goals, when dealing with an unknown intelligence.  You can use some of your evidence to deduce the alien's goals, and then use that hypothesis to predict the alien's future achievements, thus making an epistemological profit.

But could you, in principle, recognize an alien intelligence without even hypothesizing anything about its ultimate ends - anything about the terminal values it's trying to achieve?

This sounds like it goes against my suggested definition of intelligence, or even optimization.  How can you recognize something as having a strong ability to hit narrow targets in a large search space, if you have no idea what the target is?

And yet, intuitively, it seems easy to imagine a scenario in which we could recognize an alien's intelligence while having no concept whatsoever of its terminal values - having no idea where it's trying to steer the future.

Continue reading "Recognizing Intelligence" »

November 03, 2008

Complexity and Intelligence

Followup to: Building Something Smarter, Say Not "Complexity", That Alien Message

One of the Godel-inspired challenges to the idea of self-improving minds is based on the notion of "complexity".

Now "complexity", as I've previously mentioned, is a dangerous sort of word.  "Complexity" conjures up the image of a huge machine with incomprehensibly many gears inside - an impressive sort of image.  Thanks to this impressiveness, "complexity" sounds like it could be explaining all sorts of things - that all sorts of phenomena could be happening because of "complexity".

It so happens that "complexity" also names another meaning, strict and mathematical: the Kolmogorov complexity of a pattern is the size of the program code of the shortest Turing machine that produces the pattern as an output, given unlimited tape as working memory.

I immediately note that this mathematical meaning is not the same as the intuitive image that comes to mind when you say "complexity".  The vast impressive-looking collection of wheels and gears?  That's not what the math term means.

Suppose you ran a Turing machine with unlimited tape, so that, starting from our laws of physics, it simulated our whole universe - not just the region of space we see around us, but all regions of space and all quantum branches.  (There are strong indications our universe may be effectively discrete, but if not, just calculate it out to 3^^^3 digits of precision.)

Then the "Kolmogorov complexity" of that entire universe - throughout all of space and all of time, from the Big Bang to whatever end, and all the life forms that ever evolved on Earth and all the decoherent branches of Earth and all the life-bearing planets anywhere, and all the intelligences that ever devised galactic civilizations, and all the art and all the technology and every machine ever built by those civilizations...

...would be 500 bits, or whatever the size of the true laws of physics when written out as equations on a sheet of paper.

The Kolmogorov complexity of just a single planet, like Earth, would of course be much higher than the "complexity" of the entire universe that contains it.
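A toy analogy may help with the claim that a part can be more complex than the whole. Kolmogorov complexity itself is uncomputable, so the sketch below does not compute it; it only contrasts a short program that emits every bit string of length n with the extra information needed to single out one of them. The function names are hypothetical.

```python
from itertools import product

def all_bit_strings(n: int):
    """A short program that outputs every bit string of length n (the 'whole')."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def index_bits_needed(n: int) -> int:
    """Bits needed to point at one particular string among the 2**n outputs."""
    return (2 ** n - 1).bit_length()

n = 16
whole = all_bit_strings(n)
print(len(whole))            # 65536 strings from a generator a few lines long
print(index_bits_needed(n))  # 16 extra bits to pick out one specific string
```

The generator stays a few lines long however large n gets, while the index needed to locate one typical string grows with n. In the same way, describing Earth means describing the universe and then saying where in it to look.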

Continue reading "Complexity and Intelligence" »

October 29, 2008

Economic Definition of Intelligence?

Followup to: Efficient Cross-Domain Optimization

Shane Legg once produced a catalogue of 71 definitions of intelligence.  Looking it over, you'll find that the 18 definitions in dictionaries and the 35 definitions of psychologists are mere black boxes containing human parts.

However, among the 18 definitions from AI researchers, you can find such notions as

"Intelligence measures an agent's ability to achieve goals in a wide range of environments" (Legg and Hutter)

or

"Intelligence is the ability to optimally use limited resources - including time - to achieve goals" (Kurzweil)

or even

"Intelligence is the power to rapidly find an adequate solution in what appears a priori (to observers) to be an immense search space" (Lenat and Feigenbaum)

which is about as close as you can get to my own notion of "efficient cross-domain optimization" without actually measuring optimization power in bits.

But Robin Hanson, whose AI background we're going to ignore for a moment in favor of his better-known identity as an economist, at once said:

"I think what you want is to think in terms of a production function, which describes a system's output on a particular task as a function of its various inputs and features."

Economists spend a fair amount of their time measuring things like productivity and efficiency.  Might they have something to say about how to measure intelligence in generalized cognitive systems?

This is a real question, open to all economists.  So I'm going to quickly go over some of the criteria-of-a-good-definition that stand behind my own proffered suggestion on intelligence, and what I see as the important challenges to a productivity-based view.  It seems to me that this is an important sub-issue of Robin's and my persistent disagreement about the Singularity.

Continue reading "Economic Definition of Intelligence?" »

October 27, 2008

Measuring Optimization Power

Previously in series: Aiming at the Target

Yesterday I spoke of how "When I think you're a powerful intelligence, and I think I know something about your preferences, then I'll predict that you'll steer reality into regions that are higher in your preference ordering."

You can quantify this, at least in theory.  Suppose you have (A) the agent or optimization process's preference ordering, and (B) a measure of the space of outcomes - which, for discrete outcomes in a finite space of possibilities, could just consist of counting them.  Then you can quantify how small a target is being hit, within how large a greater region.

Then we count the total number of states with equal or greater rank in the preference ordering to the outcome achieved, or integrate over the measure of states with equal or greater rank.  Dividing this by the total size of the space gives you the relative smallness of the target - did you hit an outcome that was one in a million?  One in a trillion?

Actually, most optimization processes produce "surprises" that are exponentially more improbable than this - you'd need to try far more than a trillion random reorderings of the letters in a book, to produce a play of quality equalling or exceeding Shakespeare.  So we take the log base two of the reciprocal of that probability (the relative size of the target), and that gives us optimization power in bits.
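For the discrete, counting case, the recipe above can be sketched in a few lines. This is a minimal illustration under stated assumptions: every outcome is weighted equally, higher rank means more preferred, and the function name and the 1024-outcome space are hypothetical.

```python
import math

def optimization_power_bits(ranks, achieved_rank):
    """log2 of (total outcomes / outcomes ranked at least as high as the one
    achieved), assuming a uniform counting measure over the outcomes."""
    at_least_as_good = sum(1 for r in ranks if r >= achieved_rank)
    return math.log2(len(ranks) / at_least_as_good)

# Hypothetical space of 1024 outcomes with ranks 0..1023.
ranks = list(range(1024))

print(optimization_power_bits(ranks, 1023))  # 10.0 (a one-in-1024 outcome)
print(optimization_power_bits(ranks, 512))   # 1.0  (top half of the space)
print(optimization_power_bits(ranks, 0))     # 0.0  (anything would have done)
```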

This figure - roughly, the improbability of an "equally preferred" outcome being produced by a random selection from the space (or measure on the space) - forms the foundation of my Bayesian view of intelligence, or to be precise, optimization power.  It has many subtleties:

Continue reading "Measuring Optimization Power" »

October 26, 2008

Aiming at the Target

Previously in series: Belief in Intelligence

Previously, I spoke of that very strange epistemic position one can occupy, wherein you don't know exactly where Kasparov will move on the chessboard, and yet your state of knowledge about the game is very different than if you faced a random move-generator with the same subjective probability distribution - in particular, you expect Kasparov to win.  I have beliefs about where Kasparov wants to steer the future, and beliefs about his power to do so.

Well, and how do I describe this knowledge, exactly?

In the case of chess, there's a simple function that classifies chess positions into wins for black, wins for white, and drawn games.  If I know which side Kasparov is playing, I know the class of chess positions Kasparov is aiming for.  (If I don't know which side Kasparov is playing, I can't predict whether black or white will win - which is not the same as confidently predicting a drawn game.)

More generally, I can describe motivations using a preference ordering. When I consider two potential outcomes, X and Y, I can say that I prefer X to Y; prefer Y to X; or find myself indifferent between them. I would write these relations as X > Y; X < Y; and X ~ Y.

Suppose that you have the ordering A < B ~ C < D ~ E. Then you like B more than A, and C more than A.  {B, C}, belonging to the same class, seem equally desirable to you; you are indifferent between which of {B, C} you receive, though you would rather have either than A, and you would rather have something from the class {D, E} than {B, C}.
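One simple way to represent such an ordering is to give each outcome the rank of its indifference class. The sketch below encodes A < B ~ C < D ~ E that way; the names `RANK` and `compare` are hypothetical, and nothing here is claimed to be the only possible representation.

```python
# Rank classes for the ordering A < B ~ C < D ~ E (higher rank = more preferred).
RANK = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 2}

def compare(x: str, y: str) -> str:
    """Return '>', '<', or '~' for the relation between outcomes x and y."""
    if RANK[x] > RANK[y]:
        return ">"
    if RANK[x] < RANK[y]:
        return "<"
    return "~"

print("B", compare("B", "A"), "A")  # B > A
print("B", compare("B", "C"), "C")  # B ~ C
print("C", compare("C", "D"), "D")  # C < D
```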

When I think you're a powerful intelligence, and I think I know something about your preferences, then I'll predict that you'll steer reality into regions that are higher in your preference ordering.

Continue reading "Aiming at the Target" »

October 24, 2008

Expected Creative Surprises

Imagine that I'm playing chess against a smarter opponent.  If I could predict exactly where my opponent would move on each turn, I would automatically be at least as good a chess player as my opponent.  I could just ask myself where my opponent would move, if they were in my shoes; and then make the same move myself.  (In fact, to predict my opponent's exact moves, I would need to be superhuman - I would need to predict my opponent's exact mental processes, including their limitations and their errors.  It would become a problem of psychology, rather than chess.)

So predicting an exact move is not possible, but neither is it true that I have no information about my opponent's moves.

Personally, I am a very weak chess player - I play an average of maybe two games per year.  But even if I'm playing against former world champion Garry Kasparov, there are certain things I can predict about his next move.  When the game starts, I can guess that the move P-K4 is more likely than P-KN4.  I can guess that if Kasparov has a move which would allow me to checkmate him on my next move, Kasparov will not make that move.

Much less reliably, I can guess that Kasparov will not make a move that exposes his queen to my capture - but here, I could be greatly surprised; there could be a rationale for a queen sacrifice which I have not seen.

And finally, of course, I can guess that Kasparov will win the game...

Continue reading "Expected Creative Surprises" »

October 05, 2008

My Bayesian Enlightenment

Followup to: The Magnitude of His Own Folly

I remember (dimly, as human memories go) the first time I self-identified as a "Bayesian".  Someone had just asked a malformed version of an old probability puzzle, saying:

If I meet a mathematician on the street, and she says, "I have two children, and at least one of them is a boy," what is the probability that they are both boys?

In the correct version of this story, the mathematician says "I have two children", and you ask, "Is at least one a boy?", and she answers "Yes".  Then the probability is 1/3 that they are both boys.

But in the malformed version of the story - as I pointed out - one would common-sensically reason:

If the mathematician has one boy and one girl, then my prior probability for her saying 'at least one of them is a boy' is 1/2 and my prior probability for her saying 'at least one of them is a girl' is 1/2.  There's no reason to believe, a priori, that the mathematician will only mention a girl if there is no possible alternative.

So I pointed this out, and worked the answer using Bayes's Rule, arriving at a probability of 1/2 that the children were both boys.  I'm not sure whether or not I knew, at this point, that Bayes's rule was called that, but it's what I used.
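Both answers can be reproduced with Bayes's Rule by changing only the likelihood of the statement "at least one of them is a boy" under each family type. This is a sketch under the usual assumptions that the four birth orders are equally likely; the names `PRIOR`, `posterior_both_boys`, `asked`, and `volunteered` are hypothetical.

```python
from fractions import Fraction

# Equal prior over the four (older, younger) combinations.
PRIOR = {f: Fraction(1, 4) for f in ("BB", "BG", "GB", "GG")}

def posterior_both_boys(p_statement):
    """P(BB | statement), where p_statement[f] is the probability that the
    mathematician says 'at least one of them is a boy' given family type f."""
    joint = {f: PRIOR[f] * p_statement[f] for f in PRIOR}
    return joint["BB"] / sum(joint.values())

# Correct version: you ask "Is at least one a boy?" and she answers truthfully.
asked = {"BB": Fraction(1), "BG": Fraction(1), "GB": Fraction(1), "GG": Fraction(0)}

# Malformed version: she volunteers the statement, and with one boy and one
# girl she is equally likely to mention a boy or a girl.
volunteered = {"BB": Fraction(1), "BG": Fraction(1, 2), "GB": Fraction(1, 2), "GG": Fraction(0)}

print(posterior_both_boys(asked))        # 1/3
print(posterior_both_boys(volunteered))  # 1/2
```

Counting the remaining possibilities corresponds to the `asked` likelihoods, which set the probability of the statement to 1 for every family with a boy.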

And lo, someone said to me, "Well, what you just gave is the Bayesian answer, but in orthodox statistics the answer is 1/3.  We just exclude the possibilities that are ruled out, and count the ones that are left, without trying to guess the probability that the mathematician will say this or that, since we have no way of really knowing that probability - it's too subjective."

I responded - note that this was completely spontaneous - "What on Earth do you mean?  You can't avoid assigning a probability to the mathematician making one statement or another.  You're just assuming the probability is 1, and that's unjustified."

To which the one replied, "Yes, that's what the Bayesians say.  But frequentists don't believe that."

And I said, astounded: "How can there possibly be such a thing as non-Bayesian statistics?"

Continue reading "My Bayesian Enlightenment" »
