<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: A game for self-Calibration?</title>
	<atom:link href="http://www.overcomingbias.com/2007/01/a_game_for_self.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html</link>
	<description>Overcoming Bias is economist Robin Hanson’s blog, on honesty, signaling, disagreement, forecasting, and the far future.</description>
	<lastBuildDate>Sat, 11 Feb 2012 23:23:58 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
	<item>
		<title>By: digital retrograde</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422992</link>
		<dc:creator>digital retrograde</dc:creator>
		<pubDate>Thu, 18 Jan 2007 03:55:06 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422992</guid>
		<description>&lt;strong&gt;The Idol Grinder&lt;/strong&gt;

Simon: Appalling. Randy: Dude, why are you here? Paula covers her ears. These American Idol contestants, you know the ones, have no concept of how much they can&#039;t sing. They&#039;re not self-critical or introspective enough. Of cours...
</description>
		<content:encoded><![CDATA[<p><strong>The Idol Grinder</strong></p>
<p>Simon: Appalling. Randy: Dude, why are you here? Paula covers her ears. These American Idol contestants, you know the ones, have no concept of how much they can&#8217;t sing. They&#8217;re not self-critical or introspective enough. Of cours&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robin Hanson</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422991</link>
		<dc:creator>Robin Hanson</dc:creator>
		<pubDate>Thu, 11 Jan 2007 23:25:12 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422991</guid>
		<description>Nick, your game of predicting true story sequels is close to a proposal of mine.  I want to take 100 or so random sample people and describe them each in terms of 100 or so parameters, everything from height to income to how neat their room is.  I then want a web page where visitors would describe themselves in terms of these features, and then be tested on the sample people.  The test is this: for each sample person they had time to look at, they would be shown a random half of that person&#039;s features, and try to guess the other half of those features (ideally assigning probability distributions, even joint distributions).

Enough of this sort of data and we could figure out which features people use to infer which other features of people.  This would go a long way toward showing us the various signaling games we play.   But it would not help much to calibrate people&#039;s rationality under disagreement; for that we need games where people react to the estimates of others.
</description>
		<content:encoded><![CDATA[<p>Nick, your game of predicting true story sequels is close to a proposal of mine.  I want to take 100 or so random sample people and describe them each in terms of 100 or so parameters, everything from height to income to how neat their room is.  I then want a web page where visitors would describe themselves in terms of these features, and then be tested on the sample people.  The test is this: for each sample person they had time to look at, they would be shown a random half of that person&#8217;s features, and try to guess the other half of those features (ideally assigning probability distributions, even joint distributions).</p>
<p>Enough of this sort of data and we could figure out which features people use to infer which other features of people.  This would go a long way toward showing us the various signaling games we play.   But it would not help much to calibrate people&#8217;s rationality under disagreement; for that we need games where people react to the estimates of others.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick Bostrom</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422990</link>
		<dc:creator>Nick Bostrom</dc:creator>
		<pubDate>Thu, 11 Jan 2007 22:32:13 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422990</guid>
		<description>Here&#039;s another game that could be constructed, although it would take more work. The ideal is to have estimation tasks that are similar to ones we face in ordinary life rather than trivia quizzes or mathematical puzzles.

The idea is to collect a number of case descriptions. A case description could be a one-pager describing the basic facts about the early stages of a relationship, the business plan of a start-up company, or the biography of a person up to a certain age. Participants would read the case descriptions and try to estimate various outcome measures - whether the start-up succeeded, how long the relationship lasted, what became of the individual 20 years later. The estimates would be compared to the actual outcome, which would have been recorded during the preparation of the case descriptions. Participants would be scored on both calibration and discrimination, given feedback, and the game would continue for many rounds with new case descriptions so that participants could improve over time. You could also have teams who would be allowed to discuss between themselves before issuing the team&#039;s estimates.

For it to work well, it would be important to select the cases randomly from the relevant sample population. If one sampled from 50% successful and 50% unsuccessful companies, or only from people who had biographies written about them, one would reduce the value of the game. So unless one could think of some clever way of compiling relevant cases, it would take a significant effort to put together this kind of game for general life domains. On the other hand, such a game would seem to me to have great educational value.

It might be a good investment for an enlightened ministry of education somewhere to produce and promote such material. From a scientific point of view, it would also be interesting to study how much performance in such a game would correlate with G, experience, political views, and other factors. Perhaps it might also be useful for diagnosing some psychiatric problems.


</description>
		<content:encoded><![CDATA[<p>Here&#8217;s another game that could be constructed, although it would take more work. The ideal is to have estimation tasks that are similar to ones we face in ordinary life rather than trivia quizzes or mathematical puzzles.</p>
<p>The idea is to collect a number of case descriptions. A case description could be a one-pager describing the basic facts about the early stages of a relationship, the business plan of a start-up company, or the biography of a person up to a certain age. Participants would read the case descriptions and try to estimate various outcome measures &#8211; whether the start-up succeeded, how long the relationship lasted, what became of the individual 20 years later. The estimates would be compared to the actual outcome, which would have been recorded during the preparation of the case descriptions. Participants would be scored on both calibration and discrimination, given feedback, and the game would continue for many rounds with new case descriptions so that participants could improve over time. You could also have teams who would be allowed to discuss between themselves before issuing the team&#8217;s estimates.</p>
<p>For it to work well, it would be important to select the cases randomly from the relevant sample population. If one sampled from 50% successful and 50% unsuccessful companies, or only from people who had biographies written about them, one would reduce the value of the game. So unless one could think of some clever way of compiling relevant cases, it would take a significant effort to put together this kind of game for general life domains. On the other hand, such a game would seem to me to have great educational value.</p>
<p>It might be a good investment for an enlightened ministry of education somewhere to produce and promote such material. From a scientific point of view, it would also be interesting to study how much performance in such a game would correlate with G, experience, political views, and other factors. Perhaps it might also be useful for diagnosing some psychiatric problems.</p>
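<p>A minimal sketch of how the two scores might be computed, assuming yes/no outcomes and probability forecasts. The function names and the particular discrimination measure (mean forecast on cases that happened minus mean forecast on cases that did not) are illustrative assumptions, not something specified in the proposal.</p>

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Map each stated probability to the observed frequency of the event."""
    bins = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        bins[p].append(y)
    return {p: sum(ys) / len(ys) for p, ys in sorted(bins.items())}


def discrimination_gap(forecasts, outcomes):
    """Mean forecast on cases that happened minus mean forecast on the rest.

    Assumes at least one case of each kind; this is one simple measure
    among several possible ones.
    """
    happened = [p for p, y in zip(forecasts, outcomes) if y]
    did_not = [p for p, y in zip(forecasts, outcomes) if not y]
    return sum(happened) / len(happened) - sum(did_not) / len(did_not)


# Hypothetical participant: eight cases, forecasts of 75% or 25%.
forecasts = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
table = calibration_table(forecasts, outcomes)
gap = discrimination_gap(forecasts, outcomes)
```

<p>A well calibrated participant has table values close to the stated probabilities; a participant with good discrimination has a gap well above zero.</p>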
]]></content:encoded>
	</item>
	<item>
		<title>By: Gordon Worley</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422989</link>
		<dc:creator>Gordon Worley</dc:creator>
		<pubDate>Thu, 11 Jan 2007 21:41:20 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422989</guid>
		<description>Here&#039;s my thought for a fun game to play on the computer.

First, create a source of numbers between 0 and 1 with an unknown distribution (unknown to the player, anyway).  For the game we will need to generate a sequence of numbers from this source.  Each number will represent a position along a line, i.e. think of a number as the relative distance from the left end of the line.  The game is played between two horizontal lines.  From the top line balls will drop and on the bottom line the player can position a catcher.  The size of the catcher varies in each round, but corresponds to a percentage of the line length, for example, 50% or 98%.  In each round, the player can place the catcher anywhere they want on the bottom line, so long as it stays within the line.  Game play is as follows:

Many balls are dropped from the top line and pile up, forming an approximate picture of the secret distribution.  The player is then allowed to place the catcher and the balls are cleared away and new ones following the same distribution drop again.  The player&#039;s goal is to place the catcher, not so that it maximizes balls caught, but so that it catches the correct percentage of balls for its size.  For example, the 10% catcher should only catch 10% of the balls dropped over a sufficiently large number of ball drops (for the game, let&#039;s say 1000).  The score is based on how accurate the player was.  If the 10% catcher catches 50% of the balls dropped, the player did a poor job and gets a low score, whereas if the 98% catcher actually catches 97%, we&#039;d consider that pretty good (after all, the sample size is small enough for there to be some error).

While not exactly in the language of confidence intervals, it&#039;s similar to the task when picking a confidence interval, except that in real life we don&#039;t always get to see lots of sample data first.
</description>
		<content:encoded><![CDATA[<p>Here&#8217;s my thought for a fun game to play on the computer.</p>
<p>First, create a source of numbers between 0 and 1 with an unknown distribution (unknown to the player, anyway).  For the game we will need to generate a sequence of numbers from this source.  Each number will represent a position along a line, i.e. think of a number as the relative distance from the left end of the line.  The game is played between two horizontal lines.  From the top line balls will drop and on the bottom line the player can position a catcher.  The size of the catcher varies in each round, but corresponds to a percentage of the line length, for example, 50% or 98%.  In each round, the player can place the catcher anywhere they want on the bottom line, so long as it stays within the line.  Game play is as follows:</p>
<p>Many balls are dropped from the top line and pile up, forming an approximate picture of the secret distribution.  The player is then allowed to place the catcher and the balls are cleared away and new ones following the same distribution drop again.  The player&#8217;s goal is to place the catcher, not so that it maximizes balls caught, but so that it catches the correct percentage of balls for its size.  For example, the 10% catcher should only catch 10% of the balls dropped over a sufficiently large number of ball drops (for the game, let&#8217;s say 1000).  The score is based on how accurate the player was.  If the 10% catcher catches 50% of the balls dropped, the player did a poor job and gets a low score, whereas if the 98% catcher actually catches 97%, we&#8217;d consider that pretty good (after all, the sample size is small enough for there to be some error).</p>
<p>While not exactly in the language of confidence intervals, it&#8217;s similar to the task when picking a confidence interval, except that in real life we don&#8217;t always get to see lots of sample data first.</p>
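<p>A rough sketch of the scoring step, assuming the catcher is described by its left edge and its width (both as fractions of the line). The Beta(2, 5) source is only a stand-in for the secret distribution, and all names here are hypothetical.</p>

```python
import random


def catch_fraction(draws, start, width):
    """Fraction of balls landing inside the catcher [start, start + width]."""
    caught = sum(1 for x in draws if start <= x <= start + width)
    return caught / len(draws)


def calibration_error(draws, start, width):
    """Distance between what was caught and what the catcher size calls for.

    A catcher covering 10% of the line should catch 10% of the balls,
    so the target fraction equals the width. Lower is better.
    """
    return abs(catch_fraction(draws, start, width) - width)


def play_round(rng, start, width, n_balls=1000):
    """Drop n_balls from a stand-in secret distribution and score the placement."""
    draws = [rng.betavariate(2, 5) for _ in range(n_balls)]  # assumed distribution
    return calibration_error(draws, start, width)


# Example: a 50% catcher placed at the left end of the line.
error = play_round(random.Random(0), start=0.0, width=0.5)
```

<p>With a skewed source like this one, placing the 50% catcher over the densest region catches too many balls, which is the point of the game: the player has to find a placement whose probability mass matches the catcher size.</p>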
]]></content:encoded>
	</item>
	<item>
		<title>By: Carl Shulman</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422988</link>
		<dc:creator>Carl Shulman</dc:creator>
		<pubDate>Thu, 11 Jan 2007 02:32:16 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422988</guid>
		<description>Another structure:

1. Choose a quantitative estimation task based on a historical record, e.g., given a set of baseball player statistics, predict the next season&#039;s win-loss ratio.
2. Participants individually select an interval of predetermined size, earning points if the true value is within that range. (Forcing the players to make a firm written estimate initially gives them a focus for confirmation bias and overconfidence.)
3. Participants are then allowed to see all the individual predictions and to make new predictions, scored separately. Repeat this some set number of times, watching for convergence or divergence, to make a round.
4. After a round, each player&#039;s cumulative score is made visible to all players, informing future rounds.
</description>
		<content:encoded><![CDATA[<p>Another structure:</p>
<p>1. Choose a quantitative estimation task based on a historical record, e.g., given a set of baseball player statistics, predict the next season&#8217;s win-loss ratio.<br />
2. Participants individually select an interval of predetermined size, earning points if the true value is within that range. (Forcing the players to make a firm written estimate initially gives them a focus for confirmation bias and overconfidence.)<br />
3. Participants are then allowed to see all the individual predictions and to make new predictions, scored separately. Repeat this some set number of times, watching for convergence or divergence, to make a round.<br />
4. After a round, each player&#8217;s cumulative score is made visible to all players, informing future rounds.</p>
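<p>The four steps above could be sketched roughly as follows, assuming each prediction is given as the center of a fixed-width interval. The names and the use of max minus min as a convergence measure are illustrative assumptions.</p>

```python
def score_guesses(centers, half_width, truth):
    """One point for each player whose interval [c - h, c + h] contains truth."""
    return {name: int(abs(c - truth) <= half_width) for name, c in centers.items()}


def spread(centers):
    """Gap between the highest and lowest prediction; shrinking means convergence."""
    return max(centers.values()) - min(centers.values())


def play_round(revisions, half_width, truth, totals):
    """Score every revision step separately, then update cumulative totals."""
    step_scores = [score_guesses(c, half_width, truth) for c in revisions]
    for scores in step_scores:
        for name, points in scores.items():
            totals[name] = totals.get(name, 0) + points
    return step_scores, [spread(c) for c in revisions]


# Hypothetical round: two players, an initial guess and one revision.
revisions = [{"ann": 40, "bob": 70}, {"ann": 50, "bob": 60}]
totals = {}
step_scores, spreads = play_round(revisions, half_width=5, truth=55, totals=totals)
```

<p>Here the spread falls from 30 to 10 after the players see each other's predictions, and both intervals contain the true value only after revising.</p>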
]]></content:encoded>
	</item>
	<item>
		<title>By: Eliezer Yudkowsky</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422987</link>
		<dc:creator>Eliezer Yudkowsky</dc:creator>
		<pubDate>Thu, 11 Jan 2007 00:09:01 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422987</guid>
		<description>It might be interesting to add a calibration game on top of other, existing games where there are plenty of emotionally laden questions with objective answers - for example, &quot;What&#039;s your 50% confidence interval on how much Monopoly money you think you&#039;ll have 4 turns from now?&quot;  Suppose that you, player A, can bet Monopoly money with player B, on how much Monopoly money C will have in 4 turns - bearing in mind that C is also making bets!  Then you must judge the calibration of others, and the game starts to look genuinely meta-rational.  (Note:  There must be a defined order of resolution for bets, to avoid circular dependencies.)

I agree that people might overestimate their general calibration from learning to play a calibration game, but it&#039;s better than nothing - you&#039;ve got to get started somewhere.
</description>
		<content:encoded><![CDATA[<p>It might be interesting to add a calibration game on top of other, existing games where there are plenty of emotionally laden questions with objective answers &#8211; for example, &#8220;What&#8217;s your 50% confidence interval on how much Monopoly money you think you&#8217;ll have 4 turns from now?&#8221;  Suppose that you, player A, can bet Monopoly money with player B, on how much Monopoly money C will have in 4 turns &#8211; bearing in mind that C is also making bets!  Then you must judge the calibration of others, and the game starts to look genuinely meta-rational.  (Note:  There must be a defined order of resolution for bets, to avoid circular dependencies.)</p>
<p>I agree that people might overestimate their general calibration from learning to play a calibration game, but it&#8217;s better than nothing &#8211; you&#8217;ve got to get started somewhere.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nick Bostrom</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422986</link>
		<dc:creator>Nick Bostrom</dc:creator>
		<pubDate>Wed, 10 Jan 2007 22:23:39 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422986</guid>
		<description>Robin: yes, there is a risk that some will be more accurate in the test than in other situations, and yet will extrapolate their test performance. And track records and prediction markets have important additional advantages. Here I was looking for a quick and simple way of getting at least some benefit to improve one&#039;s own calibration.

Michael A: yes, it&#039;s 2007, although I actually started making a few posts to this blog in late 2006...

Michael V: there&#039;ll probably be some significant G-loading for tests like this. It might be interesting to know how much, and whether it would depend on how the test was structured. Is there a factor of &quot;good judgment&quot; apart from G, which would reveal itself in guessing tasks that were (a) not logical/mathematical/verbal/spatial but instead ambiguous situations from ordinary life, and (b) not knowledge-intensive like trivia quizzes? If there is such a factor, part of it might be truth-seeking motivation. If we remove that, does any &quot;capacity to make good judgments when one tries&quot; factor remain? In other words, is WISDOM = G + KNOWLEDGE + DESIRE TO BE WISE, or are there additional components, such as meta-rationality or intuitive judgement/common sense? (I haven&#039;t looked for this in the literature - maybe somebody here knows?)

Bill: the almanac game is in the ballpark, but I&#039;d ideally like to avoid testing trivia knowledge.

One type of question that I believe is common in business job interviews is something like, &quot;How many lamp posts are there in Manhattan?&quot;. But this type of question also gets boring once one has mastered the general approach to solving them (estimate the number in a typical city block; estimate number of city blocks; multiply).



</description>
		<content:encoded><![CDATA[<p>Robin: yes, there is a risk that some will be more accurate in the test than in other situations, and yet will extrapolate their test performance. And track records and prediction markets have important additional advantages. Here I was looking for a quick and simple way of getting at least some benefit to improve one&#8217;s own calibration.</p>
<p>Michael A: yes, it&#8217;s 2007, although I actually started making a few posts to this blog in late 2006&#8230;</p>
<p>Michael V: there&#8217;ll probably be some significant G-loading for tests like this. It might be interesting to know how much, and whether it would depend on how the test was structured. Is there a factor of &#8220;good judgment&#8221; apart from G, which would reveal itself in guessing tasks that were (a) not logical/mathematical/verbal/spatial but instead ambiguous situations from ordinary life, and (b) not knowledge-intensive like trivia quizzes? If there is such a factor, part of it might be truth-seeking motivation. If we remove that, does any &#8220;capacity to make good judgments when one tries&#8221; factor remain? In other words, is WISDOM = G + KNOWLEDGE + DESIRE TO BE WISE, or are there additional components, such as meta-rationality or intuitive judgement/common sense? (I haven&#8217;t looked for this in the literature &#8211; maybe somebody here knows?)</p>
<p>Bill: the almanac game is in the ballpark, but I&#8217;d ideally like to avoid testing trivia knowledge.</p>
<p>One type of question that I believe is common in business job interviews is something like, &#8220;How many lamp posts are there in Manhattan?&#8221;. But this type of question also gets boring once one has mastered the general approach to solving them (estimate the number in a typical city block; estimate number of city blocks; multiply).</p>
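<p>For what it is worth, the estimate-and-multiply approach can be written down in a few lines. This is only a sketch: the function name is hypothetical, and the input ranges are placeholder numbers chosen for illustration, not real counts for Manhattan.</p>

```python
def fermi_product(*factors):
    """Multiply (low, high) ranges for each factor into an overall (low, high)."""
    low, high = 1, 1
    for factor_low, factor_high in factors:
        low *= factor_low
        high *= factor_high
    return low, high


# Placeholder guesses, NOT real data:
posts_per_block = (2, 4)       # lamp posts in a typical city block
number_of_blocks = (1000, 3000)  # city blocks in the area
estimate = fermi_product(posts_per_block, number_of_blocks)
```

<p>Carrying a low and a high guess through the multiplication turns the exercise into an interval estimate, which is closer to the calibration task than a single point answer.</p>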
]]></content:encoded>
	</item>
	<item>
		<title>By: Bill</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422985</link>
		<dc:creator>Bill</dc:creator>
		<pubDate>Wed, 10 Jan 2007 21:19:57 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422985</guid>
		<description>I see; you are right, the &quot;above or below&quot; sort of ruins the &quot;inside or outside&quot; with a 50% probability interval. I would rather the game have people think about &quot;equally likely&quot; rather than &quot;equally attractive&quot;, so I suppose I would only recommend the original version.

Thanks for the comment; I&#039;m glad I learned about this.
</description>
		<content:encoded><![CDATA[<p>I see; you are right, the &#8220;above or below&#8221; sort of ruins the &#8220;inside or outside&#8221; with a 50% probability interval. I would rather the game have people think about &#8220;equally likely&#8221; rather than &#8220;equally attractive&#8221;, so I suppose I would only recommend the original version.</p>
<p>Thanks for the comment; I&#8217;m glad I learned about this.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter de Blanc</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422984</link>
		<dc:creator>Peter de Blanc</dc:creator>
		<pubDate>Wed, 10 Jan 2007 21:12:37 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422984</guid>
		<description>60% was chosen so that if C has the same information as B, then inside and outside will be equally attractive.

If C guesses &quot;inside,&quot; then the expected gain is .6*1 point = 0.6 points.

If C guesses &quot;outside,&quot; then the expected gain is .4*(.5*1 point + .5*2 points) = 0.6 points.

I&#039;m assuming C doesn&#039;t care if B gains or loses points (rationality is not a zero-sum game).
</description>
		<content:encoded><![CDATA[<p>60% was chosen so that if C has the same information as B, then inside and outside will be equally attractive.</p>
<p>If C guesses &#8220;inside,&#8221; then the expected gain is .6*1 point = 0.6 points.</p>
<p>If C guesses &#8220;outside,&#8221; then the expected gain is .4*(.5*1 point + .5*2 points) = 0.6 points.</p>
<p>I&#8217;m assuming C doesn&#8217;t care if B gains or loses points (rationality is not a zero-sum game).</p>
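<p>The arithmetic above can be checked with a short sketch. It assumes, as in the calculation, that a correct "outside" guess is worth 1 point plus a 50% chance of a second point for the right direction; the function name is hypothetical.</p>

```python
from fractions import Fraction


def expected_gains(p_inside, p_direction=Fraction(1, 2)):
    """Return C's expected points for guessing 'inside' and for guessing 'outside'.

    p_inside: probability that the answer falls inside B's interval.
    p_direction: probability that C calls above/below correctly when outside.
    """
    inside = p_inside * 1
    outside = (1 - p_inside) * ((1 - p_direction) * 1 + p_direction * 2)
    return inside, outside


at_60 = expected_gains(Fraction(3, 5))  # both options worth 3/5
at_50 = expected_gains(Fraction(1, 2))  # 'outside' is worth more
```

<p>With a 50% interval the bonus point makes "outside" strictly better (3/4 against 1/2), whereas at 60% the two guesses are worth the same, so C has no reason to prefer either unless C knows something B does not.</p>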
]]></content:encoded>
	</item>
	<item>
		<title>By: Bill</title>
		<link>http://www.overcomingbias.com/2007/01/a_game_for_self.html#comment-422983</link>
		<dc:creator>Bill</dc:creator>
		<pubDate>Wed, 10 Jan 2007 21:02:06 +0000</pubDate>
		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/a-game-for-self-calibration.html#comment-422983</guid>
		<description>Peter said:

&gt;&gt; (An alternate version allows C to win another point if, when the answer is outside, C guesses &quot;Above&quot; or &quot;Below&quot; correctly)

&gt;In that case, you should use a 60% confidence interval.

How come? I don&#039;t follow your reasoning.

If it were a 60% probability interval, then C would always pick inside, no?

Is it to make the game fair between B and C, since in this version, C can win two points while B can only win one?
</description>
		<content:encoded><![CDATA[<p>Peter said:</p>
<p>>> (An alternate version allows C to win another point if, when the answer is outside, C guesses &#8220;Above&#8221; or &#8220;Below&#8221; correctly)</p>
<p>>In that case, you should use a 60% confidence interval.</p>
<p>How come? I don&#8217;t follow your reasoning.</p>
<p>If it were a 60% probability interval, then C would always pick inside, no?</p>
<p>Is it to make the game fair between B and C, since in this version, C can win two points while B can only win one?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

