<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Overcoming Bias &#187; Statistics</title>
	<atom:link href="http://www.overcomingbias.com/tag/statistics/feed" rel="self" type="application/rss+xml" />
	<link>http://www.overcomingbias.com</link>
	<description>Overcoming Bias is economist Robin Hanson’s blog, on honesty, signaling, disagreement, forecasting, and the far future.</description>
	<lastBuildDate>Fri, 10 Feb 2012 15:35:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Peer Review Is Random</title>
		<link>http://www.overcomingbias.com/2010/12/peer-review-is-random.html</link>
		<comments>http://www.overcomingbias.com/2010/12/peer-review-is-random.html#comments</comments>
		<pubDate>Tue, 21 Dec 2010 01:00:20 +0000</pubDate>
		<dc:creator>Robin Hanson</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Academia]]></category>
		<category><![CDATA[Project]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.overcomingbias.com/?p=25193</guid>
		<description><![CDATA[Which academic articles get published in the more prestigious journals is a pretty random process. When referees review an academic paper, less than 20% of the variability in referee ratings is explained by a tendency to agree: This paper presents &#8230; <a href="http://www.overcomingbias.com/2010/12/peer-review-is-random.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Which academic articles get published in the more prestigious journals is a pretty random process.  When referees review an academic paper, <strong>less than 20% of the variability in referee ratings is explained by a tendency to agree</strong>:</p>
<p style="padding-left: 30px;">This paper presents the first meta-analysis for the inter-rater reliability (IRR) of journal peer reviews [using] &#8230; 70 reliability coefficients &#8230; from 48 studies. &#8230; [covering] 19,443 manuscripts; on average, each study had a sample size of 311 manuscripts (minimum: 28, maximum: 1983). &#8230; The more manuscripts that a study is based on, the smaller the reported IRR coefficients are. &#8230; If the information of the rating system for reviewers was reported in a study, then this was associated with a smaller IRR coefficient. &#8230; An ICC of .23 indicates that only 23% of the variability in the reviewers’ rating of a manuscript could be explained by the agreement of reviewers. (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0014331">more</a>: HT <a href="http://www.marginalrevolution.com/marginalrevolution/2010/12/assorted-links-17.html">Tyler</a>)</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-25194" title="reviewreliability" src="http://overcomingbias-assets.s3.amazonaws.com/wp-content/uploads/2010/12/reviewreliability.gif" alt="reviewreliability" width="307" height="434" /></p>
<p>The above is from their key figure, showing reliability estimates and confidence intervals for studies ordered by estimated reliability. The most accurate studies found the lowest reliabilities, clear evidence of a bias toward publishing studies that find high reliability.  I recommend trusting only the most solid studies, which give the most pessimistic (&lt;20%) estimates.</p>
<p>Seems a model would be useful here. Model the optimal number of referees per paper, given referee reliability, the value of identifying the best papers, and the relative cost of writing vs. refereeing a paper. Such a model could estimate losses from having many journals with separate referees evaluate each article, vs. an integrated system.</p>
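<p>One ingredient such a model would need is how reliability grows as referees are added. A minimal sketch, assuming the quoted ICC of .23 is a single-referee reliability and that referees are interchangeable, so that the standard Spearman-Brown formula applies. Neither assumption comes from the paper, and the function name is hypothetical:</p>

```python
def pooled_reliability(single_rater_icc: float, n_referees: int) -> float:
    """Spearman-Brown formula: reliability of the mean of n parallel ratings."""
    r = single_rater_icc
    return n_referees * r / (1 + (n_referees - 1) * r)


# Reliability of the averaged rating for a few panel sizes,
# starting from the meta-analytic single-referee ICC of 0.23.
for k in (1, 2, 3, 5, 10):
    print(k, round(pooled_reliability(0.23, k), 3))
```

<p>Under these assumptions each added referee helps, but with diminishing returns. A fuller model would weigh that gain against the cost of refereeing.</p>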
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2010/12/peer-review-is-random.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>How Exceptional Is Gelman?</title>
		<link>http://www.overcomingbias.com/2009/07/how-exceptional-is-gelman.html</link>
		<comments>http://www.overcomingbias.com/2009/07/how-exceptional-is-gelman.html#comments</comments>
		<pubDate>Thu, 16 Jul 2009 16:00:13 +0000</pubDate>
		<dc:creator>Robin Hanson</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Academia]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.overcomingbias.com/?p=19024</guid>
		<description><![CDATA[In response to my saying: Academia is primarily an institution for credentialling folks as intellectually impressive, so that others can affiliate with them. Andrew Gelman penned &#8220;Another reason I&#8217;m glad I&#8217;m not an economist&#8221;: That [Robin] would write such an &#8230; <a href="http://www.overcomingbias.com/2009/07/how-exceptional-is-gelman.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In response to <a href="http://www.overcomingbias.com/2009/06/why-signals-are-shallow.html">my saying</a>:</p>
<p style="padding-left: 30px;">Academia is primarily an institution for credentialling folks as intellectually impressive, so that others can affiliate with them.</p>
<p>Andrew Gelman penned &#8220;<a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2009/06/another_reason.html">Another reason I&#8217;m glad I&#8217;m not an economist</a>&#8221;:</p>
<p style="padding-left: 30px;">That [Robin] would write such an extreme statement without even feeling the need to justify it (and, no, I don&#8217;t think it&#8217;s true, at least not in the &#8220;academia&#8221; that I know about) . . . that I see as a product of being in an economics department.</p>
<p>I <a href="http://www.overcomingbias.com/2009/07/academias-function.html">responded</a>:</p>
<p style="padding-left: 30px;">I have posted many times here on [this]. &#8230; The standard idealistic [story] is that academics know useful and important things, things which students want to learn, media want to report, consulting clients want to apply, &#8230; These idealistic theories &#8230; have [these listed] detailed problems. &#8230; It seems far simpler to me to just postulate that people care primarily about affiliating with others who have been certified as prestigious.</p>
<p>Andrew <a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2009/07/elaboration_on.html">answered</a>:<span id="more-19024"></span></p>
<p style="padding-left: 60px;">College students prefer to be taught by profs who research, and hence ignore students more, yet students have little idea what their profs research. . . . There is relatively little relation between what profs teach, what profs research, and what students do after they graduate.</p>
<p style="padding-left: 30px;">To which I reply: No way, dude! Our students &#8230; send me emails asking when I&#8217;m going to teach multilevel models and Bayesian statistics. &#8230; There is a strong connection between what I teach and what I research. And it&#8217;s my impression that they do use this stuff after they graduate. &#8230;</p>
<p style="padding-left: 30px;">You might say: Fine, but Robin is talking about academia in general, not the Columbia statistics department in general. That I could buy &#8230; But . . . in his blog entry Robin appears to be skeptical of my claim that the customers who pay my salary &#8220;learn how to fit multilevel models.&#8221;</p>
<p>On why he is funded:</p>
<p style="padding-left: 30px;">The state government of New York or the Heritage Foundation or whatever, &#8230; I assume they would like their conclusions to be research-based, to avoid negative unintended consequences and all the other things that we worry about when considering policies.</p>
<p>My primary focus is academia in general, and grad students are not &#8220;college students.&#8221;  I have consistently told both college and grad students that stat classes are among the most useful later in a non-academic career.   So I am happy to grant that Andrew may be an unusual exception.  Nevertheless, consider:</p>
<ol>
<li>Since I&#8217;ve granted that my story is contrary to what people usually say and assume, saying &#8220;I assume&#8221; on funding just isn&#8217;t much of a contrary argument.  &#8220;My impression&#8221; on students isn&#8217;t much better.</li>
<li>Your funding patrons may like to see studies using your methods not because they predict better but because they and you are more prestigious.  Could you tell the difference?</li>
<li>I suspect most of your students never much use the methods they learn from you later.  Some no doubt do use them.</li>
<li>Employers may want to hire your students to use your methods not because those methods predict better but because they are more prestigious, and people who can master them are just better overall.</li>
<li>Last week I <a href="http://www.overcomingbias.com/2009/07/simple-forecast-models-best.html">mentioned</a> that fancy stat forecasts are consistently beaten by simple moving averages; have you done field tests to see how well your students actually do using your methods, compared with simpler methods?</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2009/07/how-exceptional-is-gelman.html/feed</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Why so little model checking done in statistics?</title>
		<link>http://www.overcomingbias.com/2007/09/one-thing-that.html</link>
		<comments>http://www.overcomingbias.com/2007/09/one-thing-that.html#comments</comments>
		<pubDate>Sat, 22 Sep 2007 14:00:00 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/09/why-so-little-model-checking-done-in-statistics.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2007/09/one-thing-that.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>One thing that bugs me is that there seems to be so little model checking done in statistics.&nbsp; Data-based model checking is a powerful tool for overcoming bias, and it&#8217;s frustrating to see this tool used so rarely.&nbsp; As I wrote <a href="http://www.stat.columbia.edu/%7Ecook/movabletype/archives/2007/02/heres_a_paragra.html">in this referee report</a>,</p>
<blockquote><p> I&#8217;d like to see some graphs of the raw data, along with replicated datasets from the model. The paper admirably connects the underlying problem to the statistical model; however, the Bayesian approach requires a lot of modeling assumptions, and I&#8217;d be a lot more convinced if I could (a) see some of the data and (b) see that the fitted model would produce simulations that look somewhat like the actual data. Otherwise we&#8217;re taking it all on faith. </p>
</blockquote>
<p>But, why, if this is such a good idea, do people not do it?&nbsp; </p>
<p><span id="more-17796"></span></p>
<p>I don&#8217;t buy the cynical answer that people don&#8217;t want to falsify their own models. My preferred explanation might be called sociological and goes as follows: We&#8217;re often told to check model fit. But suppose we fit a model, write a paper, and check the model fit with a graph. If the fit is ok, then why bother with the graph: the model is OK, right? If the fit shows problems (which, realistically, it should, if you think hard enough about how to make your model-checking graph), then you better not include the graph in the paper, or the reviewers will reject, saying that you should fix your model. And once you&#8217;ve fit the better model, no need for the graph.  </p>
<p>The result is: (a) a bloodless view of statistics in which only the good models appear, leaving readers in the dark about all the steps needed to get there; or, worse, (b) statisticians (and, in general, researchers) not checking the fit of their model in the first place, so that neither the original researchers nor the readers of the journal learn about the problems with the model.</p>
<p><strong>One more thing . . .</strong></p>
<p>You might say that there&#8217;s no reason to bother with model checking since all models are false anyway. I do believe that all models are false, but for me the purpose of model checking is not to accept or reject a model, but to reveal aspects of the data that are not captured by the fitted model. (See chapter 6 of Bayesian Data Analysis for some examples.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2007/09/one-thing-that.html/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>How should unproven findings be publicized?</title>
		<link>http://www.overcomingbias.com/2007/07/how-should-unpr.html</link>
		<comments>http://www.overcomingbias.com/2007/07/how-should-unpr.html#comments</comments>
		<pubDate>Tue, 10 Jul 2007 01:56:51 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Gender]]></category>
		<category><![CDATA[Overconfidence]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/07/how-should-unproven-findings-be-publicized.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2007/07/how-should-unpr.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A year or so ago I heard about a couple of papers by Satoshi Kanazawa on &quot;Engineers have more sons, nurses have more daughters&quot; and &quot;Beautiful parents have more daughters.&quot;&nbsp; The titles surprised me, because in my acquaintance with such data, I&#8217;d seen very little evidence of sex ratios at birth varying much at all, certainly not by 26% as was claimed in one of these papers.&nbsp; I looked into it and indeed it turned out that the findings could be explained as statistical artifacts&#8211;the key errors were, in one of the studies, <a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2006/04/amusing_example.html">controlling for intermediate outcomes</a> and, in the other study, <a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2006/08/more_on_girl_an.html">reporting only one of multiple potential hypothesis tests</a>.&nbsp; At the time, I felt that a key weakness of the research was that it did not include collaboration with statisticians, experimental psychologists, or others who are aware of these issues.</p>
<p><span id="more-17966"></span></p>
<p>I did my duty and <a href="http://www.stat.columbia.edu/~gelman/research/published/kanazawa.pdf">wrote a letter</a> which was published in the Journal of Theoretical Biology.&nbsp; (I also emailed Kanazawa a copy but didn&#8217;t hear back from him.)&nbsp; There things stood until yesterday when I saw in Tyler Cowen&#8217;s blog that Kanazawa had written <a href="http://www.psychologytoday.com/articles/pto-20070622-000002.xml">an article in Psychology Today</a> repeating the claim, &quot;Americans who are rated &quot;very attractive&quot; have a 56 percent chance of having a daughter for their first child, compared with 48 percent for everyone else.&quot;&nbsp; And, even more amazingly (to me), Kanazawa is publishing a book called &quot;Why Beautiful People Have More Daughters.&quot;&nbsp; The work has also been publicized in various places, including a positive mention by Stephen Dubner <a href="http://www.freakonomics.com/blog/2006/08/02/why-do-beautiful-women-sometimes-marry-unattractive-men/">here</a> (and a more mocking mention <a href="http://www.freakonomics.com/blog/2007/07/09/the-science-of-large-breasts-and-other-evolutionary-verities/">here</a>).</p>
<p>OK, now to get to my question.&nbsp; Kanazawa&#8217;s conjectures have not been demonstrated statistically.&nbsp; (For example, the claim about beautiful parents having more daughters was barely statistically significant and was one of many possible comparisons that could&#8217;ve been done with those data.)&nbsp; So it&#8217;s a little disturbing to see this presented as &quot;true, supported by documented scientific evidence.&quot;&nbsp; On the other hand, the claim might be true.&nbsp; It would be more scientifically appropriate for Kanazawa to present these results as &quot;speculations which are supported by data,&quot; but maybe Psychology Today expects a different sort of writing?</p>
<p>I just don&#8217;t know how to think about this.&nbsp; It&#8217;s clear to me how journalists, bloggers, and reviewers should react:&nbsp; they should discuss this work with skepticism.&nbsp; The trouble is that the papers were published in a reputable journal (J. Theor. Biology), and a journalist/blogger/reviewer who does not happen to see my critique would naturally tend to trust the result.&nbsp; (I was only tipped off because I&#8217;d already read a bit in the area of sex ratios, for no other reason than that I&#8217;ve used boy and girl births as a teaching example.&nbsp; This was essentially a bit of Bayesian reasoning by me, that Kanazawa&#8217;s conclusions didn&#8217;t match my priors, leading me to look more carefully at his reasoning.)&nbsp; So I can&#8217;t really blame the editors of Psychology Today, or maybe even the editors at Perigee Books for not knowing any better.</p>
<p>But should I blame Kanazawa?&nbsp; <a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2006/07/is_this_blog_to.html">I don&#8217;t want to be dismissive of scientific speculation</a>&#8211;I don&#8217;t like the idea of statistician as censor&#8211;so maybe there would be a way for him to present more of the full statistical story in his book (for example, in the beauty-and-daughters study, a graph with the proportion of girls born to people of all five beauty categories&#8211;rather than just comparing categories 1-4 to category 5&#8211;along with the beauty assessments from all three waves of the study).&nbsp; It&#8217;s a tough call to decide how to present speculative findings.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2007/07/how-should-unpr.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Statistical inefficiency = bias, or, Increasing efficiency will reduce bias (on average), or, There is no bias-variance tradeoff</title>
		<link>http://www.overcomingbias.com/2007/04/statistical_ine.html</link>
		<comments>http://www.overcomingbias.com/2007/04/statistical_ine.html#comments</comments>
		<pubDate>Sat, 14 Apr 2007 00:52:50 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/04/statistical-inefficiency-bias-or-increasing-efficiency-will-reduce-bias-on-average-or-there-is-no-bias-variance-tradeoff.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2007/04/statistical_ine.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Statisticians often talk about a bias-variance tradeoff, comparing a simple unbiased estimator (for example, a <a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2007/02/differenceindif.html">difference in differences</a>) to something more efficient but possibly biased (for example, a regression).&nbsp; There&#8217;s commonly the attitude that the unbiased estimate is a better or safer choice.&nbsp; My only point here is that, by using a less efficient estimate, we are generally choosing to estimate fewer parameters (for example, estimating an average incumbency effect over a 40-year period rather than estimating a separate effect for each year or each decade).&nbsp; Or estimating an overall effect of a treatment rather than separate estimates for men and women.&nbsp; If we do this&#8211;make the seemingly conservative choice to not estimate interactions&#8211;we are implicitly estimating these interactions at zero, which is not unbiased at all!</p>
<p>I&#8217;m not saying that there are any easy answers to this; for example, <a href="http://www.stat.columbia.edu/~gelman/research/published/jbes01m045r3.pdf">see here</a> for one of my struggles with interactions in an applied problem&#8212;in this case (estimating the effect of incentives in sample surveys), we were particularly interested in certain interactions even though they could not be estimated precisely from data.</p>
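<p>The point about interactions implicitly set to zero can be made concrete with a little arithmetic. This is only an illustrative sketch: the group effects and population shares below are made-up numbers, not taken from any study mentioned above.</p>

```python
# Hypothetical true treatment effects by group (illustrative numbers only).
true_effect = {"men": 1.0, "women": 3.0}
# Hypothetical population shares of each group.
share = {"men": 0.5, "women": 0.5}

# The "overall effect" that a no-interaction analysis targets.
pooled = sum(share[g] * true_effect[g] for g in true_effect)

# Using the pooled number for each group sets the interaction to zero,
# so the error for each group is the pooled effect minus that group's effect.
bias_by_group = {g: pooled - true_effect[g] for g in true_effect}

print(pooled, bias_by_group)
```

<p>The pooled number is right for the average, but it is off by the size of the interaction for each group taken separately.</p>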
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2007/04/statistical_ine.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Useful bias</title>
		<link>http://www.overcomingbias.com/2007/03/useful_bias.html</link>
		<comments>http://www.overcomingbias.com/2007/03/useful_bias.html#comments</comments>
		<pubDate>Sun, 25 Mar 2007 16:08:00 +0000</pubDate>
		<dc:creator>Adrian Tschoegl</dc:creator>
				<category><![CDATA[Overconfidence]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[War]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/03/useful-bias.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2007/03/useful_bias.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I would like to introduce the perhaps, in this forum, heretical notion of useful bias.&nbsp; By useful bias I mean the deliberate introduction of an error as a means to solving a problem.&nbsp; The two examples I discuss below are concrete rather than abstract and come from my training as an infantry officer many years ago.&nbsp; Now technology solves the problems they solved, but the examples may still serve to illustrate the notion. </p>
<p><span id="more-18144"></span></p>
<p>The first example comes from land navigation, which is the use of compass and map to get from one point to another.&nbsp; One standard problem is to get from a point in a wood, or other occluded terrain, to a point on a road or the like, some distance away.&nbsp; The unbiased approach is to take a bearing, i.e., determine a direction, from where one is to where one wants to go, and then follow it.&nbsp; The problem is that as one follows the bearing, with each step a little random lateral error creeps in so that when one reaches the road one may not be sure whether the point one is seeking is to the right or the left.&nbsp; The biased approach is to follow a bearing that is sufficiently to the left or right of the objective that when one reaches the road one can assume with a high degree of probability that the objective is to the right or left.  </p>
<p>The second example comes from directing artillery fire to strike a target that one can observe, but that is an unknown distance away.&nbsp; The unbiased approach is to estimate (guess) the distance, notify the gunners, observe the first shot, and then walk subsequent shots towards the target in increments of distance.&nbsp; (Up 100. Up 100. etc.) The biased approach is “bracketing” the target.&nbsp; In bracketing, the observer estimates (guesses) the distance, and then adds a large increment to the estimate to ensure that the first shot will fall beyond the target.&nbsp; The observer then adjusts the fall of the sequence of subsequent shots by halving the distance between subsequent shots.&nbsp; (Ideally, by cycling through a sequence of over and under shots.)&nbsp; As n increases, X plus (0.5) to the nth power, times β sub n, where β is the unknown bias in estimating the unknown range X, will converge on X.&nbsp; Experiments have shown that on average, bracketing will result in a faster convergence of the fire on the target than will walking.  </p>
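<p>The two procedures can be compared in a toy sketch. Everything numeric here is hypothetical (the 100-unit walking step, the 800-unit first offset, the 50-unit tolerance, and the example ranges), and the function names are invented for illustration. It shows the mechanics for one case, not the experimental result cited above.</p>

```python
def walking_shots(true_range, guess, step=100.0, tol=50.0, max_shots=1000):
    """Fixed-increment corrections toward the target; returns shots fired."""
    aim = guess
    for shot in range(1, max_shots + 1):
        error = true_range - aim
        if tol >= abs(error):
            return shot
        # "Up 100" or "down 100", depending on where the shot fell.
        aim += step if error > 0 else -step
    return max_shots


def bracketing_shots(true_range, guess, first_offset=800.0, tol=50.0, max_shots=1000):
    """Deliberately overshoot first, then halve each correction; returns shots fired."""
    aim = guess + first_offset   # the deliberate bias
    correction = first_offset
    for shot in range(1, max_shots + 1):
        error = true_range - aim
        if tol >= abs(error):
            return shot
        correction /= 2
        aim += correction if error > 0 else -correction
    return max_shots


# Hypothetical case: target at 2000, observer's initial guess 1400.
print(walking_shots(2000.0, 1400.0), bracketing_shots(2000.0, 1400.0))
```

<p>Note that the halving scheme only converges if the target lies within the first offset of the first shot, which is why the initial increment has to be large.</p>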
<p>So long as satellites and batteries don’t go dead, GPS and laser range finders now solve the land navigation and ranging problems in an unbiased manner.&nbsp; Still, the questions that motivated this post remain: is the notion of useful bias itself useful?&nbsp; That is, are there other, more pacific examples in the cognitive realm? </p>
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2007/03/useful_bias.html/feed</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Truth is stranger than fiction</title>
		<link>http://www.overcomingbias.com/2007/02/truth_is_strang.html</link>
		<comments>http://www.overcomingbias.com/2007/02/truth_is_strang.html#comments</comments>
		<pubDate>Tue, 13 Feb 2007 15:06:10 +0000</pubDate>
		<dc:creator>Andrew</dc:creator>
				<category><![CDATA[Bayesian]]></category>
		<category><![CDATA[Sports]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/02/truth-is-stranger-than-fiction.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2007/02/truth_is_strang.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Robin asks the following question <a href="http://www.overcomingbias.com/2007/02/is_truth_in_the.html">here</a>:</p>
<blockquote><p>How does the distribution of truth compare to the distribution of opinion?&nbsp; That is, consider some spectrum of possible answers, like the point difference in a game, or the sea level rise in the next century. On each such spectrum we could get a distribution of (point-estimate) opinions, and in the end a truth.&nbsp; So in each such case we could ask for truth&#8217;s opinion-rank: what fraction of opinions were less than the truth?&nbsp; For example, if 30% of estimates were below the truth (and 70% above), the opinion-rank of truth was 30%. </p>
<p>If we look at lots of cases in some topic area, we should be able to collect a distribution for truth&#8217;s opinion-rank, and so answer the interesting question: in this topic area, does the truth tend to be in the middle or the tails of the opinion distribution?&nbsp; That is, if truth usually has an opinion rank between 40% and 60%, then in a sense the middle conformist people are usually right.&nbsp; But if the opinion-rank of truth is usually below 10% or above 90%, then in a sense the extremists are usually right.</p>
</blockquote>
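<p>The opinion-rank defined in the quote is simple to compute once opinions and the truth are in hand. A minimal sketch, with a hypothetical function name and made-up numbers:</p>

```python
def opinion_rank(opinions, truth):
    """Fraction of point-estimate opinions strictly below the true value."""
    below = sum(1 for x in opinions if truth > x)
    return below / len(opinions)


# Ten hypothetical point estimates and a hypothetical true value.
opinions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(opinion_rank(opinions, truth=3.5))
```

<p>Collecting this number over many questions in a topic area would give the distribution of truth&#8217;s opinion-rank that the quote asks about.</p>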
<p>My response:</p>
<p>1.&nbsp; As Robin notes, this is ultimately an empirical question which could be answered by collecting a lot of data on forecasts/estimates and true values.</p>
<p>2.&nbsp; However, there is a simple theoretical argument that suggests that truth will be, generally, more extreme than point estimates, that the opinion-rank (as defined above) will have a distribution that is more concentrated at the extremes as compared to a uniform distribution.</p>
<p>The argument goes as follows:</p>
<p><span id="more-18227"></span></p>
<p>Suppose that everybody&#8217;s Bayesian, everybody has the same prior distribution, but with different small amounts of data.&nbsp; To give some notation:&nbsp; suppose we will be looking at a sequence of parameters, theta_1, theta_2, theta_3, &#8230; with a common prior distribution p(theta), which represents the true distribution of this population of theta&#8217;s.&nbsp; (We could further suppose a hierarchical structure, so that p(theta) has hyperparameters that are estimated from data, but this is not necessary for our discussion here.)&nbsp; For simplicity, suppose p(theta) is a normal (bell-shaped) curve centered at 0 with standard deviation sigma.</p>
<p>Now suppose you get some data, y, on a parameter, theta, and summarize your inference by a point estimate which is your posterior mean, theta.hat = E(theta|y).&nbsp; Averaging over all possible data y that you might see, this posterior mean has a sampling distribution which is centered about 0 but with a standard deviation <em>less than</em> sigma.&nbsp; This derives from an application of the basic variance-decomposition identity:&nbsp; var(theta.hat) = var(E(theta|y)) = var(theta) &#8211; E(var(theta|y)), which tells us that the theta.hat&#8217;s are less variable than the underlying thetas.&nbsp; (This is a point we make in our paper, <a href="http://www.stat.columbia.edu/%7Egelman/research/published/allmaps.pdf">All Maps of Parameter Estimates are Misleading</a>, and it also is discussed in some papers by Tom Louis.)</p>
<p>I posted a simulation of this (along with R code) <a href="http://www.stat.columbia.edu/%7Ecook/movabletype/archives/2007/02/truth_is_strang.html">here</a>.&nbsp; (It seemed too technical to go into this blog.)</p>
<p>Getting back to Robin&#8217;s question:&nbsp; so, if everybody is Bayesian, using a prior distribution that correctly reflects the distribution of the underlying parameters being modeled, then, the point estimates will, on average, be closer to the center of the distribution as compared to the true values.&nbsp; (To put it another way, the parameter estimates are shrunk toward the prior mean.)&nbsp; And so the truth will look stranger than fiction&#8211;if fiction is thought of as point estimates!</p>
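<p>The shrinkage argument can be checked with a few lines of simulation. This is an illustrative Python sketch, not the R code linked above. It assumes a normal prior and normal data noise; the noise standard deviation tau, the seed, and the number of draws are arbitrary choices that do not come from the post.</p>

```python
import random
import statistics

random.seed(0)
sigma = 1.0   # prior sd of theta (the sigma of the argument above)
tau = 1.0     # sd of the data noise; hypothetical choice
shrink = sigma**2 / (sigma**2 + tau**2)   # weight on y in the posterior mean

thetas, estimates = [], []
for _ in range(20000):
    theta = random.gauss(0.0, sigma)   # true parameter drawn from the prior
    y = random.gauss(theta, tau)       # one noisy observation of theta
    thetas.append(theta)
    estimates.append(shrink * y)       # theta.hat = E(theta|y) in this model

sd_theta = statistics.pstdev(thetas)
sd_hat = statistics.pstdev(estimates)
# Share of cases where the truth is farther from 0 than the point estimate.
more_extreme = sum(1 for t, e in zip(thetas, estimates) if abs(t) > abs(e)) / len(thetas)

print(sd_theta, sd_hat, more_extreme)
```

<p>With sigma equal to tau, the theory above says the standard deviation of theta.hat should be about sigma divided by the square root of 2, and the truth should be more extreme than the estimate in well over half the draws.</p>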
<p>3.&nbsp; This point arises in many statistical examples:&nbsp; one&#8217;s best guess is inherently more sober than what might possibly happen, which is one argument for considering fanciful possibilities in fiction. Taking your best point estimate at every step of the way will <em>not</em> give a realistic simulation of reality.&nbsp; Reality occasionally includes the unexpected.</p>
<p>4.&nbsp; We can apply this reasoning to sports scores, for example. Football games can be predicted to an accuracy of about 14 points (that is, the difference between the score differential and the point spread has an approximate normal distribution with mean 0 and standard deviation 14); see chapter 1 of Bayesian Data Analysis and some data <a href="http://www.stat.columbia.edu/%7Egelman/book/data/football.asc">here</a>.&nbsp; Looking at these data:</p>
<p>- The average difference between winner&#8217;s and loser&#8217;s score is 12 points.<br />- The average spread (point prediction of difference between winner and loser) is 5.3 points.<br />- 71% of the time, the score is more extreme (in difference between winner&#8217;s and loser&#8217;s score) than the spread.&nbsp; (The favorite beats the spread in about half the games, and in another 20% or so of the games, the underdog actually wins by a larger margin than the favorite was favored.)<br />- The distribution of actual game outcomes (as measured by score differentials) is more extreme than the distribution of the point predictions.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2007/02/truth_is_strang.html/feed</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>Sick of Textbook Errors</title>
		<link>http://www.overcomingbias.com/2007/01/sick_of_textboo.html</link>
		<comments>http://www.overcomingbias.com/2007/01/sick_of_textboo.html#comments</comments>
		<pubDate>Sun, 28 Jan 2007 02:01:09 +0000</pubDate>
		<dc:creator>Carl Shulman</dc:creator>
				<category><![CDATA[Bayesian]]></category>
		<category><![CDATA[Medicine]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/sick-of-textbook-errors.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2007/01/sick_of_textboo.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>One of the most well-worn examples in introductions to <a href="http://en.wikipedia.org/wiki/Bayes'_theorem">Bayesian</a> reasoning is testing for rare diseases: if the prior probability that a patient has a disease is sufficiently low, the probability that the patient has the disease conditional on a positive diagnostic test result may also be low, even for very accurate tests. One might hope that every epidemiologist would be familiar with this textbook problem, but <a href="http://www.nytimes.com/2007/01/22/health/22whoop.html?ref=health">this New York Times story</a> suggests otherwise: </p>
<blockquote><p>For months, nearly everyone involved thought the medical center had had a huge whooping cough outbreak, with extensive ramifications. [...] </p>
<p>Then, about eight months later, health care workers were dumbfounded to receive an e-mail message from the hospital administration informing them that the whole thing was a false alarm.</p>
<p>Now, as they look back on the episode, epidemiologists and infectious disease specialists say the problem was that they placed too much faith in a quick and highly sensitive molecular test that led them astray.</p>
</blockquote>
<p>While medical professionals can <a href="http://www.springerlink.com/content/p42285x4123j5705/">modestly improve their performance on inventories of cognitive bias when coached</a>, we should not overestimate the extent to which formal instruction such as statistics or epidemiology classes will improve actual behavior in the field.</p>
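<p>For readers who have not seen the textbook calculation mentioned at the top of this post, here is a minimal sketch. The prevalence, sensitivity, and specificity below are hypothetical round numbers chosen for illustration; none of them come from the New York Times story or from any real test, and the function name is made up for this example:</p>

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(disease | positive test), computed with Bayes' theorem."""
    true_pos = prior * sensitivity
    false_pos = (1.0 - prior) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs: 1 patient in 1,000 has the disease, and the
# test is 99% sensitive and 99% specific.
p = posterior_given_positive(prior=0.001, sensitivity=0.99, specificity=0.99)
print(f"P(disease | positive) = {p:.3f}")
```

<p>With these assumed inputs the posterior works out to roughly 9%: even a test that is right 99% of the time produces mostly false positives when the condition is rare enough.</p>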
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2007/01/sick_of_textboo.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Symmetry Is Not Pretty</title>
		<link>http://www.overcomingbias.com/2007/01/symmetry_is_not.html</link>
		<comments>http://www.overcomingbias.com/2007/01/symmetry_is_not.html#comments</comments>
		<pubDate>Sat, 20 Jan 2007 10:00:00 +0000</pubDate>
		<dc:creator>Robin Hanson</dc:creator>
				<category><![CDATA[Psychology]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2007/01/symmetry-is-not-pretty.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2007/01/symmetry_is_not.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://chattyapes.com/blog/2007/01/lessons_from_studying_symmetry.html"><em>Chatty Apes</em></a> we learn that symmetry has little to do with whether a face is attractive:</p>
<blockquote><p>Measurable symmetry accounts for less than 1% of the variance in the attractiveness of women&#8217;s faces and less than 3% of the variance of the attractiveness of men&#8217;s faces.&nbsp; &#8230; the initial studies showing big effects typically involved samples of less than 20 faces each, which is irresponsibly small for correlational studies with open-ended variables.&nbsp; Once the bigger samples started showing up, the effect basically disappeared for women and was shown to be pretty low for men.&nbsp; But no one believed the later, bigger studies, even most of their own authors &#8212; pretty much everyone in my business still thinks that symmetry is a big deal in attractiveness.&nbsp; So, the first lesson I learned:&nbsp; Small samples are &#8230;&nbsp; My solution has been to ditch the old p&lt;.05 significance standard. </p>
</blockquote>
<p>I see the same thing in health economics; once people see some data supporting a theory that makes sense to them, they neglect larger contrary data.</p>
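<p>To see how small samples can manufacture big effects, here is a rough simulation sketch. It assumes a true correlation of 0.1 (about 1% of variance explained, in line with the figure quoted for women&#8217;s faces) and counts how often a study reports a correlation beyond 0.3 in absolute value. The 0.3 threshold, the sample sizes of 20 and 200, and all function names are my own illustrative choices, not taken from the studies discussed:</p>

```python
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def sample_r(n, true_r, rng):
    """Draw n pairs whose population correlation is true_r."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    noise_scale = (1 - true_r ** 2) ** 0.5
    ys = [true_r * x + noise_scale * rng.gauss(0, 1) for x in xs]
    return pearson_r(xs, ys)


def share_of_large_r(n, true_r=0.1, threshold=0.3, trials=2000, seed=0):
    """Fraction of simulated studies with |r| above the threshold."""
    rng = random.Random(seed)
    hits = sum(abs(sample_r(n, true_r, rng)) > threshold for _ in range(trials))
    return hits / trials


for n in (20, 200):
    print(f"n={n}: share of studies with |r| above 0.3 =", share_of_large_r(n))
```

<p>Under these assumptions, a noticeable share of the 20-face studies show an apparently large correlation purely by chance, while almost none of the 200-face studies do.</p>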
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2007/01/symmetry_is_not.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Malatesta Estimator</title>
		<link>http://www.overcomingbias.com/2006/12/malatesta_estim.html</link>
		<comments>http://www.overcomingbias.com/2006/12/malatesta_estim.html#comments</comments>
		<pubDate>Thu, 14 Dec 2006 17:48:00 +0000</pubDate>
		<dc:creator>Adrian Tschoegl</dc:creator>
				<category><![CDATA[Disagreement]]></category>
		<category><![CDATA[Standard Biases]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prod.ob.trike.com.au/2006/12/malatesta-estimator.html</guid>
		<description><![CDATA[
 <a href="http://www.overcomingbias.com/2006/12/malatesta_estim.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="FONT-FAMILY: Garamond">We frequently encounter competing estimates of politically salient magnitudes. One example would be the number of attendees at the 1995 “Million Man March”.<span style="mso-spacerun: yes">&nbsp; </span>Obviously, frequently the estimates emanate from biased observers seeking to create or dispel an impression of strength.<span style="mso-spacerun: yes">&nbsp; </span>Someone interested in generating a more neutral estimate might consider applying what I would call the Malatesta Estimator, which I have named after its formulator, the 14<sup>th</sup> Century Italian mercenary captain, Galeotto Malatesta of Rimini (d. abt. 1385). His advice was: “Take the mean between the maximum given by the exaggerators, and the minimum by detractors, and deduct a third” (Saunders 2004).<span style="mso-spacerun: yes">&nbsp; </span>This simplifies into: the sum of the maximum and the minimum, divided by three.<span style="mso-spacerun: yes">&nbsp; </span>It adjusts for the fact that the minimum is bounded below by zero, while there is no bound on the maximum.<span style="mso-spacerun: yes">&nbsp; </span>Of course, it only works if the maximum is at least double the minimum.</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="FONT-FAMILY: Garamond">In the case of the Million Man March, supporters from the Nation of Islam claimed attendance of 1.5 to 2 million.&nbsp; The Park Service initially suggested that 400,000 had participated.&nbsp; Taking 2 million as the maximum and 400,000 as the minimum, the Malatesta Estimator yields (2,000,000 + 400,000) / 3 = 800,000.&nbsp; We can check this against an estimate by Dr. Farouk El-Baz and his team at the Boston University Remote Sensing Lab.&nbsp; Dr. El-Baz and his team used samples of one-square-meter pixels from a number of overhead photos to estimate the density per pixel, and then calculated an estimate for the entire area.&nbsp; Their estimate was 837,000, with 20% error bounds giving a range from 670,000 to 1 million.</span></p>
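<p>The arithmetic is simple enough to sketch in a few lines. The function name <code>malatesta_estimate</code> is made up for this example, and the validity check simply encodes the "at least double" condition mentioned above; the comparison range is the 837,000 figure plus or minus 20%:</p>

```python
def malatesta_estimate(minimum, maximum):
    """Mean of the two extremes, less a third: (minimum + maximum) / 3."""
    if not maximum >= 2 * minimum:
        # Below this ratio the result would fall under the minimum itself.
        raise ValueError("maximum must be at least double the minimum")
    midpoint = (minimum + maximum) / 2
    return midpoint - midpoint / 3  # "deduct a third"


# Million Man March figures from the text: Park Service vs. organizers.
estimate = malatesta_estimate(400_000, 2_000_000)
print("Malatesta estimate:", estimate)

# Compare with the Boston University range (837,000 plus or minus 20%).
low, high = 0.8 * 837_000, 1.2 * 837_000
print("within BU range:", low <= estimate <= high)
```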
<p><span style="FONT-FAMILY: Garamond">Saunders, Frances Stonor. 2004. <em>The Devil’s Broker: Seeking Gold, God, and Glory in Fourteenth-Century Italy</em>. (New York: HarperCollins), p. 93.</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="FONT-FAMILY: Garamond">BU Remote Sensing Lab Press Release: http://www.bu.edu/remotesensing/Research/MMM/MMMnew.html</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt"><span style="FONT-FAMILY: Garamond">Accessed 14 December 2006.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.overcomingbias.com/2006/12/malatesta_estim.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

