Author Archives: Robin Hanson

Land Speculators Made U.S.

While a U.S. citizen for 63 years, I’d never before heard this story of U.S. origins, told well by Christopher Blattman in his new book Why We Fight (pp. 38-41). Seems the U.S. revolution was a textbook example of war due to elite interests diverging from those of most citizens. I quote:

Born in 1732, the middle child of an undistinguished tobacco farmer, George Washington found himself on the fringes of Virginia’s elite planter society. Luckily, his older brother married into one of the colony’s most powerful families. Now the tall, lanky young man found himself with powerful patrons. Those benefactors pulled strings to maneuver Washington into a coveted public office: county surveyor. 

Mapping land boundaries promised little profit in well-settled Virginia. Yet to the west, across the Allegheny Mountains, lay millions of acres of unclaimed land—assuming you ignored the native inhabitants, not to mention the French. Within days of his appointment, George Washington headed to the frontier. The young man would help his patrons lay claim to the best lands and scout some choice properties for himself. He was just seventeen. 

An acquisitive zeal consumed the young Virginian and his backers. Claiming, hoarding, and flipping cheap land was an obsession across all thirteen colonies. Most great fortunes in the colonies had come from land speculation. Unfortunately for Washington and his patrons, however, France shared their bottomless appetite for territory. French troops began building a string of forts down the fertile Ohio River Valley, right around modern-day Pittsburgh. They ran straight through the claims Washington had staked. 

In response, Washington’s powerful patrons maneuvered him again, this time to the head of an armed force. Tall and broad-shouldered, Washington looked the part of a military leader. He also showed real talent for command. So his wealthy backers sent him west at the head of an American and Iroquois militia. He was twenty-two. 

France’s colonial forces far outnumbered Washington’s small party. The year was 1754, Britain and France were at peace, and the French hoped to seize the Ohio River Valley without a shot. As the ragtag Virginian militia marched north toward the French Fort Duquesne, the fort’s commander sent a diplomatic force to intercept Washington and parley. They wanted to make a deal. 

Warned of the French party coming his way, unsure of their intent, Washington made a fateful decision: he would ambush and overpower the approaching men. He marched his forces through the rainy, moonless night and launched a sneak attack. 

What happened next is unclear and disputed. Most think the French diplomatic force, taken by surprise, surrendered without a shot. Probably the inexperienced young Washington then lost control of his warriors. We know his militia and their Iroquois guides murdered and scalped most of the French party, including the ambassador. We also know that, as he sat down to write the governor an update, this political catastrophe wasn’t even the most important thing on his mind. Before getting to the night’s grisly events, Washington spent the first eight paragraphs griping about his low pay. 

A British politician summed up the consequences: “The volley fired by a young Virginian in the backwoods of America set the world on fire.” Washington’s ambush sparked a local conflict. Two years later, it escalated into what Europeans call the Seven Years’ War. The conflict drew in all Europe’s great powers, lasting until 1763. Washington’s corrupt and clumsy land claims helped ignite a long, deadly, global conflict.

This is not the typical origin story Americans have long been taught. A more familiar tale portrays Washington as a disciplined, stoic, honorable leader. It describes a man whose love of liberty led him to risk his life and his fortune for independence. It describes a revolution with ideological origins, not selfish ones. 

This nobler description is accurate. But what is also true—what biographer after biographer has described, but what schoolbooks sometimes overlook—is that land and his own personal fortune were also at the front of the first president’s mind. “No theme appears more frequently in the writings of Washington,” writes one biographer, “than his love for the land—more precisely, his own land.” Another theme is decadence. George Washington was a profligate consumer. He desired the finest carriages, clothes, and furniture. Land rich and cash poor, he financed his luxurious lifestyle with enormous loans from British merchants. 

This relentless quest for wealth dominated Washington’s pre-revolutionary years. After the Seven Years’ War, he amassed huge western claims. A few he bought legitimately. In some cases, he skirted laws, shadily buying under an assumed name or that of a relative. Other lands he acquired at the expense of his own militiamen—or so some of these angry veterans claimed. As a result of this scheming, Washington died the richest American president of all time. One ranking has him as the fifty-ninth richest man in US history. 

How did these private interests shape Washington’s decision to revolt against Britain, two decades later? Elsewhere in this book we will see the American Revolution had many causes, including a newfound and noble ideology of self-determination. We can’t understand the revolution without that. But we would be foolish to ignore the economic self-interest of the founding fathers, like Washington, as well as the war bias that fostered. 

The greatest threat to George Washington’s wealth was continued union with Britain. By the 1770s, the British Crown had invalidated some of Washington’s more questionable landholdings. Britain also pledged most of the Ohio River Valley to Canada—including some of Washington’s most valuable claims. He would have to relinquish all he’d accumulated. 

The same was true for many who signed the Declaration of Independence. Like Washington, these elites had an incredible amount to lose from British colonial policy. Most Americans at the time opposed a revolutionary war, but then most Americans couldn’t vote in those early years. The founding fathers faced a different set of risks and returns. It is no coincidence that they enjoyed privileges that British colonial policy would undermine—trade interests, vast western landholdings, ownership of enslaved people, and the local legislatures they controlled. If this colonial political and commercial class could not get Britain to revise its trade and commercial rulings, only independence could preserve their privileges. 

We need to consider these elite incentives if we’re going to ask why the revolution took place. A lot of people see it as inevitable. But Canada and Australia found peaceful paths to independence from Britain. If we’re going to take the theory behind this book seriously, then shouldn’t the thirteen colonies and Britain have also found a bargain without a fight? The revolution’s slogan was “No taxation without representation.” Why not strike that deal? We will see several answers in this book. One of them, however, is unchecked private interests. These do not explain the American Revolution on their own, but they certainly made peace more fragile.

Added 8Nov: Jeff Hummel disagrees with many aspects of the above account.


Sacred Pains

We see sacred things from afar, even when they are close, so we can see them the same. I’ve previously described the main cost of this as impaired vision. That is, we don’t see sacred things as well, and so make mistakes about them. But another perhaps equally big cost of this is: sacrifice. We feel inclined to sacrifice for the sacred, and to encourage or even force others to sacrifice for it, even when that doesn’t much promote this sacred thing. And sacrifice often involves: pain.

For example, someone recently used “torturing babies” to me as the one thing we can all agree is most wrong. But we actually continue to needlessly torture babies via circumcision. We once did it as a sacrifice for religion, and more recently as a sacrifice for medicine. If we are told that doctors say circumcision is healthy, that’s a sufficient reason to torture babies.

We treat love as sacred, and we often test our lovers and potential lovers, to see how strong their love is. And these tests quite often hurt, a lot. We’d feel more guilty to hurt them in the name of a less noble cause. But love, that cause is so grand as to justify most any pain. Romeo and Juliet suffer stupendously in the Shakespeare tale, and we treat them as having made the right choice, even given their terrible end.

We start wars and we continue them, when we have other options, in the name of sacred causes. Wars result in terrible pain and suffering, which we celebrate as sacrifices for our causes.

As democracy is sacred to us, so are our political fights to influence democracy. And thus so are the sloppy biased arguments we embrace, the mud we throw, the insults we fling, the relations we break off, and the lives we cancel, all in the name of our sacred political fights.

As nature is sacred, we are eager to sacrifice for it. So we are suspicious of solving global warming via nuclear energy or hydroelectricity, as those don’t seem to call for sufficient sacrifice. We’d really rather crush the economy; that will show how much we care about nature.

We feel so elevated to be treating something as sacred. And thus are eager to cause sacrifice in its name. Which often doesn’t seem so elevated an outcome to me.


Why We Don’t Know What We Want

Moons and Junes and Ferris wheels
The dizzy dancing way that you feel
As every fairy tale comes real
I’ve looked at love that way

But now it’s just another show
And you leave ’em laughing when you go
And if you care, don’t let them know
Don’t give yourself away

I’ve looked at love from both sides now
From give and take and still somehow
It’s love’s illusions that I recall
I really don’t know love
Really don’t know love at all

Both Sides Now, Joni Mitchell 1966.

If you look at two things up close, it is usually pretty easy to tell which one is closer, and also to tell their relative sizes, e.g., which one might fit inside the other. But if you look far into the distance, such as toward the sky or the horizon, it gets much harder to tell relative sizes or distances. You might notice that one thing occludes another, but for unknown things in different directions it is hard to say which is closer or bigger.

I see similar effects also for things that are more “distant” in other ways, such as in time, social distance, or hypothetically; it also seems harder to judge relative distance when things are further away in these ways. Furthermore, it seems harder to tell of two abstract descriptions which is more abstract, but easier to tell which of two detailed things has more detail. Thus in the sense of near-far (or construal-level) theory, it seems that we generally find it harder to compare relative distances when things are further away.

According to near-far theory, we also frame our more stable, general, and fundamental goals as more far and abstract, compared to the more near local considerations that constrain our plans. Thus this theory seems to predict that we will have more trouble comparing the relative value of our more abstract values. That is, when comparing two general persistent values, we will find it hard to say which one we value more. Thus near-far theory predicts a big puzzling human feature: we know surprisingly little about what we want. For example, we find it very hard to imagine concrete, coherent, and attractive utopias.

When we see an object from up close, and then we later see it from afar, we often remember its details from when we saw it up close. So similarly, we might learn to compare our general values by remembering examples of concrete decisions where such values were in conflict. And we do often have concrete situations where we are aware that our general values apply to those concrete cases. Such as when we are very hungry, horny, injured, or socially embarrassed. Why don’t we learn our values from those?

Here I will invoke my theory of the sacred: for some key values and things, we set our minds to try to always see them in a rather far mode, no matter how close we are to them. This enables different people in a community to bond together by seeing those sacred things in the same way, even when some of them are much closer to them than others. And this also enables a single person to better maintain a unified identity and commitments over time, even when that person sees concrete examples from different distances at different times in their life. (I thank Arnold Brooks for pointing this out in an upcoming MAM podcast.)

For example, most of us have felt strong feelings of lust, limerence, and attachment to other people at many times during our lives. So we should have plenty of data on which to base rough estimates of what exactly is “love”, and how much we value it compared to other things. But our treating love as sacred makes it harder to use that data to construct such a detailed and unified account. Even when we think about concrete examples up close, it seems hard to use those to update our general views on “love”. We still “really don’t know love at all.”

Because we really can’t see love up close and in detail. Because we treat love as sacred. And sacred things we see from afar, so we can see them together.


What Will Be Fifth Meta-Innovation?

We owe pretty much everything that we are and have to innovation. That is, to our ancestors’ efforts (intentional or not) to improve their behaviors. But the rate of innovation has not been remotely constant over time. And we can credit increases in the rate of innovation to: meta-innovation. That is, to innovation in the processes by which we try new things, and distribute better versions to wider practice.

On the largest scales, innovation is quite smooth, being mostly made of many small-grain relatively-independent lumps, which is why the rate of overall innovation usually looks pretty steady. The rare bigger lumps only move the overall curve by small amounts; you have to focus in on much smaller scales to see individual innovations making much of a difference. Which is why I’m pretty skeptical about scenarios based on expecting very lumpy innovations in any particular future tech.

However, overall meta-innovation seems to be very lumpy. Through almost all history, innovation has happened at pretty steady rates, implying negligible net meta-innovation at most times. But we have so far seen (at least) four particular events when a huge quantity of meta-innovation dropped all at once. Each such event was so short that it was probably caused by one final key meta-innovation, though that final step may have awaited several other required precursor steps.

First, natural selection arose, increasing the rate of innovation from basically zero to a positive rate. For example, over the last half billion years, max brain size on Earth has doubled roughly every 30 million years. Then proto-humans introduced culture, which allowed their economy (tracked by population) to double roughly every quarter million years. (Maybe other meta-innovations arose between life and culture; data is sparse.) Then ten thousand years ago, farming culture allowed the economy (tracked by population) to double roughly every thousand years. Then a few hundred years ago, industrial culture allowed the economy (no longer tracked by population) to double every fifteen years.

So these four meta-innovation lumps caused roughly these four factors of innovation growth rate change: infinity, 120, 240, and 60. Each era of steady growth between these changes encompassed roughly seven to twenty doublings, and each of these transitions took substantially less than a previous doubling time. Thus while a random grain of innovation so far has almost surely been part of a rather small lump of innovation, a random grain of meta-innovation so far has almost surely been part of one of these four huge lumps of meta-innovation.
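As a rough check, these factors can be recomputed from the doubling times quoted above (a sketch only; the doubling times are themselves approximate, so the ratios land near, not exactly on, the quoted factors, and the first transition, from no innovation to some, is effectively an infinite factor):

```python
# Recompute growth-rate factors from the approximate doubling times
# quoted above. Growth rate is ~1/doubling time, so each speedup
# factor is the ratio of successive doubling times.
doubling_times_years = [
    ("life (max brain size)", 30_000_000),
    ("forager culture", 250_000),
    ("farming", 1_000),
    ("industry", 15),
]

for (prev, t_prev), (curr, t_curr) in zip(doubling_times_years, doubling_times_years[1:]):
    factor = t_prev / t_curr
    print(f"{prev} -> {curr}: rate sped up ~{factor:.0f}x")
```

This prints speedups of roughly 120x, 250x, and 67x for the last three transitions, close to the rounded factors quoted above.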

What caused these four huge lumps? Oddly, we understand the oldest lumps best, and the recent lumps worst. But all four seem to be due to better ways to diffuse, as opposed to create, innovations. Lump 1 was clearly the introduction of natural selection, where biological reproduction spreads innovations. Lump 2 seems somewhat clearly cultural evolution, wherein we learned well enough how to copy the better observed behaviors of others. Lump 3 seems plausibly, though hardly surely, due to a rise in population density and location stability inducing a change from a disconnected to a fully-connected network of long-distance travel, trade, and conquest. And while the cause of lump 4 seems the least certain, my bet is the rise of “science” in Europe, i.e., long distance networks of experts sharing techniques via math and Latin, enhanced by fashion tastes and noble aspirations.

Innovation continues today, but at a pretty steady rate, suggesting that there has been little net meta-innovation recently. Even so, our long-term history suggests a dramatic prediction: we will see at least one more huge lump, within roughly another ten doublings, or ~150 years, after which the economy will double in roughly a few weeks to a few months. And if the cause of the next lump is like the last four, it will be due to some new faster way to diffuse and spread innovations.

Having seen a lot of innovation diffusion up close, I’m quite confident that we are now nowhere near fundamental limits on innovation diffusion rates. That is, we could do a lot better. Another factor of sixty doesn’t seem crazy. Even so, it boggles the mind to try to imagine what such a new meta-innovation might be. Some new kind of language? Direct brain state transfer? Better econ incentives for diffusion? New forms of social organization?

I just don’t know. But the point of this post is: we have good reason to think such a thing is coming. And so it is worth looking out for. Within the next few centuries, a single key change will appear, and then within a decade overall econ growth will increase by a factor of sixty or more. Plausibly this will be due to a better way to diffuse innovations. And while the last step enabling this would be singular, it may require several precursors that appear at different times over the prior period.

My book Age of Em describes another possible process by which econ growth could suddenly speed up, to doubling in weeks or months. I still think this is plausible, but my main doubt is that the faster growth I predicted there was not due to better ways to diffuse innovations, making that scenario a substantial deviation from prior trends. But maybe I’m wrong there.

Anyway, I’m writing here to say that I’m just not sure. Let’s keep an open mind, and keep on the lookout for some radical new way to better diffuse innovation.

Added 6a: Note that many things that look like plausible big meta-innovations did not actually seem to change the growth rate at the time. This includes sex, language, writing, and electronic computing and communication. Plausibly these are important enabling factors, but not sufficient on their own.


New Tax Career Agent Test

If a taxpayer approved, the taxes that he or she pays to the government could be diverted, and instead delivered to a “tax career agent”, who would have, in an auction, won and paid for the right to get such diverted payments from that particular taxpayer. For the government, this would be like borrowing, i.e., a way to convert future tax payments into current revenue. This agent would now have incentives to advise and promote this taxpayer, but would have no unusual powers to influence this taxpayer’s behavior.

Previously I used a poll to estimate that career agents today who get 10% of client wages as a result raise those wages by 1.5% on average, suggesting that tax career agents (TCAs) who got ~20% of income might raise those same wages by 3%. But as this effect might be smaller for random workers, and as worker welfare gains would be less than wage gains, I estimated that TCAs raise worker welfare by ~1% on average, which at a real interest rate of 2% suggests a ~$20T present value to the world from adopting TCAs.

In my last post, I sketched a simple experiment design to test the TCA concept: give N random people TCAs, and track their income changes compared to N others who don’t. If TCAs raised wages by 0.3% per year, then given the usual random noise in wage changes, a ten year experiment with N=7000 seems sufficient, but an upper bound cost on this is ~$32M. Which is crazy cheap (~ a part in a million!) relative to TCA social value, but in our broken world we probably need something cheaper.

Here is my new concept: create a TCA for each worker, but get two auction prices per worker, one price if the TCA is active, i.e., free to promote and advise that worker, and a different price if the TCA is instead passive, i.e., prevented from helping this worker. Then randomly pick if the worker gets an active or passive TCA, and use the appropriate bids and prices to pick and charge the new TCA.

If there is sufficient competition in the bidding, then the difference between those two prices is a direct market estimate of how much bidders expect an active TCA to raise worker wages, minus the effort they expect an active TCA to put in to make this happen. This estimate is available per worker, and immediately at the experiment start. So even an N=100 experiment at a TCA expense cost of ~$1M could give valuable data!

In addition to getting TCAs to estimate worker wage increases minus TCA costs, we might also want to get workers to estimate their welfare gains. And we could do this by putting workers into pairs, only one of which gets an active TCA, and making them bid against each other to see who gets that active TCA. Bids should give direct estimates of worker value (i.e., increased wages minus extra effort or inconvenience) if the winning bidder pays the lower bid price. These worker value estimates are also available per worker, and immediately at experiment start. And the extra revenue from worker bids cuts the cost of the experiment.

TCAs and workers would have strong incentives to make good estimates, but their estimates would still be based on pretty limited information. To get better informed estimates, it would help to spread this experiment out across time, and give later participants as much info as possible about earlier participant outcomes. The more time that elapses between the first and last TCA auctions, the more later participants will know, but the longer it will take to learn results from this experiment. Note that such a sequential approach also allows the experiment to better manage its expenses in the face of initially uncertain costs per worker participant.

Here is a more detailed design based on the above concepts. Offer random workers a sufficient compensation (1% tax rebate?) so that most who are invited agree to participate for Y years. (If Y is short, pick post-college-age workers, so their choice of more schooling is less of an issue.) Participants allow substantial info on them, including their taxes, to be revealed to experimenters and other participants. Match participants into pairs who seem as similar as possible, then auction off these pairs one at a time in sequence over many years, showing all qualified bidders info on outcomes for all prior participants.

Each worker in each pair is asked for the bid B they would pay for a higher chance to be assigned the active, as opposed to passive, TCA for Y years. In addition, each pair auction has eight TCA auction prices, each qualified TCA bidder can bid on any or all of these eight prices, and the highest price wins each price auction, paying the second highest among its submitted prices. To prevent collusion within worker pairs, workers are given little info on their pair partners until they have set their bids.

The eight prices come from all combinations of three binary factors. First, there are the two workers, who will differ somewhat in their info. Second, there are different prices to become an active or passive TCA for Y years. Third, there are different prices depending on if the worker submitted the higher or lower bid to get the active TCA. Worker bids are kept secret until all eight TCA auction prices are set. Then the worker who bid more gets a 2/3 chance of being assigned the active TCA, and a 1/3 chance of being assigned the passive TCA. Since winning thus buys an extra 1/3 chance of an active TCA, a worker who values that agent at V should bid up to B = V/3; so given a bid B, we can estimate their added value V of having an active agent via V = 3B.

Note that at a 2% discount rate, the present value of 20% of the median US wage of $31K is ~$450K, 1% of which is ~$4.5K, implying a bid of ~$1.5K, an amount most workers can afford to pay.
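To make the V = 3B bid logic concrete, here is a toy numeric check (the ~$4.5K value and ~$1.5K bid are the rough figures from the estimate above, not measurements):

```python
# Why V = 3B: winning the pair auction raises a worker's chance of
# getting the active TCA from 1/3 to 2/3, i.e., it buys an extra 1/3
# chance. A worker who values an active TCA at V should thus bid up
# to about V/3, so an observed bid B implies V = 3B.
p_active_if_win = 2 / 3
p_active_if_lose = 1 / 3
extra_chance = p_active_if_win - p_active_if_lose  # = 1/3

V = 4_500.0                   # rough worker value of an active TCA, in $
max_bid = extra_chance * V    # rational willingness to pay: ~$1.5K
implied_V = 3 * max_bid       # recovering V from the observed bid
print(round(max_bid), round(implied_V))  # -> 1500 4500
```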

This experimental design seems sufficient to extract key info re TCAs at a low cost. But it still needs more work. For example, we need tax experts to think about which parts of typical tax returns to include or not in TCA payments. We need finance experts to think about how to get sufficient numbers of competing TCA bidders, and how the experiment can hold and invest auction assets deposited, to minimize the costs and risks associated with paying off all TCAs as promised. We need labor experts to think about what worker info is sufficient to inform TCA bids. And we need legal experts to figure out how we can do all this within existing law. Any such experts want to help?

Added 20Nov: A similar test could be applied to my Buy Health proposal. For each possible patient get auction participants to bid on their price to provide health and life insurance separately, where different orgs provide the different types and aren’t allowed to coordinate, or as a bundle from a single org that can coordinate. See the per-patient estimated difference in death risk and medical spending.


Testing Tax Career Agents

Agents who are paid a larger fraction of their client’s income have a stronger incentive to promote and advise those clients. Thus the tax career agent idea makes more sense for governments that tax a larger fraction of citizen income. So they make less sense for city or state governments. But a national government may not be willing to try it without seeing results from a smaller experiment. So how could we make a smaller experiment to test the concept?

The big advantage of tax career agents is that they are basically free to create. As the government already sits in that role, all it has to do is transfer that role to someone else, at almost no net cost to anyone. But to test the idea of tax career agents, we don’t have to rely on the fact that such agents are free for governments to create; we can pay extra to create such agents privately, just for the test.

To create a test tax career agent regarding client C, we could hold an auction to see who is willing to pay the most to, every year for the next Y years, be paid an amount equal to what C pays that year in income taxes. If agent A were a real tax career agent, the money paid to A would come from what C actually pays their government in taxes. But for the purpose of an experiment, this money paid to A could instead come from the budget of the experiment. This alternate payment source should not matter much for A and C behavior.

So auction winners would first give large auction win payments to the experiment, after which the experiment would commit to paying them back each year when their clients pay taxes. In the meantime, the experiment would hold and invest these assets. As the experiment should be able to invest as well as agents, auction competition should induce the net cost of this experiment to be mainly the time and effort costs that agents expect to make advising and promoting clients.

In a prior post, I estimated that an agent A who got 20% of client C’s wages would increase those wages by 1-3%. As agents wouldn’t on average put in more effort than they get paid for, that gives us an upper bound to the financial size of agent efforts. Thus if N clients with average income I each get a test tax career agent for Y years, the auction revenue to be collected and invested would be ~20%*Y*N*I, and the cost to create these agents would be ~1-3%*Y*N*I. (For Y large, these amounts are lower due to time discounting.) Note that much of this “cost” is actually a transfer to clients, who we expect to enjoy higher incomes.

Of course we’d want to track a similar-sized control group of N clients who didn’t get test tax career agents. And if we wanted to give experimental subjects the choice of whether to create such an agent, then if only a fraction V volunteer to get such an agent, we’d want to track ~N/V workers who were offered the choice, and another ~N/V control workers not offered the choice. Note that we’d also need funds to manage the experiment, to collect data on participants, and to analyze the results.

And that’s a simple outline of the experiment design, including a rough estimate of its cost. In the U.S., 3% of $31K median income over ten years is $9.3K, which for N=1000 comes to $9.3M. This cost would of course be less for lower-income workers. Anyone want to do an analysis of what size N we’d need to see significant results given this expected effect size?
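For concreteness, the cost arithmetic above can be written out as a back-of-envelope sketch (numbers from this post; I ignore time discounting, which only lowers these totals):

```python
# Upper-bound cost of creating N test tax career agents:
# cost ~= effort_share * income * years * N.
median_income = 31_000   # US median income, $/yr
effort_share = 0.03      # upper bound on agent effort, 3% of wages
years = 10
n_agents = 1_000         # workers given test agents

cost = effort_share * median_income * years * n_agents
print(f"${cost / 1e6:.1f}M")  # -> $9.3M
```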

Added 21Oct: Christopher McDonald did a Sample Size Analysis for Tax Career Agent Experiment. Assuming tax career agents improve wages on average by 0.3% per year, he finds that you’d need 7000 subjects over a ten year experiment to get a true-positive probability of 80% and a false-positive probability of 5%. So applying the same estimates as above, that gives an upper-bound cost of ~$30M. In my next post on this, I’ll outline a cheaper experiment design.

Added 12Nov: I just realized that I’d previously mis-calculated the wage rise to be 1-3%, instead of 10-30%. A smaller experiment would of course suffice to detect such a larger effect.


Fossil Future

My main intellectual strategy is to explore important neglected topics where I can find an original angle to pursue. As a result, I tend to lose interest in topics as they get more attention. Which is why I’ve avoided climate change. Yes, it is plausibly important, but I’ve always seen it get plenty of attention, and I haven’t yet found an original angle on it. So I’ve let it lie.

But on the recommendation of my colleague Bryan Caplan, I’ve just read Alex Epstein’s contrarian new book Fossil Future. And my overall review is that, on the big issues, he’s basically right: CO2 induced planetary warming is going slowly, doesn’t remotely threaten extinction, and its harms will be more than offset by gains from our growing fossil-energy-powered wealth. It would be crazy to actually try to end fossil fuel use by 2050, as many are now “committing” to do; fossil fuels will likely remain our most cost-effective way to do many useful things long after that.

Most fundamentally, Epstein diagnoses the key problem well: the main emotional energy behind climate activism is the desire to stop humans from having any substantial impact on nature. In my terminology, they see nature as sacred, and thus as eternal, pure, not in conflict with other sacred things, and to be sharply distinguished from, not mixed with, and not at any price sacrificed for, profane things. We are not to calculate such choices, but to intuit them, aesthetically.

Epstein is right that our elite academic and media systems focus on a few celebrated and oft-quoted climate expert/activists, who are not that representative of the larger world of experts. And these activists are opposed to nuclear fission, nuclear fusion, and hydroelectricity, all of which avoid CO2 warming. They even sympathize with those who oppose new solar and wind energy projects, and any land developments that have substantial impacts on nature. It seems that, though they may deny it in public, what they really want is a smaller human world, with fewer people using less materials and energy.

My main complaint (echoing Caplan) is that Epstein avoids and sometimes seems to reject econ-style marginal thinking. For example, he doesn’t really distinguish the marginal value of more fossil energy from that of more other kinds of modern inputs and capital. And he doesn’t seem to want to admit that CO2 emissions might have mild negative externalities which could justify mild taxes. But given how big these intellectual errors are, I’m impressed that Epstein seems so consistently right on most everything else. I guess that’s because activists on the other side also tend to be little influenced by marginal thinking.

Epstein acts as if the only force strong enough to resist pressures from seeing nature as sacred is seeing something else as even more sacred. And for that Epstein picks: human flourishing. He treats that as so sacred that not even mild taxes on fossil fuels can be tolerated. As an economist I’m sad to think we can’t make a more reasonable choice in the middle, where everything we value gets traded off via conscious calculation mediated by mundane prices. But, alas, I don’t know that Epstein is wrong on this key sacredness point.

Added 8a: I think we see this Nature-as-sacred emotional energy in those eager to dismiss concerns about falling fertility leading to a smaller human population.


Replace G.P.A. With G.P.C.?

Most schools assign each student a “grade point average”, i.e., a number that averages over many teacher evaluations of that student. Many schools also assign each teacher an “average student evaluation”, i.e., a number that averages over many student evaluations of that teacher. Many workplaces similarly post evaluations which average worker performance ratings across different tasks. And sport leagues often post rankings of teams, which average over team performance across many contests.

A lot rides on such metrics, even though they are simple aggregates over contests of varying difficulty, which creates incentives for players to “game” these metrics. For example, students seek to take, and teachers seek to teach, easy/fun classes; workers seek to do easy tasks, and sport teams seek to play easy opponents.

Yet we have long known of a better way, one I described briefly in 2001: stat-model-based summary evaluations.

For example, imagine that a college took all of their student grade transcripts as data, and from that made a best-fit statistical linear regression model. Such a model would predict the grade of each student in each class by using a linear combination of features of each class, such as subject, location, time of day and week, and also “fixed effects” for dates, professors, and especially students. That is, the regression formula would include a term in its sum for each student, a term that is a coefficient for that student, times one or zero depending on whether that datum is about a grade for that student.

Such a fixed effects regression coefficient regarding a student should effectively correct for whether the student took easy or hard majors, classes, profs, times of day, year of degree, etc. Furthermore, standard stat methods would give us a “standard error” uncertainty range for this coefficient, so that we are not fooled into thinking we know this parameter more precisely than we do.
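To make this concrete, here is a minimal sketch of such a regression in Python. The transcript data, student names, and the single “prof” class feature are all invented for illustration; a real model would add dummy columns for subject, time of day, date, and so on, and would also report standard errors for each coefficient.

```python
import numpy as np

# Hypothetical toy transcript (invented for illustration): a real
# dataset would add dummies for subject, time of day, date, etc.
records = [
    ("alice", "smith", 4.0), ("alice", "jones", 3.0),
    ("bob",   "smith", 3.8), ("bob",   "smith", 3.9),
    ("carol", "jones", 3.3), ("carol", "jones", 3.4),
]
students = sorted({s for s, _, _ in records})
profs = sorted({p for _, p, _ in records})

# Design matrix: one 0/1 dummy column per student, plus one per prof
# (dropping the first prof to avoid perfect collinearity).
X = np.zeros((len(records), len(students) + len(profs) - 1))
y = np.array([g for _, _, g in records])
for i, (s, p, _) in enumerate(records):
    X[i, students.index(s)] = 1.0
    if profs.index(p) > 0:
        X[i, len(students) + profs.index(p) - 1] = 1.0

# Least-squares fit; the first len(students) coefficients are the
# student fixed effects, i.e. the "grade point coefficients".
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
gpc = dict(zip(students, coef[:len(students)]))
```

In this toy data bob has the higher raw GPA (3.85 versus carol’s 3.35), but bob only ever took the easy-grading prof; the fixed-effects estimates correct for that and reverse the ranking (bob ≈ 2.85, carol ≈ 3.35).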

Thus a “grade point coefficient”, i.e., a G.P.C., should do better than a G.P.A. as a measure of the overall quality of each student. And the more that potential employers, grad schools, etc. focused on G.P.C.s instead of G.P.A.s, the less incentive students would have to search out easy classes, profs, etc. We could do the same for student evaluations of professors, and the more we relied on prof fixed effects to judge profs, then the less incentives they would have to teach easy classes, or to give students As to bribe them to give high evaluations.

The general idea is simple: fit performance data to a statistical model that estimates each performance outcome as a function of the various context parameters that one would expect to influence performance, plus a parameter representing the quality of each contestant. Then use those contestant parameter estimates as our best estimates of contestant quality. Such statistical models are pretty easy to construct, and most universities contain hundreds of people who are up to this task. And once such models are made and listened to, then contestants should focus more on improving their quality, and less on trying to game the evaluation metric.

Yes, as new data comes in, the models would get adjusted, meaning that contestant estimates would change a little over time, even after a contestant stopped having new performances. Yes, there will be questions of how many context parameters to include in such a model, but there are standard stat tools for addressing such questions. Yes, even after using such tools, there will remain some degrees of freedom regarding the types and functional forms of the model, and how best to encode key relevant factors. And yes, authorities can and would use those remaining degrees of freedom to get evaluation results more in their preferred directions.

But even so, this should be a huge improvement over the status quo. Instead of students looking for easy classes to get easier As, they’d focus instead on improving their overall abilities.

To prove this concept, all we need is one grad student (or exceptional undergrad) with stat training willing to try it, and one university willing to give that student access to their student transcripts (or student evals of profs). Once the constructed models passed some sanity tests, we’d try to get that university to let its students put their G.P.C.s onto their student transcripts. Then we’d try to get the larger world to care about G.P.C.s. So, who wants to try this?

P.S. I’ve posted previously on how broken are many of our eval systems, and how a better entry-level job eval system could allow such jobs to compete with college.

Added: This paper and this paper show in detail how to do the stats.

One could get more than one useful number per student by adding terms that interact the student fixed effect terms with other features of classes. That second paper shows that a two-number system is more informative, but rejects it because “gains realized with the two-component index are offset by the additional complexity involved in explaining the two-component index to students, employers, college administrators and faculty.”

One might allow students to experiment with classes in new subjects by including a term that encodes such cases. One might include terms for race, gender, age, etc. of students, though I’d prefer transcripts to show student GPCs with and without such terms.

Added 17Oct: This book by Valen Johnson considers in detail models like those I describe above, wherein the performance of a student in a class is a linear combination of a student term, a class term, and an error. Except that sometimes instead of estimating a grade point, they instead estimate discrete grades, using several terms per class to describe the underlying parameter cutoffs between different discrete grades.

The student term sets an “adjusted GPA” and Johnson proposes to “allow students to optionally report adjusted GPAs on their transcripts.” He reports that when he attempted but failed to get Duke to do this in 1996, this was the biggest issue:

When the achievement index was considered for use as a mechanism to adjust GPAs for students at Duke, instructors who regularly assigned uniformly high grades quickly realized that the achievement index adjustment will make their grades irrelevant in the calculation of student GPAs. Worse still, many students notice the same thing. To thwart the adoption of the achievement index, these high-grading instructors and their student benefactors adopted the position that an A represented an objective assessment of student performance. An A was an A was an A. For them, it represented “excellent” performance on some well-defined but unobservable scale. Indeed, by the end of the debate, several literary theorists had finally identified an objective piece of text: a student grade. (p.222)

Apparently Johnson and others have long tried but failed to get schools to adopt GPCs and variations on them.


More Academic Prestige Futures

Academia functions to (A) create and confer prestige to associated researchers, students, firms, cities, and nations, (B) preserve and teach what we know on many general abstract topics, and (C) add to what we know over the long run. (Here “know” includes topics where we are uncertain, and practices we can’t express declaratively.)

Most of us see (C) as academia’s most important social function, and many of us see lots of room for improvement there. Alas, while we have identified many plausible ways to improve this (C) function, academia has known about these for decades, and has done little. The problem seems less a lack of knowledge, and more a lack of incentives.

You might think the key is to convince the patrons who fund academia to change their funding methods, and to make funding contingent on adopting other fixes. After all, this should induce more of the (C) that we presume that patrons seek. Problem is, just like all the other parties involved, patron motives also focus more on function (A) than on (C). That is, state, firm, and philanthropic patrons of academia mainly seek to buy what academia’s other customers, e.g., students and media, also buy: (A) prestige by association with credentialed impressiveness.

Thus offering better ways to fund (C) doesn’t help much. In fact, history actually moved in the other direction. From 1600 to 1800, science was mainly funded via prizes and infrastructure support. But then prestigious scientific societies pushed to replace prizes with grants. Grants give scientists more discretion, but are worse for (C). Scientists won, however; now grants are standard, and prizes rare.

But I still see a possible route to reform here, based on the fact that academics usually deny that their prestige is arbitrary, to be respected only because others respect it. Academics instead usually justify their prestige in function (A) as a proxy for the ends of functions (B) and (C). That is, academics tend to say that your best way to promote the preservation, teaching, and increase of our abstract knowledge is to just support academics according to their current academic prestige.

Today, academic prestige of individuals is largely estimated informally by gossip, based on the perceived prestiges of particular topics, institutions, journals, funding sources, conferences, etc. And such gossip estimates the prestige of each of these other things similarly, based on the prestige of their associations. This whole process takes an enormous amount of time and energy, but even so it attends far more to getting everyone to agree on prestige estimates, than to whether those estimates are really deserved.

Academics typically say that such a sacred end as intellectual progress is so hard to predict or control that it is arrogant of people like you to think you can see how to promote such things in any other way than to just give your money to the academics designated as prestigious by this process, and let them decide what to do with it. And most of us have in fact accepted this story, as this is in fact what we mostly do.

Thus one way that we could hope to challenge the current academic equilibrium is to create better clearly-visible estimates of who or what contributes how much to these sacred ends. If academics came to accept another metric as offering more accurate estimates than what they now get from existing prestige processes, then that should pressure them into adjusting their prestige ratings to better match these new estimates. Which should then result in their assigning publications, jobs, grants etc. in ways that better promote such ends. Which should thus improve intellectual progress, perhaps by large amounts.

And as I outlined in my last post, we could actually create such new better estimates of who deserves academic prestige, via creating complex impact futures markets. Pay distant future historians (e.g., in a century or two) to judge then which of our academic projects (e.g., papers) actually better achieved adding to what we know. (Or also achieved preserving and teaching what we know.) Also create betting markets today that estimate those future judgments, and suggest to today’s academics and their customers that these are our best estimates of who and what deserve academic prestige. (Citations being lognormal suggests this system’s key assumptions are a decent approximation.)

These market prices would no doubt correlate greatly with the usual academic prestige ratings, but any substantial persistent deviations would raise a question: if, in assigning jobs, publications, grants, etc., you academics think you know better than these market prices who is most likely to deserve academic prestige, why aren’t you or your many devoted fans trading in those markets to make the profits you think you see? If such folks were in fact trading heavily, but were resisted by outsiders with contrary strong opinions, that would look better than if they weren’t even bothering to trade on their supposed superior insight.

Academics seeking higher market estimates about themselves and their projects would be tempted to trade to push up those prices, even though their private info didn’t justify such a move. Other traders would expect this, and push prices back down. These forces would create liquidity in these markets, and subsidize trading overall.

Via this approach, we might reform academia to better achieve intellectual progress. So who wants to make this happen?


Complex Impact Futures

Imagine a world of people doing various specific projects, where over the long run the net effect of all these projects is to produce some desired outcomes. These projects may interact in complex ways. To encourage people to do more and better such projects along the way, we might like a way to eventually allocate credit to these various projects for their contributions to desired outcomes.

And we might like to have good predictions of such credit estimates, available either right after project completion, so we can praise project supporters, or available before projects start, to advise on which projects to start. Such a mechanism could be applied to projects within a firm or other org re achieving that org’s goals, or to charity projects re doing various kinds of general good, or to academic projects re promoting intellectual progress. In this post, I outline a way to do all this.

First, let us assume that we have available to us “historians” who could in groups judge after the fact which of two actual projects had contributed the most to desired outcomes. (And assume a way to pay such historians to make them sufficiently honest and careful in such judgments.) These judgments might be made with noise, well after the fact, and at great expense, but are still possible. (Remember, the longer one waits to judge, the more budget one can spend on judging.)

Consider two projects that have relative strengths A and B in terms of the credit each deserves for desired outcomes. Assume further that the chance that a random group of historians will pick A over B is just A/(A+B). This linear rule is a standard assumption made for many kinds of sporting contests (e.g. chess), with contestant strengths being usually distributed log-normally. (E.g., chess “Elo rating” is proportional to a log of such a player strength estimate.)
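This linear A/(A+B) rule is the Bradley-Terry model of paired comparisons. As a minimal sketch (the contestant names and win/loss records are hypothetical), strengths can be fit from judged pairs via the classic Zermelo minorization-maximization iteration, and an Elo-style rating is then just a scaled log of the strength:

```python
import math
from collections import defaultdict

def win_prob(a, b):
    # Bradley-Terry: chance a strength-a contestant beats a strength-b one.
    return a / (a + b)

def elo_rating(strength, base=1500.0, scale=400.0):
    # An Elo-style rating is proportional to the log of the strength.
    return base + scale * math.log10(strength)

def fit_strengths(outcomes, iters=100):
    """Fit strengths from (winner, loser) records via the classic
    Zermelo / minorization-maximization iteration."""
    wins = defaultdict(int)
    games = defaultdict(int)   # unordered pair -> number of matches
    for w, l in outcomes:
        wins[w] += 1
        games[tuple(sorted((w, l)))] += 1
    players = {p for pair in games for p in pair}
    s = {p: 1.0 for p in players}
    for _ in range(iters):
        new = {}
        for i in players:
            denom = sum(n / (s[a] + s[b])
                        for (a, b), n in games.items() if i in (a, b))
            new[i] = wins[i] / denom
        total = sum(new.values())
        # Normalize so strengths average to one (only ratios matter).
        s = {p: v * len(players) / total for p, v in new.items()}
    return s
```

For example, if hypothetical project A beats B in three of four judgings, the fitted strengths settle near a 3:1 ratio, so the model predicts A wins 75% of future judgings.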

Given these assumptions, project strength estimates can be obtained via a “tournament parimutuel” (a name I just made up). Let there be a pool of money associated with each project, where each trader who contributes to a pool gets the payoffs from that pool in proportion to their contributions.

If each project were randomly matched to another project, and random historian groups were assigned to judge each pair, then it would work to let the winning pool divide up the money from both pools, just as if there had been a simple parimutuel on that pair. Traders would then tend to set the relative amounts in each pool in proportion to the relative strengths of associated projects.

If judging were very expensive, however, then we might not be able to afford to have historians judge every project. But in that case it could work to randomize across projects. Pick sets of projects to judge, throw away the rest, and boost the amount in each retained pool by moving money from thrown-away (now boost-zero) pools into retained pools in proportion to pool size.

All you have to do is make sure that, averaged over the ways to randomly throw away projects, each project has a unit average boost. For example, you could partition the projects, and pick each partition set with a chance proportional to its pool size. With this done right, those who invest in pools should expect the same average payout as if all projects were judged, though such payouts would now have more variance.
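A minimal sketch of these pool mechanics (pool names, traders, and amounts are all hypothetical): a judged pair pays the combined money of both pools to the winning pool's contributors pro rata, and discarded pools are folded into retained ones by a common scale factor, which is the "in proportion to pool size" rule and conserves total money:

```python
def settle_pair(pool_a, pool_b, a_wins):
    """Pay the combined money of both pools to the winning pool's
    contributors, pro rata to their contributions."""
    winners = pool_a if a_wins else pool_b
    pot = sum(pool_a.values()) + sum(pool_b.values())
    stake = sum(winners.values())
    return {trader: pot * amt / stake for trader, amt in winners.items()}

def boost_retained(pools, retained):
    """Fold the money of discarded pools into retained pools, scaling
    each retained pool by a common factor (i.e., in proportion to its
    size), so total money is conserved."""
    discarded = sum(sum(p.values())
                    for name, p in pools.items() if name not in retained)
    kept_total = sum(sum(pools[name].values()) for name in retained)
    factor = 1.0 + discarded / kept_total
    return {name: {t: amt * factor for t, amt in pools[name].items()}
            for name in retained}
```

The unit-average-boost condition then lives in how the retained set is drawn (e.g., picking each partition set with chance proportional to its pool size), not in these payout functions themselves.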

Within a set of projects chosen for judging, any way to pair projects to judge should work. It would make sense to pair projects with similar strength estimates, to max the info that judging gives, but beyond that we could let judges pick, at the last minute, pairs they think easier to judge, such as projects that are close to each other in topic space, or similar in methods and participants. Or pairs that they would find interesting and informative to judge.

Historians might even pick random projects to judge, and then look nearby to select comparison projects, as long as they ensured a symmetric choice habit, or corrected for asymmetries. (It can also work to allow judges to sometimes say they can’t judge, or to rank more than two projects at the same time.) It would be good if the might-be-paired network of connections between projects were fully connected across all projects.

Parimutuel pools can make sense when all pool contributions are made at roughly the same time, so that contributors have similar info. But when bets will be made over longer time durations, betting markets make more sense. Thus we’d like to have a “complex impact futures” market over the various projects for most of our long duration, and then convert such bets into parimutuel tournament holdings just before judging.

We can do that by letting anyone split cash of $1 into N betting assets of the form “Pays $x_p into pool p”, one for each of the N projects p, where x_p refers to the market price of this asset at the time when betting assets are converted to claims in a tournament parimutuel. At that time, each outstanding asset of the form “Pays $x_p into pool p” is converted into $x_p put into the parimutuel pool for project p.

This method ensures that project pool amounts have the ratios x_p. Note that 1 = Sum_{p=1}^{N} x_p, that a logarithmic market scoring rule would work fine for trading in these markets, and that via a “rest of field” asset we don’t need to know about all projects p when the market starts.
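A minimal sketch of this conversion step, assuming prices come from a logarithmic market scoring rule, so that they are positive and sum to exactly 1 as required; the project names, share quantities, and liquidity parameter are all hypothetical:

```python
import math

def lmsr_prices(quantities, b=100.0):
    """Prices from a logarithmic market scoring rule with liquidity
    parameter b; prices are positive and sum to exactly 1."""
    m = max(quantities.values())   # subtract max for numerical stability
    exps = {p: math.exp((q - m) / b) for p, q in quantities.items()}
    z = sum(exps.values())
    return {p: e / z for p, e in exps.items()}

def convert_to_pools(holdings, prices):
    """At conversion time, each unit of the asset "Pays $x_p into
    pool p" deposits the current price x_p into project p's pool."""
    pools = {p: {} for p in prices}
    for trader, units in holdings.items():
        for p, n in units.items():
            pools[p][trader] = pools[p].get(trader, 0.0) + n * prices[p]
    return pools
```

Since the N prices sum to 1, a trader who split $1 into one unit of each asset ends up depositing exactly $1 across the pools, matching the original $1-splitting construction.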

Thus traders in our complex impact futures markets should treat prices of these assets as estimates of the relative strength of projects p in the credit judging process. They’ll want to buy projects whose relative strength seems underestimated, and sell those that seem overestimated. And so these prices right after a project is completed should give speculators’ consensus estimate on that project’s relative credit for desired outcomes. And the prices on future possible projects, conditional on the project starting, give consensus estimates of the future credit of potential projects. As promised.

Some issues remain to consider. For example, how could we allow the judging of pairs, and the choice of which pairs to judge, to be spread out across time, while allowing betting markets on choices that remain open to continue as long as possible into that process? Should judgments of credit just look at a project’s actual impact on desired outcomes, or should they also consider counterfactual impact, to correct for unforeseeable randomness, or others’ misbehavior? Should historians judge impact relative to resources used or available, or just judge impact without considering costs or opportunities? Might it work better to randomly pick a particular outcome of interest, and then only judge pairs on their impact re that outcome?
