Futarchy is a new financial-market-based form of governance. First described in 1999, it has seen several apparently successful field experiments over the last two years. Here I consider how to try it in the US federal government. I first consider its requirements as a method, then consider four different kinds of applications.
Requirements
To consider its applications, I don’t need to discuss how futarchy works, just its requirements and features. The requirements for any method depend on whether one is using that method, already convinced of its value, or testing it to see if it can be shown to work better than other methods.
Futarchy estimates key outcomes given decision options. To use futarchy, one needs:
for each decision, a set of discrete decision options,
one or more after-the-fact-measurable outcome metrics by which to judge decisions,
a community, some of whom could acquire info on how outcomes vary with options,
a currency that members value, which can be paid to them conditional on performance,
numerical outcome values that can eventually be made visible to this community,
options that plausibly lead to substantially different (>~1%) expected outcomes,
options that can be made visible to members before decisions are made, and
either direct decisions by futarchy, or participation in it by some who have decider info.
To test futarchy, via statistics on how different trials work out, one additionally needs:
for significant stats, many decisions with similar options and outcomes,
for stats soon, short durations from decisions to outcome measurements, and
another status-quo decision-making method that can advise the same decisions.
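One way such a statistical test might look, as a minimal Python sketch: suppose each of many similar decisions is assigned to either futarchy or the status-quo method, and the same outcome metric is measured afterward. The outcome numbers, the variable names, and the choice of an exact permutation test are all illustrative assumptions here, not results from any actual trial and not a prescribed test design.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcome measurements (higher is better) for decisions
# made by each method. These numbers are invented for illustration.
futarchy_outcomes = [1.2, 0.9, 1.5, 1.1]
status_quo_outcomes = [0.8, 1.0, 0.7, 0.9]

def permutation_test(treated, control):
    """Exact one-sided permutation test on the difference in means.

    Returns (observed difference, fraction of relabelings whose
    difference is at least as large as the observed one).
    """
    observed = mean(treated) - mean(control)
    pooled = treated + control
    n, k = len(pooled), len(treated)
    at_least_as_large = 0
    total = 0
    # Enumerate every way of relabeling k of the n outcomes as "treated".
    for idx in combinations(range(n), k):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(a) - mean(b) >= observed - 1e-9:
            at_least_as_large += 1
    return observed, at_least_as_large / total

observed_diff, p_value = permutation_test(futarchy_outcomes, status_quo_outcomes)
print(f"difference in means: {observed_diff:.3f}, one-sided p: {p_value:.3f}")
```

With only a handful of trials the p-value cannot get very small, which is why many similar decisions with short durations to outcome measurement matter for testing.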
Futarchy decisions are especially well-informed and robust to manipulation. The use value of a futarchy application area thus rises with:
how bad current decisions are, relative to futarchy’s potential,
how much is at stake in typical decisions there,
the rate at which similar decisions arise there, and
how valuable it is to have demonstrably neutral decisions, free of politics and corruption.
The value for testing also rises with topic prestige and the amount of data that can be collected.
For US federal applications, the obvious community is the world (or maybe just US citizens). And as betting regulations do not apply to the US government, future US dollars are an obvious currency.
Here are four application strategies.
1. Biggest Decisions
One strategy is to apply futarchy to all of the biggest US policy proposals, such as bills to be signed by POTUS, or major executive orders and agency decisions. Estimate a standard set of aggregate metrics, such as US time-weighted-averages of stock prices, GDP, unemployment, lifespans, population, or satisfaction, each conditional on adopting a proposal. Even if usually no metric prefers a policy or its rejection, the few other cases might give great value. And futarchy could weigh in not just before a final decision, but early in the process to winnow possibilities.
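As a rough illustration of how such conditional estimates might be read, here is a minimal Python sketch. The metric names, the numbers, and the 1% threshold (borrowed from the requirement above that options differ by more than about 1%) are hypothetical assumptions, not part of any actual market design.

```python
from dataclasses import dataclass

@dataclass
class ConditionalEstimate:
    metric: str
    if_adopted: float    # market estimate of the metric given adoption
    if_rejected: float   # market estimate of the metric given rejection
    higher_is_better: bool = True

def preference(est, min_rel_gap=0.01):
    """Return 'adopt', 'reject', or 'no clear preference' for one metric."""
    base = abs(est.if_rejected)
    if base == 0:
        return "no clear preference"
    rel_gap = (est.if_adopted - est.if_rejected) / base
    if not est.higher_is_better:
        rel_gap = -rel_gap  # for metrics like unemployment, lower is better
    if rel_gap > min_rel_gap:
        return "adopt"
    if rel_gap < -min_rel_gap:
        return "reject"
    return "no clear preference"

# Invented estimates for a single hypothetical proposal.
estimates = [
    ConditionalEstimate("gdp_index", 103.0, 100.0),
    ConditionalEstimate("unemployment_rate", 4.02, 4.00, higher_is_better=False),
    ConditionalEstimate("lifespan_years", 78.5, 78.6),
]
summary = {e.metric: preference(e) for e in estimates}
print(summary)
```

In this made-up case only one metric clears the threshold, which matches the expectation that most metrics will usually show no clear preference.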
This strategy should achieve large value, and has the fewest open design questions, but would generate less data to test futarchy relative to other methods, or to test variations to improve it.
2. Small Common Decisions
Another strategy is to target small but common decision types where we can identify relevant standard outcome measures. For example, for each project with a budget and/or a deadline, estimate actual spending and delivery dates, and also estimate these conditional on a few standard project changes (e.g., to leadership, budget, or requirements). A second example is, for each potential new hire, estimate their evaluation years later if hired. (Privacy issues here?)
This strategy can be varied in scale, and at larger scales might collect much data quickly, allowing us to test futarchy relative to other methods, and to test variations to improve it.
3. Agency Specific Metrics
A third strategy is to pick particular outcome metrics for particular agencies, and estimate those agency metrics conditional on key agency policy changes. For example, we might evaluate crime agencies on crime rates, prisons on recidivism rates, medical agencies on client lifespans, transportation agencies on passenger throughput and prices, energy agencies on energy quantity or prices, environmental agencies on environmental health, and cultural agencies on US cultural prestige metrics. For each agency, evaluate all major agency decisions, including changes to leadership, budgets, and policies.
This strategy needs more work to pick particular metrics for particular agencies. This would collect more data than strategy 1, but less than strategy 2.
4. Project Specific Metrics
A fourth strategy picks particular outcome metrics for particular kinds of agency decisions. For example, energy savings for particular DoE energy projects, GDP and employment for Federal Reserve interest rate changes, and project-specific outcome metrics for particular procurement and infrastructure choices. Estimate these metrics conditional on funding or defunding projects, or on changing leadership.
This strategy needs the most further work to pick metrics for particular kinds of decisions.
What about the issue of out-of-band financial rewards skewing the betting process?
For simplicity let's say there's a binary decision to be made: does the Fed lower rates (A) or raise rates (B)? Each possibility gets its own betting market, which the leaders at the Fed presumably consult in order to decide A or B.
Now if we presume that these market participants (bettors) have no financial stake other than the betting market itself, then the incentives are clear: Rational bettors will bet according to their true beliefs. This is what we want to achieve, an honest signal from the market.
However, if that presumption fails then it may become rational for a bettor to bet against their true beliefs, if that will nudge the outcome (A or B) in a direction that is favorable to them, external to the betting market. This problem is obviously most acute for wealthy bettors who may have a lot to gain or lose depending on outcome.
It seems this could work best in areas where the possibility of out-of-band incentives is low. Fed policy is the opposite case: almost everyone has some financial stake in what the Fed does.
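To make the trade-off concrete, here is a toy calculation in Python. It assumes a bettor can shift the probability of their preferred decision by some amount at the cost of an expected trading loss, and it ignores other traders who might profit by pushing prices back. The function name and all dollar figures are hypothetical.

```python
def manipulation_pays(external_gain, decision_shift, trading_loss):
    """True if betting against one's beliefs has positive expected value.

    external_gain:  payoff outside the market if the favored option wins
    decision_shift: increase in the chance of that option due to the bets
    trading_loss:   expected loss inside the market from the distorting bets
    """
    return decision_shift * external_gain > trading_loss

# Same bets, same expected loss, very different outside stakes.
wealthy = manipulation_pays(external_gain=10_000_000, decision_shift=0.05,
                            trading_loss=200_000)
small = manipulation_pays(external_gain=10_000, decision_shift=0.05,
                          trading_loss=200_000)
print(wealthy, small)
```

Under these assumed numbers the distortion only pays for the bettor with the large outside stake, which is the sense in which the problem is most acute for the wealthy.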
A related question: Elon Musk purchased Twitter and destroyed a lot of its financial value. Was this a rational decision on his part?
You cite Florida orange juice commodity futures, which improve on government weather forecasts, as an example of the predictive nature of the market system. Aren't there examples of betting through the system? Businesspeople put their money down to develop a cannabis farm, for instance, in light of pending policy changes around the legalization of marijuana. Couldn't the betting segments be isolated throughout the process of government action (which is never instantaneous) to evaluate or predict the positive and negative outcomes for both the populace and the supervising agency?