Dissing Track Records

Years ago I was surprised to learn that patients usually can’t pick docs based on track records of previous patient outcomes. Because, people say, that would invade privacy and make bad incentives for docs picking patients. They suggest instead relying on personal impressions, wait times, “bedside” manner, and prestige of doc med school or hospital. (Yeah, those couldn’t possibly make bad incentives.) Few ever study if such cues correlate with patient outcomes, and we actively prevent the collection of patient satisfaction track records.

For lawyers, most trials are in the public record, so privacy shouldn’t be an obstacle to getting track records. So people pick lawyers based on track records, right? Actually no. People who ask are repeatedly told: no, practically you can’t get lawyer track records, so just pick lawyers based on personal impressions or the prestige of their law firm or school. (Few study if those correlate with client outcomes.)

A new firm Premonition has been trying to change that:

Despite being public record, court data is surprisingly inaccessible in bulk, nor is there a unified system to access it, outside of the Federal Courts. Clerks of courts refused Premonition requests for case data. Resolved to go about it the hard way, Unwin … wrote a web crawler to mine courthouse web sites for the data, read it, then analyze it in a database. …

Many publications run “Top Lawyer” lists, people who are recognized by their peers as being “the best”. Premonition analyzed the win rates of these attorneys, it turned out most were average. The only way that they stood out was a disproportionate number of appealed and re-opened cases, i.e. they were good at dragging out litigation. They discovered that even the law firms themselves were poor at picking litigators. In a study of the United Kingdom Court of Appeals, it found a slight negative correlation of -0.1 between win rates and re-hiring rates, i.e. a barrister 20% better than their peers was actually 2% less likely to be re-hired! … Premonition was formed in March 2014 and expected to find a fertile market for their services amongst the big law firms. They found little appetite and much opposition. …

The system found an attorney with 22 straight wins before the judge – the next person down was 7. A bit of checking revealed the lawyer was actually a criminal defense specialist who operated out of a strip mall. … The firm claims such outliers are far from rare. Their web site … shows an example of an attorney with 32 straight wins before a judge in Orange County, Florida. (more)
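To make concrete what a track record built from public court data might contain, here is a minimal sketch. It is not Premonition’s method, which the quoted passage does not describe in detail. The `Case` record, the function names, and the sample data are all hypothetical, and the streak calculation assumes the cases are already listed in chronological order.

```python
from collections import defaultdict
from typing import NamedTuple


class Case(NamedTuple):
    lawyer: str
    judge: str
    won: bool  # True if this lawyer's side won (hypothetical field)


def win_rates(cases):
    """Fraction of cases won, per lawyer."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for c in cases:
        totals[c.lawyer] += 1
        wins[c.lawyer] += c.won  # bool counts as 0 or 1
    return {lawyer: wins[lawyer] / totals[lawyer] for lawyer in totals}


def longest_streaks(cases):
    """Longest run of consecutive wins for each (lawyer, judge) pair.

    Assumes `cases` is already in chronological order.
    """
    current = defaultdict(int)
    best = defaultdict(int)
    for c in cases:
        key = (c.lawyer, c.judge)
        current[key] = current[key] + 1 if c.won else 0
        best[key] = max(best[key], current[key])
    return dict(best)


# Made-up records, for illustration only.
cases = [
    Case("A", "J1", True),
    Case("A", "J1", True),
    Case("B", "J1", False),
    Case("A", "J1", False),
    Case("B", "J1", True),
    Case("A", "J2", True),
]

print(win_rates(cases))
print(longest_streaks(cases))
```

Even a summary this crude gives customers something to compare besides prestige. A real version would also have to adjust for case difficulty and for lawyers selecting easy cases, which is the incentive worry noted above for docs.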

As a society we supposedly coordinate in many ways to make medicine and law more effective, such as via funding med research, licensing professionals, and publishing legal precedents. Yet we don’t bother to coordinate to create track records for docs or lawyers, and in fact our public representatives tend to actively block such things. And strikingly: customers don’t much care. A politician who proposed to dump professional licensing would face outrage, and lose. A politician who proposed to post public track records would instead lose by being too boring.

On reflection, these examples are part of a larger pattern. For example, I’ve mentioned before that a media firm had a project to collect track records of media pundits, but then abandoned the project once it realized that this would reduce reader demand for pundits. Readers are instead told to pick pundits based on their wit, fame, and publication prestige. If readers really wanted pundit track records, some publication would offer them, but readers don’t much care.

Attempts to publish track records of school teachers based on student outcomes have produced mostly opposition. Parents are instead encouraged to rely on personal impressions and the prestige of where the person teaches or went to school. No one even considers doing this for college teachers; we at most just survey student satisfaction just after a class ends (and don’t even do that right).

Regarding student evaluations, we coordinate greatly to make standard widely accessible tests for deciding who to admit to schools. But we have almost no such measures of students when they leave school for work. Instead of showing employers a standard measure of what students have learned, we tell employers to rely on personal impressions and the prestige of the school from which the student came. Some have suggested making standard what-I-learned tests, but few are interested, including employers.

For researchers like myself, publications and job position are measures of endorsements by prestigious authorities. Citations are a better measure of the long term impact of research on intellectual progress, but citations get much less attention in evaluations of researchers. Academics don’t put their citation count on their vita (= resume), and when a reporter decides which researcher to call, or a department decides who to hire, they don’t look much at citations. (Yes, I look better by citations than by publications or jobs, and my prestige is based more on the latter.)

Related is the phenomenon of people being more interested in others said to have the potential to achieve X, than in people who have actually achieved X. Related also is the phenomenon of firms being reluctant to use formulaic measures of employee performance that aren’t mediated mostly by subjective boss evaluations.

It seems to me that there are striking common patterns here, and I have in mind a common explanation for them. But I’ll wait to explain that in my next post. Till then, how do you explain these patterns? And what other data do we have on how we treat track records elsewhere?

Added 22Mar: Real estate sales are also technically in the public record, and yet it is hard for customers to collect comparable sales track records for real estate agents, and few seem to care enough to ask for them.

Six Struggles Surrounding Status

Struggle For Function – These are struggles that individuals and organizations have to achieve non-status things. Keep a boat from sinking, don’t burn dinner, or have a wedding plan work out. Your status may depend on how well you do these things.

Struggle For Allies – You want to be liked by, and allied with, particular people that you encounter. Your status may help, but if you succeed they may well end up liking you much more than similar-status others.

Struggle For Status – Status is a widely shared estimate of esteem, within some status community. It is a weighted average of widely shared estimates of many admirable features. It combines direct power (dominance) and indirect power (prestige).

Struggle For Fashion – The weights of status vary with time, like fashions do. People form coalitions to push for more weight onto favorable features. In part via pushing to put their people into particular positions of power.

Struggle For Worlds – Status is usually not global, but instead relative to a community. These communities compete for influence in a wider world. People may care about how their local status struggles influence who wins there.

Struggle Over Struggles – All these struggles compete for the attention of individuals and organizations. Individuals, allies, coalitions, and communities can try to influence which struggles matter most.
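The description above of status as a weighted average, with weights that shift like fashions, can be illustrated with a toy sketch. The feature names, scores, and weights below are invented for illustration and are not estimates of anything. The point is only that the same people can swap ranks when the weights change.

```python
def status(features, weights):
    """Weighted average of feature scores; weights need not sum to 1."""
    total = sum(weights.values())
    return sum(weights[k] * features[k] for k in weights) / total


def ranking(people, weights):
    """Names sorted from highest to lowest status under the given weights."""
    return sorted(people, key=lambda name: status(people[name], weights), reverse=True)


# Hypothetical people and feature scores on a 0-to-1 scale.
people = {
    "artisan": {"function": 0.9, "prestige": 0.3},
    "courtier": {"function": 0.3, "prestige": 0.9},
}

# Two hypothetical "fashions": one weights prestige heavily, the other function.
old_fashion = {"function": 1.0, "prestige": 3.0}
new_fashion = {"function": 3.0, "prestige": 1.0}

print(ranking(people, old_fashion))
print(ranking(people, new_fashion))
```

Under the first set of weights the courtier ranks first, and under the second the artisan does, which is why coalitions have reason to fight over the weights.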

The struggles for status, and for the fashion that sets status, tend to be zero-sum, at least directly. But the struggle for function clearly allows for mutual gains and higher efficiency. To a lesser extent, so does the struggle for allies; in principle we really can all have more lovers, friends and co-workers.

So if status puts more weight on function and allies, that can give added encouragement to attend to those struggles. And when people in a community care more about the struggle for worlds, they will want to put more status weight on such things. Especially on the kinds of functions and allies that most help win struggles over worlds.

It is also possible for a community to put less weight on status. For example, when status is the only visible quality marker re lawyers and doctors, customers must use it to pick those experts. But if customers can see visible track records, or use strong incentive contracts to pay for results, their status matters less. That can help to promote such functions, and also help a community to win the struggle over worlds.

Personally, I’m most engaged by the struggle over struggles. I’d like function and allies to matter more. And by reminding people of this struggle over worlds, I hope to influence the struggle for fashion to put more status weight on function, allies, and worlds. Yes, if I personally did better at the struggle for status, I could have more influence over the struggle for fashion. But at this point in my life, the opportunity cost of that seems quite high.

So I’ll content myself for now to point all this out to you, my readers. And invite you to join me in pushing to make status matter less, and to put more weight on function and allies. Such as via more trials with, and fewer legal barriers to, using track records and incentive contracts to substitute for status in picking experts.

Science 2.0

Skepticism … is generally a questioning attitude or doubt towards one or more items of putative knowledge or belief or dogma. It is often directed at domains, such as the supernatural, morality (moral skepticism), theism (skepticism about the existence of God), or knowledge (skepticism about the possibility of knowledge, or of certainty). (More)

Humans have long had many possible sources for our beliefs about the physical world. These include intuitive folk physics, sacred scriptures, inherited traditions, traveler stories, drug-induced experiences, gadget sales pitches, and expert beliefs within various professions. And for a very long time, we paid the most attention to the highest status sources, even if they were less reliable. This encouraged gullibility; we often believed pretty crazy stuff, endorsed by the high status.

One ancient high status group was astronomers, whose status was high because their topic was high – the sky above. It so happened that astronomers naturally focused on a small number of very standard parameters of wide interest: the sky positions of planets and comets (anything that moved relative to the stars). Astronomers often gained status by being better able to predict these positions, and for this purpose they found it useful to: (1) collect and share careful records on past positions, (2) master sufficient math to precisely describe past patterns, and (3) use those patterns to predict future parameter values.

For a long time astronomy seemed quite exceptional. Most other domains of interest seemed to have too much fuzziness, change, and variety to support a similar approach. What can you usefully measure while walking through a jungle? What useful general patterns can simple math describe there? But slowly and painfully, humans learned to identify a few relatively stable focal parameters of wide interest in other domains as well. First in physics: velocity, weight, density, temperature, pressure, toughness, heat of reaction, etc. Then in dozens of practical domains.

With such standard focal parameters in hand, domain experts also gained status by being able to predict future parameter values. As a result, they also learned that it helped to carefully collect shared systematic data, and to master sufficient math to capture their patterns.

And thus was born the scientific revolution, which helped beget the industrial revolution. A measurement revolution starting in astronomy, moving to physics, and then invading dozens of industrial domains. As domains acquired better stable focal parameters to observe, and better predictions, many such domains acquired industrial power. That is, those who had mastered such things could create devices and plans of greater social value. This raised the status of such domain experts, so that eventually this “scientific” process acquired high status: carefully collecting stable focal parameters, systematically collecting and sharing data on them, and making math models to describe their patterns. “Science” was high status.

One way to think about all this is in terms of the rise of skepticism. If you allow yourself to doubt if you can believe what your sources tell you about the physical world, your main doubt will be “who can I trust?” To overcome such doubt, you’ll want to focus on a small number of focal parameters, and for those seek shared data and explicit math models. That is, data where everyone can check how the data is collected, or collect it themselves, with redundant records to protect against tampering, and explicit shared math models describing their patterns. That is, you will turn to the methods to which those astronomers first turned.

Which is all to say that the skeptics turned out to be right. Not the extreme skeptics who doubted their own eyes, but the more moderate ones, who doubted holy scriptures and inherited traditions. Our distant ancestors were wrong (factually, if not strategically) to too eagerly trust their high status sources, and skeptics were right to focus on the few sources that they could most trust, when inclined toward great doubt. Slow methodical collection and study of the sort of data of which skeptics could most approve turned out to be a big key to enabling humanity’s current levels of wealth and power.

For a while now, I’ve been exploring the following thesis: this same sort of skepticism, if extended to our social relations, can similarly allow a great extension of our “scientific” and “industrial” revolutions, making our social systems far more effective and efficient. Today, we mainly use prestige markers to select and reward the many agents who serve us, instead of more directly paying for results or following track records. If asked, many say we do this because we can’t measure results well. But as with the first scientific revolution, with work we can find ways to coordinate to measure more stable focal parameters, sufficient to let us pay for results. Let me explain.

In civilization, we don’t do everything for ourselves. We instead rely on a great many expert agents to advise us and act for us. Plumbers, cooks, bankers, fund managers, manufacturers, politicians, contractors, reporters, teachers, researchers, police, regulators, priests, doctors, lawyers, therapists, and so on. They all claim to work on our behalf. But if you will allow yourself to doubt such claims, you will find plenty of room for skepticism. Instead of being as useful as they can, why don’t they just do what is easy, or what benefits them?

We don’t pay experts like doctors or lawyers directly for results in improving our cases, and we don’t even know their track records in previous cases. But aside from a few “bad apples”, we are told that we can trust them. They are loyal to us, coming from our nation, city, neighborhood, ethnicity, gender, or political faction. Or they follow proper procedures, required by authorities.

Or, most important, they are prestigious. They went to respected schools, are affiliated with respected institutions, and satisfied demanding licensing criteria. Gossip shows us that others choose and respect them. If they misbehave then we can sue them, or regulators may punish them. (Though such events are rare.) What more could we want?

But of course prestige doesn’t obviously induce a lawyer to win our case or promote justice, nor a doctor to make us well. Or a reporter to tell us the truth. Yes, it is logically possible that selecting them on prestige happens to also max gains for us. But we rarely hear any supporting argument for such common but remarkable claims; we are just supposed to accept them because, well, prestigious people say so.

Just as our distant ancestors were too gullible (factually, if not strategically) about their sources of knowledge on the physical world around them, we today are too gullible on how much we can trust the many experts on which we rely. Oh we are quite capable of skepticism about our rivals, such as rival governments and their laws and officials. Or rival professions and their experts. Or rival suppliers within our profession. But without such rivalry, we revert to gullibility, at least regarding “our” prestigious experts who follow proper procedures.

Yes, it will take work to develop better ways to measure results, and to collect track records. (And supporting math.) But progress here also requires removing many legal obstacles. For example, trial lawyers all win or lose in public proceedings, records of which are public. Yet it is very hard to actually collect such records into a shared database; many sit in filing cabinets in dusty county courthouse basements.

Contingency fees are a way to pay lawyers for results, but they are illegal in many places. Bounty hunters are paid for results in catching fugitives, but are illegal in many places. Bail bonds give results incentives to those who choose jail versus freedom, but they are being made illegal now. And so on. Similarly, medical records are more often stored electronically, but medical ethics rules make it very hard to aggregate them, and also to use creative ways to pay doctors based on results.

I’ve written many posts on how we could work to pay more for results, and choose more based on track records. And I plan to write more. But in this post I wanted to make the key point that what should drive us in this direction is skepticism about how well we can trust our usual experts, chosen mainly for their prestige (and loyalty and procedures) and using weak payment incentives. You might feel embarrassed by such skepticism, thinking it shows you to be low status and anti-social. After all, don’t all the friendly high status popular people trust their experts?

But the ancient skeptics were right about distrusting their sources on the physical world, and following their inclination helped to create science and industry, and our vast wealth today. Continuing to follow skeptical intuitions, this time regarding our expert agents, may allow us to create and maintain far better systems of law, medicine, governance, and much more. Onward, to Science 2.0!

Sycophantry Masquerading As Bargains

The Catholic Church used to sell “indulgences”; you gave them cash and they gave you the assurance that God would let you sin without punishment. If you are at all suspicious about whether this church can actually deliver on their claim, this seems a bad deal. You give them something tangible and clearly valuable, and they give you a vague promise on something you can’t see, and can’t even check if anyone has ever received.

We make similar bad “bargains” with a few kinds of workers, to whom we grant extraordinary privileges of “self-regulation.” That is, we let certain “professionals” run their own organizations which tell us how their job is to be done, and who can do it. In some areas, such as with doctors, these judgements are enforced by law: you can only buy medical services approved by doctors, and can only buy such services from those whom the official medical organizations label “doctors.” In other areas, such as with academics, these judgements are more enforced by our strong eagerness to associate with high prestige professionals: most everyone just accepts the word of key academic organizations on who is a good academic.

There is a literature which frames this as a “grand bargain”. The philosopher Donald Schön says:

In return for access to their extraordinary knowledge in matters of great human importance, society has granted them [professionals] a mandate for social control in their fields of specialization, a high degree of autonomy in their practice, and a license to determine who shall assume the mantle of professional authority.

In their book The Future of the Professions: How Technology Will Transform the Work of Human Experts, Richard and Daniel Susskind elaborate:

In acknowledgement of and in return for their expertise, experience, and judgement, which they are expected to apply in delivering affordable, accessible, up-to-date, reassuring, and reliable services, and on the understanding that they will curate and update their knowledge and methods, train their members, set and enforce standards for the quality of their work, and that they will only admit appropriately qualified individuals into their ranks, and that they will always act honestly, in good faith, putting the interests of clients ahead of their own, we (society) place our trust in the professions in granting them exclusivity over a wide range of socially significant services and activities, by paying them a fair wage, by conferring upon them independence, autonomy, rights of self-determination, and by according them respect and status.

Notice how in this supposed bargain, what we give the professionals is concrete and clearly valuable, while what they give us (over what we’d get without the deal) is vague and very hard for us to check. Like an indulgence. The Susskinds claim that while this bargain has been a good deal so far, we will soon cancel it:

We predict that increasingly capable machines, operating on their own or with non-specialist users, will take on many of the tasks that have been the historic preserve of the professions. We anticipate an ‘incremental transformation’ in the way that we produce and distribute expertise in society. This will lead eventually to a dismantling of the traditional professions.

This seems seriously mistaken to me. There is actually no bargain, there is just the rest of us submitting to professionals’ prestige. Cheaper yet outcome-effective substitutes to expensive professionals have long been physically available, and yet we have mostly not chosen those substitutes due to our eagerness to affiliate with prestigious professionals. We don’t choose nurses who can do primary care as well as doctors, and we don’t watch videos of the best professors from which we could learn as much as from attending typical lectures in person. And we aren’t interested in outcome track records for our lawyers. The existence of even more such future substitutes won’t change this situation much.

Bowing To Elites

Imagine that you are a politically savvy forager in a band of size thirty, or a politically savvy farmer near a village of size a thousand. You have some big decisions to make, including who to put in various roles, such as son-in-law, co-hunter, employer, renter, cobbler, or healer. Many people may see your choices. How should you decide?

Well first you meet potential candidates in person and see how much you intuitively respect them, get along with them, and can agree on relative status. It isn’t enough for you to have seen their handiwork, you want to make an ally out of these associates, and that won’t work without respect, chemistry, and peace. Second, you see what your closest allies think of candidates. You want to be allies together, so it is best if they also respect and get along with your new allies.

Third, if there is a strong leader in your world, you want to know what that leader thinks. Even if this leader says explicitly that you can do anything you like and that they don’t care, if you get any hint whatsoever that they do care, you’ll look closely to infer their preferences. And you’ll avoid doing anything they’d dislike too much, unless your alliance is ready to mount an overt challenge.

Fourth, even if there is no strong leader, there may be a dominant coalition encompassing your band or town. This is a group of people who tend to support each other, get deference from others, and win in conflicts. We call these people “elites.” If your world has elites, you’ll want to treat their shared opinions like those of a strong leader. If elites would gossip disapproval of a choice, maybe you don’t want it.

What if someone sets up objective metrics to rate people in suitability for the roles you are choosing? Say an archery contest for picking hunters, or a cobbler contest to pick cobblers. Or public track records of how often healer patients die, or how long cobbler shoes last. Should you let it be known that such metrics weigh heavily in your choices?

You’ll first want to see what your elites or leader think of these metrics. If they are enthusiastic, then great, use them. And if elites strongly oppose, you’d best only use them when elites can’t see. But what if elites say, “Yeah you could use those metrics, but watch out because they can be misleading and make perverse incentives, and don’t forget that we elites have set up this whole other helpful process for rating people in such roles.”

Well in this case you should worry that elites are jealous of this alternative metric displacing their advice. They like the power and rents that come from advising on who to pick for what. So elites may undermine this metric, and punish those who use it.

When elites advise people on who to pick for what, they will favor candidates who seem loyal to elites, and punish those who seem disloyal, or who aren’t sufficiently deferential. But since most candidates are respectful enough, elites often pick those they think will actually do well in the role. All else equal, that will make them look good, and help their society. While their first priority is loyalty, looking good is often a close second.

Since humans evolved to be unconscious political savants, this is my basic model to explain the many puzzles I listed in my last post. When choosing lawyers, doctors, real estate agents, pundits, teachers, and more, elites put many obstacles in the way of objective metrics like track records, contests, or prediction markets. Elites instead suggest picking via personal impressions, personal recommendations, and school and institution prestige. We ordinary people mostly follow this elite advice. We don’t seek objective metrics, and instead use elite endorsements, such as the prestige of where someone went to school or now works. In general we favor those who elites say have the potential to do X, over those who actually did X.

This all pushes me to more favor two hypotheses:

  1. We choose people for roles mostly via evolved mental modules designed mainly to do well at coalition politics. The resulting system does often pick people roughly well for their roles, but more as a side than a direct effect.
  2. In our society, academia reigns as a high elite, especially on advice for who to put in what roles. When ordinary people see another institution framed as competing directly with academia, that other institution loses. Pretty much all prestigious institutions in our society are seen as allied with academia, not as competing with it. Even religions, often disapproved by academics, rely on academic seminary degrees, and strongly push kids to gain academic prestige.

We like to see ourselves as egalitarian, resisting any overt dominance by our supposed betters. But in fact, unconsciously, we have elites and we bow to them. We give lip service to rebelling against them, and they pretend to be beaten back. But in fact we constantly watch out for any actions of ours that might seem to threaten elites, and we avoid them like the plague. Which explains our instinctive aversion to objective metrics in people choice, when such metrics compete with elite advice.

Added 8am: I’m talking here about how we intuitively react to the possibility of elite disapproval; I’m not talking about how elites actually react. Also, our intuitive reluctance to embrace track records isn’t strong enough to prevent us from telling specific stories about our specific achievements. Stories are way too big in our lives for that. We already have norms against bragging, and yet we still manage to make ourselves look good in stories.

