Reward or Punish?

May 31, 2013

Many reality TV shows, like Project Runway, Hell’s Kitchen, or Survivor, focus on punishing the worst, instead of rewarding the best. Not only do viewers seem to find that more interesting, it actually works better to incentivize performance (many quotes below). Punishment works better to encourage lone behavior, to encourage behavior in a group, and as a tool for letting some group members encourage others.

The puzzle is that in most of our social worlds we instead focus on rewarding the best, not punishing the worst. If you search for “punish reward” you will mostly find the issue raised about how to treat kids; they are the main group on whom we are willing to use punishment flexibly. And this even though young kids are the main exception: for them punishment works worse. For adults, we tend to limit punishment’s use to extreme behavior that we all strongly agree is bad, like crime. And when you ask adults, they much prefer to be part of a group that uses rewards, not punishment.

As a college teacher, I expect that I’d get more effort from most students by regularly pointing out the worst student in the class than the best. But I also expect students to hate it and give me low evaluations. Similarly, I expect that if I wrote the occasional post criticizing a bad blog commenter here, instead of praising a good one, I’d get more change in commenting behavior. But I also expect that person to complain long and loud about how I was biased and unfair, and others to come to their defense. I expect a lot less complaining about bias in picking the best.

In both the class and comment cases, I expect people to see me as mean and cruel for punishing the worst, but kind and generous for rewarding the best. This even though all of these effects are relative – punishment would raise the rest of the class, or the rest of the commenters, up above the worst.

Note that rewarding the best is in practice more elitist than punishing the worst; punishing creates an underclass, not an overclass. And in fact our hyper-egalitarian forager ancestors were quite reluctant to overtly reward or praise; they focused their social coordination on having the group punish norm violators. Our hypersensitivity to being punished, and our elaborate instinctual strategies to give excuses and to coordinate to retaliate against any who might suggest we should be punished, are probably human adaptations to that forager history. And they make us especially unwilling to accept punishment by an authority, instead of by the informal consensus of the group.

This seems an interesting example of our seeking to avoid aspects of the forager way of life. Our forager-evolved aversion to being singled out for social shame is so strong that we’d rather create elites instead. At least this applies when we are relatively rich and comfortable. If we really feared being destroyed for lack of sufficient effort, as farmers often did, we’d probably be a lot more eager to raise overall effort by punishing the worst. I suspect that foragers themselves didn’t punish much in good times; punishment was invoked more, and mattered more, in hard times. In good times foragers were probably more tolerant of praising some as better, and of weak forms of bragging.

In a more competitive future, with organizations and individuals that compete harder to survive, I’d expect more use of punishment, in addition to reward.

Today if you have a group that really needs to succeed, and to induce strong efforts all around, consider paying the social disruption costs of punishing the worst, instead of rewarding the best. You will probably get more effort that way, even if people end up hating you and calling you evil for it. And if your group doesn’t punish and fails, know that your reluctance to punish was probably a contributing factor.

Those promised quotes:

A study of 150 public-school teachers … split the teachers into two groups and told both that their bonuses would be linked to student test scores. Teachers in the first would receive a bonus at the end of the year if student test scores improved. Members of the second group received a check for $4,000 in September and agreed to return the money if test scores failed to rise by June. Loss aversion worked: Teachers who faced the threat of having to refund their bonuses produced student test scores that were about 7 percentage points higher. (more)

Performance improved substantially more when the feedback was positive in the case of the younger children, telling them they did well when they did, rather than negative, telling them that they did poorly when they did. Just the opposite proved true in the case of older children, who functioned just like young adults aged 18-25 who were also tested. That is, negative feedback improved performance more for these individuals than did positive feedback. (more)

In a laboratory experiment, we have implemented three main incentive mechanisms: reward-only, punishment-only, and reward and punishment. We have also varied the size of the tournament. Although the baseline model predicts that employee effort should be the same in all treatments, our empirical results have indicated that this is not the case. In general, no mechanism generates higher effort levels from the agents than the one which combines reward and punishment. We have also found that punishment produces similar results to the combined mechanism in tournaments of a relatively small size (three participants), while the reward-only and punishment-only mechanisms are equivalent in terms of effort in tournaments of a relatively large size (six participants). … Social dilemma … experiments have [also] found that punishment is generally more effective than rewards in terms of promoting cooperation between subjects in voluntary contribution games. (more)

It’s much easier to change an established behavior by offering rewards, rather than threatening with punishments. … But once you have established a norm, sustaining it by the threat of punishment is cheap. Only a few people will violate the norm, so you will rarely have to follow through with your threat. (more)

[We] extend the standard public goods game in a variety of ways, in particular by allowing for endogenous preference over institution, … [We] elicit preferences in an incentive-compatible manner over voluntary contribution mechanisms with and without reward and punishment options. Finally, [we] randomly assign subjects to one of the four institutions and observe repeated play. [We find] that payoffs are significantly greater when punishment is allowed but that only a small minority of participants prefers such an environment. (more)
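To make the loss-aversion logic of the teacher-bonus quote concrete, here is a minimal sketch. It is my own illustration, not the study's model: the function names are hypothetical, and the loss weight of 2.0 is an assumed round number, not something the study reports. Only the $4,000 figure comes from the quote. The point is that both contracts leave a teacher with the same money in every outcome, yet the felt gap between success and failure is larger when failure is framed as giving money back.

```python
BONUS = 4000  # dollars; the only number taken from the quoted study


def money_outcome(scores_rose: bool) -> int:
    """Net dollars kept at year end. Identical under both contracts."""
    return BONUS if scores_rose else 0


def felt_value(change: float, loss_weight: float = 2.0) -> float:
    """Reference-dependent value of a change in wealth.

    Gains count at face value; losses are scaled up by loss_weight
    (an assumed illustrative parameter, not an estimate from the study).
    """
    return change if change >= 0 else loss_weight * change


def felt_stakes(contract: str, loss_weight: float = 2.0) -> float:
    """Felt difference between succeeding and failing under a framing."""
    if contract == "bonus_at_end":
        # Reference point: no bonus yet. Success is a gain, failure is no change.
        success, failure = BONUS, 0
    elif contract == "advance_with_clawback":
        # Reference point: bonus already in hand. Success is no change,
        # failure is a loss of the full bonus.
        success, failure = 0, -BONUS
    else:
        raise ValueError(f"unknown contract: {contract}")
    return felt_value(success, loss_weight) - felt_value(failure, loss_weight)


if __name__ == "__main__":
    for contract in ("bonus_at_end", "advance_with_clawback"):
        print(contract, felt_stakes(contract))
```

Under these assumptions the clawback framing doubles the felt stakes while the dollar stakes stay the same, which is the direction of the effect the quote describes. The actual size of the effect depends on how loss averse people really are.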

29 Comments

Overcoming Bias Commenter

May 15, 2023

I believe rewarding the best is a modern concept relative to punishing the worst. An old timer once believed in "punishing the worst" because people shouldn't be rewarded for what is already expected. Rewards should only be given in highly exceptional circumstances.

I'll remark that this is obviously dumb from a statistical learning perspective; someone who comes in 9th out of 10 when the 10th-place person is a previous high performer and the whole team is stacked with high performers is doing pretty well. SVD with a single factor would catch this!
