Reward or Punish?

Many reality TV shows, like Project Runway, Hell’s Kitchen, or Survivor, focus on punishing the worst, instead of rewarding the best. Not only do viewers seem to find that more interesting, it actually works better to incentivize performance (many quotes below). Punishment works better to encourage lone behavior, to encourage behavior in a group, and as a tool for letting some group members encourage others.

The puzzle is that in most of our social worlds we instead focus on rewarding the best, not punishing the worst. If you search for “punish reward” you will mostly find the issue raised about how to treat kids; we are mainly willing to use punishment flexibly on them. And this when young kids are the main exception – for them punishment works worse. For adults, we tend to limit punishment’s use to extreme behavior that we all strongly agree is bad, like crime. And when you ask adults, they much prefer to be part of a group that uses rewards, not punishment.

As a college teacher, I expect that I’d get more effort from most students by regularly pointing out the worst student in the class than the best. But I also expect students to hate it and give me low evaluations. Similarly, I expect that if I wrote the occasional post criticizing a bad blog commenter here, instead of praising a good one, I’d get more change in commenting behavior. But I also expect that person to complain long and loud about how I was biased and unfair, and others to come to their defense. I expect a lot less complaining about bias in picking the best.

In both the class and comment cases, I expect people to see me as mean and cruel for punishing the worst, but kind and generous for rewarding the best. This even though all of these effects are relative – punishment would raise the rest of the class, or the rest of the commenters, up above the worst.

Note that rewarding the best is in practice more elitist than punishing the worst; punishing creates an underclass, not an overclass. And in fact our hyper-egalitarian forager ancestors were quite reluctant to overtly reward or praise; they focused their social coordination on having the group punish norm violators. Our hypersensitivity to being punished, and our elaborate instinctual strategies to give excuses and to coordinate to retaliate against any who might suggest we should be punished, are probably human adaptations to that forager history. And they make us especially unwilling to accept punishment by an authority, instead of by the informal consensus of the group.

This seems an interesting example of our seeking to avoid aspects of the forager way of life. Our forager-evolved aversion to being singled out for social shame is so strong that we’d rather create elites instead. At least this applies when we are relatively rich and comfortable. If we really feared being destroyed for lack of sufficient efforts, as farmers often did, we’d probably be a lot more eager to raise overall efforts by punishing the worst. I suspect that foragers themselves didn’t punish much in good times; punishment was invoked more, and mattered more, in hard times. In good times foragers were probably more tolerant of praising some as better, and of weak forms of bragging.

In a more competitive future, with organizations and individuals that compete harder to survive, I’d expect more use of punishment, in addition to reward.

Today if you have a group that really needs to succeed, and to induce strong efforts all around, consider paying the social disruption costs of punishing the worst, instead of rewarding the best. You will probably get more effort that way, even if people end up hating you and calling you evil for it. And if your group doesn’t punish and fails, know that your reluctance to punish was probably a contributing factor.

Those promised quotes:

A study of 150 public-school teachers … split the teachers into two groups and told both that their bonuses would be linked to student test scores. Teachers in the first would receive a bonus at the end of the year if student test scores improved. Members of the second group received a check for $4,000 in September and agreed to return the money if test scores failed to rise by June. Loss aversion worked: Teachers who faced the threat of having to refund their bonuses produced student test scores that were about 7 percentage points higher. (more)

Performance improved substantially more when the feedback was positive in the case of the younger children, telling them they did well when they did, rather than negative, telling them that they did poorly when they did. Just the opposite proved true in the case of older children, who functioned just like young adults aged 18-25 who were also tested. That is, negative feedback improved performance more for these individuals than did positive feedback. (more)

In a laboratory experiment, we have implemented three main incentive mechanisms: reward-only, punishment-only, and reward and punishment. We have also varied the size of the tournament. Although the baseline model predicts that employee effort should be the same in all treatments, our empirical results have indicated that this is not the case. In general, no mechanism generates higher effort levels from the agents than the one which combines reward and punishment. We have also found that punishment produces similar results to the combined mechanism in tournaments of a relatively small size (three participants), while the reward-only and punishment-only mechanisms are equivalent in terms of effort in tournaments of a relatively large size (six participants). … Social dilemma … experiments have [also] found that punishment is generally more effective than rewards in terms of promoting cooperation between subjects in voluntary contribution games. (more)

It’s much easier to change an established behavior by offering rewards, rather than threatening with punishments. … But once you have established a norm, sustaining it by the threat of punishment is cheap. Only a few people will violate the norm, so you will rarely have to follow through with your threat. (more)

[We] extend the standard public goods game in a variety of ways, in particular by allowing for endogenous preference over institution, … [We] elicit preferences in an incentive-compatible manner over voluntary contribution mechanisms with and without reward and punishment options. Finally, [we] randomly assign subjects to one of the four institutions and observe repeated play. [We] find that payoffs are significantly greater when punishment is allowed but that only a small minority of participants prefers such an environment. (more)

  • PhilBowermaster

    As a management strategy I would expect this to work quite well for driving performance in a one-off project or any kind of temporary situation — which is what reality shows and lab experiments equate to — but having your employees hate you and think you’re evil will take its toll over time. If nothing else, the high performers will probably go elsewhere seeking positive incentives for performance, which will still be available in most other employment situations (unless everybody starts punishing at the same time.) Then you’re stuck with low performers who hate you and who are therefore already probably pretty disengaged; not exactly a recipe for success.

    I wonder whether this dynamic might have played a role in the nearly exclusive focus on reward that we see in the workplace today? One shop started rewarding top performers, which enabled them to attract top performers from other shops, and eventually everyone else had to follow suit.

    • lemmycaution

      The companies that do this drive their employees nuts. It apparently did serious damage to Microsoft:

      At the center of the cultural problems was a management system called “stack ranking.” Every current and former Microsoft employee I interviewed—every one—cited stack ranking as the most destructive process inside of Microsoft, something that drove out untold numbers of employees. The system—also referred to as “the performance model,” “the bell curve,” or just “the employee review”—has, with certain variations over the years, worked like this: every unit was forced to declare a certain percentage of employees as top performers, then good performers, then average, then below average, then poor.

      “If you were on a team of 10 people, you walked in the first day knowing that, no matter how good everyone was, two people were going to get a great review, seven were going to get mediocre reviews, and one was going to get a terrible review,” said a former software developer. “It leads to employees focusing on competing with each other rather than competing with other companies.”

      • Perhaps the problem is that people are not measured by an independent norm (which they could all pass), but against each other (so by definition someone always fails). Also it creates bad incentives — you don’t want to work with the best people, because it will make *you* the worst one in the team (and I guess the punishment for being the worst in the team was greater than the reward for being a member of a good team).

        I would call it a misunderstanding and a bad implementation of the rule. Punishing people who violate tribe norms can result in a happy equilibrium where everyone follows the norms. Cooperation in a Prisoner’s Dilemma! But there is no such happy solution at Microsoft… and the people are well aware of it.

      • Thariinye

        However, Robin’s post is indeed about ‘punishing the worst.’ That’s not ‘punish the bad,’ but the worst. That is exactly Stack Ranking.

        It comes about because there isn’t really any objective measure that can easily be used for most of these things, that actually gets at what should be done. ‘Objective’ Teacher evaluations are generally shit (not measuring teaching and pressuring towards test-taking), and the same goes for evaluations of creativity etc. in a company, especially when it’s team-based work. Thus, we get subjective rankings, which have the various problems that Stack Ranking had.

        Stack Ranking and various ‘punish the worst’ evaluation mechanisms are terrible for morale and overall effectiveness; however, they’re great entertainment. You get to see fear, backbiting, betrayal, backstabbing, naked ambition, and all these other things that make for interesting stories. Reality TV shows use mechanisms similar to Stack Ranking precisely because they’re completely terrible at encouraging people to get things done in a reasonable and cooperative manner. Cooperation is bad TV.

      • Reward the best has exactly the same problem of only looking at relative performance. But people like it a lot more.

      • Eliezer Yudkowsky

        I’ll remark that this is obviously dumb from a statistical learning perspective; someone who comes in 9th of 10 when the 10th-place person is a previous high performer and the whole team is stacked with high performers, is doing pretty well. SVD with a single factor would catch this!


    Couple of things:

    1) when you reward the best no one really loses: there is a winner and everyone else who are united in their averageness, no one is downtrodden. When you punish the worst he will be downtrodden, unless you repeat the exercise so often that punishment becomes something most people have gone through, but that’s clearly not the case for job promotions, reality shows or student grades

    2) punishing the worst may simply create a culture where people are trying to be not the worst, try to stay just above a certain baseline and put in no more effort than required to stay there

    3) fear can motivate people to work harder, but that often means working dumber as well

    4) I think the best system is one where the rewards aren’t too big and are spread out: reward the top 3, not just the best. Otherwise people will feel a reward is impossible to reach and revert to what I wrote in 2), especially if it’s a process with a low number of repetitions (and therefore large variance, which means results will be more down to chance and less to ability). The punishments should not be too big either (unless someone maliciously screwed up) and should also be spread over more than one person at the bottom; punishments should be forgone entirely if the number of repetitions is extremely low.

    • IMASBA

      Why is a downtrodden “underclass” a problem, you may ask? It may not be a problem for your project right now, but you are externalizing a lot of anger and resentment. If some unreasonably harsh punishment system leaves people unemployable for making one little mistake, then the rest of society has to pay for that. In addition, people can develop mental problems because of excessive punishment, and everyone working on your projects is in increased danger of burnout and stress-related problems, especially if you make a habit of labelling your projects as important enough to justify the social disruption cost. That habit will be encouraged by your superiors, who expect you to deliver the same result you delivered the first time (they may fire you if you try to explain your team can’t repeat that result every time; getting fired means you pay the price, while if you instead keep your mouth shut it is your employees who pay the price).

    • Stephen Diamond

      1. I don’t understand why no one loses. Everyone but the winner loses status, which is a positional good. Either way, people lose; like Hanson says, one way creates an underclass and the other an overclass.

      2. If everyone is competing not to be the worst, you can’t rest on your laurels. I absolutely refuse to view a single reality tv show, but I hear enough about it to get the idea.

      3. I think your third point is correct and of key importance; Hanson and his sources (which I haven’t fully read) seem to miss the point. It’s a hoary law of psychology that the optimal level of arousal depends on the complexity of the task. Punishment is too arousing for intellectual tasks.

      4. The optimal system of reward and punishment (from the point of view of the administrator) must depend not only on the objective (as Robin suggests) but also on the nature of the task: thus with the development of technology, reward becomes more effective for many tasks. The next step is to gradually eliminate material rewards and punishments altogether, as they detract from high-level performance and distort agents’ goals. (See Belief–opinion confusion and the contradictions of capitalist investment markets: Fictional-market socialism.)

      • IMASBA

        “Everyone but the winner loses status, which is a positional good.”

        Everyone but the winner(s) loses some theoretical quantity, but this isn’t money we’re talking about, shared status “feels” better. This is why people bond in bootcamp. It doesn’t feel bad to be “average” or a winner, it does feel bad to be a loser.

      • IMASBA

        To further explain my first point: if no one is punished the pain is spread out and may even cease to be pain because finding yourself among peers is pleasant. If one person is punished that person gets destroyed (lose job, become unemployable, etc…). People in a system without punishment also do not have to fear the crippling damage you get from being singled out for punishment. I think this is similar to how there’s the same average status in a society with slavery as there is in one without slavery, but people prefer living in a society without slavery.

        “But you can rest on your laurels if the task is complex and you’re really smart. Perhaps another reason technological development disfavors punishing poor performance.”

        Exactly, plus you can coordinate so the whole group has to put in minimal effort. The advantage of this is obvious, and it will only go wrong when a significant number of people violate the agreement.

  • Ted Sanders

    If you eliminate the worst contestant each week, the show will last 16 weeks.

    If you reward the best contestant each week, the show will last 1 week.

    I think this argument explains why TV shows focus on punishing/eliminating players.

    • IMASBA

      “If you reward the best contestant each week, the show will last 1 week.”

      No (and this is a crucial point in general), one week/round is not enough to establish who’s really the best; only after sufficient repetition will you find out who has beginner’s luck and who is the best.

      Tying huge rewards or punishments to low-occurrence events rewards/punishes people who really don’t deserve it because there have been insufficient repetitions to determine who’s the best/worst. People intuitively pick up on this and lose faith in the system, reducing their performance.
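
The point about insufficient repetitions can be illustrated with a small simulation. This is only a sketch under made-up assumptions, not anything taken from the post or its sources: ten hypothetical contestants with evenly spaced skill levels, each round's score equal to skill plus standard normal noise, and the winner being whoever has the highest total. The function name `best_wins_rate` and all of the numbers are illustrative.

```python
import random


def best_wins_rate(n_contestants=10, rounds=1, trials=2000, seed=0):
    """Fraction of simulated contests won by the truly most skilled contestant.

    All parameters are hypothetical: skills are evenly spaced and each
    round's score is skill plus Gaussian noise with standard deviation 1.
    """
    rng = random.Random(seed)
    # Contestant n_contestants - 1 has the highest true skill.
    skills = [0.5 * i for i in range(n_contestants)]
    wins = 0
    for _ in range(trials):
        # Total score over all rounds for each contestant.
        totals = [
            sum(skill + rng.gauss(0.0, 1.0) for _ in range(rounds))
            for skill in skills
        ]
        winner = max(range(n_contestants), key=totals.__getitem__)
        if winner == n_contestants - 1:
            wins += 1
    return wins / trials


one_round = best_wins_rate(rounds=1)
fifteen_rounds = best_wins_rate(rounds=15)
print(f"truly best contestant wins after 1 round:   {one_round:.0%}")
print(f"truly best contestant wins after 15 rounds: {fifteen_rounds:.0%}")
```

Under these assumptions a single round often crowns someone other than the most skilled contestant, while summing over many rounds picks out the best far more reliably. The same logic applies in reverse to singling out the "worst" after only one round.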

      • Ted Sanders

        Good point. Nonetheless, a best of 15 tournament will also have an unpredictable length. It’s highly unlikely that the 15th episode will determine the winner. So I think TV networks would still prefer the punish-the-loser model.

      • IMASBA

        Yes, it gives them better ratings, but it’s not really a nice or honest system. It’s made even worse by the fact that they only allow one winner (even if the runner-up is much better than last year’s winner). This reminds me of how Charlie Chaplin himself once lost a Charlie Chaplin look-alike contest.

      • There’s a bit more to the Charlie Chaplin story… the contest was to imitate Chaplin’s famous “Tramp” character, and, unlike the winner, Chaplin didn’t have a costume.

      • IMASBA

        Surely the inability of the jury (all fans I presume) to recognize Charlie Chaplin or acknowledge his perfect mannerisms and appearance is a blow to the idea of unique superstars, or at least our ability to recognize such talents and that is a warning to not award too much to a single champion or small group of champions. Ayn Rand’s “producers” are a myth, yet capitalism is largely based on a similar idea…

  • Doug

    At a hedge fund I worked at a few years ago there was a policy of firing the bottom 10% of employees every year. The unofficial motto was “We’re the best, because we fire the worst.”

    Predictably the firm garnered an outside reputation for being an extremely competitive, cutthroat, unpleasant environment.

    • I can’t resist linking to a memorable scene in the movie Glengarry Glen Ross, where the character played by Alec Baldwin announces the company’s decision to fire the seller with the fewest sales.

  • Jonathan Graehl

    I’m not sure why you included “have your bonus now but we’ll claw it back if you don’t meet targets”. Because a loss feels like a punishment (and that’s why a loss feels 2x worse than a gain feels good)? Or the converse?

  • Christian Kleineidam

    I think most online forums are moderated in a way where the worst posters get banned and there isn’t that much to reward the good posters.

    • James Oswald

      On my blog, I reward good commenters with replies, and delete the terrible ones, so it’s two sided.

  • Alpheus

    Based on the comments, it seems as if it might be helpful to distinguish explicitly between punishments involving a high degree of shame (e.g., being the lone contestant ejected from a reality show) and punishment where the role of shame isn’t so great (e.g., having to return a bonus or being denied a possible raise). I suspect most of the bad effects of punishment as a motivator/management technique can be greatly reduced if shame isn’t a large part of the punishment.

  • mugasofer

    On the other hand, as long as people know they’re comfortably not the worst, there’s much less incentive to excel. Presumably this is usually outweighed by the fact that punishing is cheaper for the same effect, but it’s worth noting.

  • My thoughts after a really quick look at the quotes and their origins:

    First study was looking at whether knowing that you will lose something that you have or knowing that you might potentially win something works better. This is far far different than looking at whether being rewarded or being punished produces better results in subsequent behavior (which is what the post argues for). You can see the difference I assume?

    After looking at the second study it seems like they found that negative feedback starts to activate the brain more as you get older (with a similar strength as positive feedback) but did not find that negative feedback produced better results as the article that you quote claims.

    Third study doesn’t argue your point. I guess the closest thing is ‘Social dilemma … experiments have [also] found that punishment is generally more effective than rewards in terms of promoting cooperation between subjects in voluntary contribution games.’ This is the only condition under which arguably punishment has real benefits (according to the study) over rewards and forgive me but are you really talking about ‘ promoting cooperation between subjects in voluntary contribution games’? Seems to me that the situations where you argue that punishment is better than reward are not of this nature nor is this the benefit that you are looking for.

    4th study hasn’t convinced me either, but I cannot be bothered to search for all the studies (since you’ve linked to articles describing them instead of linking to the papers) so I will accept it. However, again the study’s findings are not fully applicable to the situations you are describing.

    The 5th study only barely argues your point as well.

  • James Oswald

    The additional marginal product of effort is FAR higher on the highest performing employees. Ideally, you’d want to incentivize each person to work hard, but let’s suppose for some reason you can either reward at the top, or punish at the bottom. Rewarding at the top encourages the top 10% or so to try to be in the top 1%, and punishing at the bottom encourages the bottom 10% not to be the bottom 1%. In the first case, the total gain to productivity for the firm is very high, but in the latter case, the gain is negligible. You don’t even need morale in the model.

    • IMASBA

      “The additional marginal product of effort is FAR higher on the highest performing employees.”

      That really depends on the specifics.

      • James Oswald

        I agree, but in my experience it’s quite common.

  • Gil

    I believe rewarding the best is a modern concept relative to punishing the worst. An old timer once believed in “punishing the worst” because people shouldn’t be rewarded for what is already expected. Rewards should only be given in highly exceptional circumstances.
