# Tag Archives: Math

Practical men, who believe themselves to be quite exempt from any intellectual influences, are usually slaves of some defunct economist. (Keynes)

Many have recently said 1) US industries have become more concentrated lately, 2) this is a bad thing, and 3) inadequate antitrust enforcement is in part to blame. (See many related MR posts.)

I’m teaching grad Industrial Organization again this fall, and in that class I go through many standard simple (game-theoretic) math models of firms competing within industries. And it occurs to me to mention that when these models allow “free entry”, i.e., when the number of firms is set by the constraint that they must all expect to make non-negative profits, then such models consistently predict that too many firms enter, not too few. These models suggest that we should worry more about insufficient, not excess, concentration.

Two examples:

• “Cournot” Quantity Competition: Firms pay (the same) fixed cost to enter an industry, and (the same) constant marginal cost to make products there. Knowing the number of firms, each firm simultaneously picks the quantity it will produce. The sum of these quantities is intersected with a linear demand curve to set the price they will all be paid for their products.
• “Circular City” Differentiated Products: Customers are uniformly distributed, and firms are equally spaced, around a circle. Firms pay (the same) fixed cost to enter, and (the same) constant marginal cost to serve each customer. Each firm simultaneously sets its price, and then each customer chooses the firm from which it will buy one unit. This customer must pay not only that firm’s price, but also a “delivery cost” proportional to its distance to that firm.
• [I also give a Multi-Monopoly example in my next post.]

In both of these cases, when non-negative profit is used to set the number of firms, that number turns out to be higher than the number that maximizes total welfare (i.e., consumer value minus production cost). This is true not only for these specific models I’ve just described, but also for most simple variations that I’ve come across. For example, quantity competition might have increasing marginal costs, or a sequential choice of firm quantity. Differentiated products might have a quadratic delivery cost, allow price discrimination by consumer location, or have firms partially pay for delivery costs.
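A minimal numeric sketch of the Cournot case makes the gap concrete. The demand and cost parameters below are hypothetical, chosen only for illustration:

```python
def cournot(n, a=100.0, b=1.0, c=20.0, F=50.0):
    """Symmetric Cournot equilibrium with n firms and linear demand P = a - b*Q:
    each firm produces q = (a-c)/(b*(n+1)). Returns (per-firm profit, total welfare)."""
    q = (a - c) / (b * (n + 1))
    Q = n * q
    P = a - b * Q
    profit = (P - c) * q - F            # per-firm profit net of the fixed entry cost
    cs = b * Q * Q / 2                  # consumer surplus under linear demand
    return profit, cs + n * profit      # welfare = consumer surplus + total profit

# Free entry: firms keep entering while the next entrant still expects profit >= 0.
n_free = 1
while cournot(n_free + 1)[0] >= 0:
    n_free += 1

# A welfare-maximizing planner would instead pick:
n_star = max(range(1, 100), key=lambda n: cournot(n)[1])

print(n_free, n_star)  # free entry admits more firms than the planner would
```

With these parameters free entry admits 10 firms while welfare peaks at 4: each entrant ignores the profits it takes from incumbents, so entry overshoots.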

Furthermore, we have a decent general account that explains this general pattern. It is a lot like how there is typically overfishing if new boats enter a fishing area whenever they expect a non-negative profit per boat; each boat ignores the harm it does to other boats by entering. Similarly, firms who enter an industry neglect the costs they impose on other firms already in that industry.

Yes, I do know of models that predict too few firms entering each industry. For example, a model might assume that all the firms who enter an industry go to war with each other via an all-pay auction. The winning firm is the one who paid the most, and gains the option to destroy any other firm. Only one firm remains in the industry, and that is usually too few. However, such models seem more like special cases designed to produce this effect, not typical cases in the space of models.

I’m also not claiming that firms would always set efficient prices. For example, a sufficiently well-informed regulator might be able to improve welfare by lowering the price set by a monopolist. But that’s about the efficiency of prices, not of the number of firms. You can’t say there’s too much concentration even with a monopolist unless the industry would actually be better with more than one firm.

Of course the world is complex and the space of possible models is vast. Even so, it does look like the more natural result for the most obvious models is insufficient concentration. That doesn’t prove that this is in fact the typical case in the real world, but it does at least raise a legitimate question: what theoretical model do people have in mind when they suggest that we now have too much industry concentration? What are they thinking? Can anyone explain?

Added 11a: People sometimes say the cause of excess concentration is “barriers to entry”. The wikipedia page on the concept notes that most specific things “cited as barriers to entry … don’t fit all the commonly cited definitions of a barrier to entry.” These include economies of scale, cost advantages, network effects, regulations, ads, customer loyalty, research, inelastic demand, vertical integration, occupational licensing, mergers, and predatory pricing. Including these factors in models does not typically predict excess concentration.

That wiki page does list some specific factors as fitting “all the common definitions of primary economic barriers to entry.” These include IP, zoning, agreements with distributors and suppliers, customer switching costs, and taxes. But I say that models which include such factors also do not consistently predict excess firm concentration. And I still want to know which of these factors complainers have in mind as the source of the recent increased US concentration problem that they see.

Added 7Sep: Many have in mind the idea that regulations impose fixed costs that are easier for larger firms to bear. But let us always agree that it would be good to lower costs. Fixed costs are real costs, and can’t just be assumed away. If you know a feasible way to actually lower such costs, great, let’s do that. But that’s not about excess concentration; that’s about excess costs.


## Non-Conformist Influence

Here is a simple model that suggests that non-conformists can have more influence than conformists.

Regarding a one dimensional choice x, let each person i take a public position x_i, and let the perceived mean social consensus be m = Σ_i w_i x_i, where w_i is the weight that person i gets in the consensus. In choosing their public position x_i, person i cares about getting close to both their personal ideal point a_i and to the consensus m, via the utility function

U_i(x_i) = -c_i (x_i - a_i)^2 - (1-c_i) (x_i - m)^2.

Here c_i is person i’s non-conformity, i.e., their willingness to have their public position reflect their personal ideal point, relative to the social consensus. When each person simultaneously chooses their x_i while knowing all of the a_i, w_i, c_i, the (Nash) equilibrium consensus is

m = [Σ_i w_i c_i a_i / (c_i + (1-c_i)(1-w_i))] · [1 - Σ_j w_j (1-c_j)(1-w_j) / (c_j + (1-c_j)(1-w_j))]^(-1)

If each w_i << 1, then each denominator above is near one, and the consensus is approximately m ≈ Σ_i w_i c_i a_i / Σ_j w_j c_j. So how much each ideal point a_i counts is roughly proportional to that person’s non-conformity c_i times their weight w_i. So all else equal, non-conformists have more influence over the consensus.
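This equilibrium is easy to check numerically by iterating best responses. A sketch, with made-up ideal points and equal weights:

```python
import numpy as np

def consensus(a, w, c, iters=500):
    """Iterate best responses to the Nash fixed point of the conformity game.
    Each best response is x_i = (c_i a_i + (1-c_i)(1-w_i) m) / (c_i + (1-c_i)(1-w_i)),
    and the consensus is m = sum_i w_i x_i."""
    m = 0.0
    for _ in range(iters):
        x = (c * a + (1 - c) * (1 - w) * m) / (c + (1 - c) * (1 - w))
        m = float(w @ x)
    return m

a = np.array([10.0, 0.0, 0.0, 0.0, 0.0])  # person 0 holds an extreme ideal point
w = np.full(5, 0.2)                       # equal consensus weights

m_base = consensus(a, w, np.full(5, 0.5))                      # all equally conformist
m_high = consensus(a, w, np.array([0.9, 0.5, 0.5, 0.5, 0.5]))  # person 0 non-conformist
print(m_base, m_high)  # raising c_0 pulls the consensus toward a_0
```

Raising only person 0’s non-conformity moves the consensus toward their ideal point, as the small-weight approximation predicts.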

Now it is possible that others will reduce the weight w_i that they give non-conformists with high c_i in the consensus. But this is hard when c_i is hard to observe, and as long as this reduction is not fully (or more than fully) proportional to their increased non-conformity, non-conformists continue to have more influence.

It is also possible that extremists, who pick x_i that deviate more from those of others, will be directly down-weighted. (This happens, for example, with the weights w_i = k/|x_i - x_m| that produce a median x_m.) This makes more sense in the more plausible situation where x_i, w_i are observable but a_i, c_i are not. In this case, it is the moderate non-conformists, who happen to agree more with others, who have the most influence.

Note that there is already a sense in which, holding constant their weight w_i, an extremist has a disproportionate influence on the mean: a 10 percent change in the quantity x_i - m changes the consensus mean m twice as much when that quantity x_i - m is twice as large.


## High Dimensional Societies?

I’ve seen many “spatial” models in social science. Such as models where voters and politicians sit at points in a space of policies. Or where customers and firms sit at points in a space of products. But I’ve never seen a discussion of how one should expect such models to change in high dimensions, such as when there are more dimensions than points.

In small dimensional spaces, the distances between points vary greatly; neighboring points are much closer to each other than are distant points. However, in high dimensional spaces, distances between points vary much less; all points are about the same distance from all other points. When points are distributed randomly, however, these distances do vary somewhat, allowing us to define the few points closest to each point as that point’s “neighbors”. “Hubs” are closest neighbors to many more points than average, while “anti-hubs” are closest neighbors to many fewer points than average. It turns out that in higher dimensions a larger fraction of points are hubs and anti-hubs (Zimek et al. 2012).
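The hubness effect is easy to reproduce with simulated data. A sketch, assuming uniform random points (one of the settings studied in that literature):

```python
import numpy as np

rng = np.random.default_rng(1)

def nn_indegrees(d, n=300, k=5):
    """For n uniform random points in [0,1]^d, count how often each point
    appears among another point's k nearest neighbors (its 'hubness')."""
    pts = rng.random((n, d))
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)          # a point is not its own neighbor
    knn = np.argsort(dists, axis=1)[:, :k]   # each point's k nearest neighbors
    return np.bincount(knn.ravel(), minlength=n)

lo, hi = nn_indegrees(d=2), nn_indegrees(d=100)
# In high dimensions the in-degree distribution is more skewed: stronger hubs
# (neighbors to many points) and more anti-hubs (neighbors to none).
print(lo.var(), hi.var())
```

The mean in-degree is k in both cases, but its variance is much larger in high dimensions, which is exactly the hub/anti-hub pattern.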

If we think of people or organizations as such points, is being a hub or anti-hub associated with any distinct social behavior? Does it contribute substantially to being popular or unpopular? Or does the fact that real people and organizations are in fact distributed in real space overwhelm such things, which would only happen in a truly high dimensional social world?


## Chip Away At Hard Problems

Harold: Such as it is.
C: What’s wrong with it?
H: The big ideas aren’t there.
C: Well, it’s not about big ideas. It’s… It’s work. You got to chip away at a problem.
C: I think it was, in a way. I mean, he’d attack a problem from the side, you know, from some weird angle. Sneak up on it, grind away at it.
(Lines from movie Proof; Catherine is a famous mathematician’s daughter.)

In math, plausibility arguments don’t count for much; proofs are required. So math folks have little choice but to chip away at hard problems, seeking weird angles where indirect progress may be possible.

Outside of math, however, we usually have many possible methods of study and analysis. And a key tradeoff in our methods is between ease and directness on the one hand, and robustness and rigor on the other. At one extreme, you can just ask your intuition to quickly form a judgement that’s directly on topic. At the other extreme, you can try to prove math theorems. In between these extremes, informal conversation is more direct, while statistical inference is more rigorous.

When you need to make an immediate decision, direct easy methods look great. But when many varied people want to share an analysis process over a longer time period, more robust rigorous methods start to look better. Easy direct methods tend to be more uncertain and context dependent, and so don’t aggregate as well. Distant others find it harder to understand your claims and reasoning, and to judge their reliability. So distant others tend more to redo such analysis themselves rather than building on your analysis.

One of the most common ways that wannabe academics fail is by failing to sufficiently focus on a few topics of interest to academia. Many of them become amateur intellectuals, people who think and write more as a hobby, and less to gain professional rewards via institutions like academia, media, and business. Such amateurs are often just as smart and hard-working as professionals, and they can more directly address the topics that interest them. Professionals, in contrast, must specialize more, have less freedom to pick topics, and must try harder to impress others, which encourages the use of more difficult robust/rigorous methods.

You might think their added freedom would result in amateurs contributing proportionally more to intellectual progress, but in fact they contribute less. Yes, amateurs can and do make more initial progress when new topics arise suddenly far from topics where established expert institutions have specialized. But then over time amateurs blow their lead by focusing less and relying on easier more direct methods. They rely more on informal conversation as analysis method, they prefer personal connections over open competitions in choosing people, and they rely more on a perceived consensus among a smaller group of fellow enthusiasts. As a result, their contributions just don’t appeal as widely or as long.

I must admit that compared to most academics near me, I’ve leaned more toward amateur styles. That is, I’ve used my own judgement more on topics, and I’ve been willing to use less formal methods. I clearly see the optimum as somewhere between the typical amateur and academic styles. But even so, I’m very conscious of trying to avoid typical amateur errors.

So instead of just trying to directly address what seem the most important topics, I instead look for weird angles to contribute less directly via more reliable/robust methods. I have great patience for revisiting the few biggest questions, not to see who agrees with me, but to search for new angles at which one might chip away.

I want each thing I say to be relatively clear, and so understandable from a wide range of cultural and intellectual contexts, and to be either a pretty obvious no-brainer, or based on a transparent easy-to-explain argument. This is partly why I try to avoid arguing values. Even so, I expect that the most likely reason I will fail is that I’ve allowed myself to move too far in the amateur direction.


## Elite Evaluator Rents

The elite evaluator story discussed in my last post is this: evaluators vary in the perceived average quality of the applicants they endorse. So applicants seek the highest ranked evaluator willing to endorse them. To keep their reputation, evaluators can’t consistently lie about the quality of those they evaluate. But evaluators can charge a price for their evaluations, and higher ranked evaluators can charge more. So evaluators who, for whatever reason, end up with a better pool of applicants can sustain that advantage and extract continued rents from it.

This is a concrete plausible story to explain the continued advantage of top schools, journals, and venture capitalists. On reflection, it is also a nice concrete story to help explain who resists prediction markets and why.

For example, within each organization, some “elites” are more respected and sought after as endorsers of organization projects. The better projects look first to get endorsement of elites, allowing those elites to sustain a consistently higher quality of projects that they endorse. And to extract higher rents from those who apply to them. If such an organization were instead to use prediction markets to rate projects, elite evaluators would lose such rents. So such elites naturally oppose prediction markets.

For a more concrete example, consider that in 2010 the movie industry successfully lobbied the US congress to outlaw the Hollywood Stock Exchange, a real money market just then approved by the CFTC for predicting movie success, and about to go live. Hollywood is dominated by a few big studios. People with movie ideas go to these studios first with proposals, to gain a big studio endorsement, to be seen as higher quality. So top studios can skim the best ideas, and leave the rest to marginal studios. If people were instead to look to prediction markets to estimate movie quality, the value of a big studio endorsement would fall, as would the rents that big studios can extract for their endorsements. So studios have a reason to oppose prediction markets.

While I find this story as stated pretty persuasive, most economists won’t take it seriously until there is a precise formal model to illustrate it. So without further ado, let me present such a model. Math follows. Continue reading "Elite Evaluator Rents" »


## Rank-Linear Utility

Just out in Management Science, a very simple, general, and provocative empirical theory of real human decisions: in terms of time or money or any other quantity, utility is linear in rank among recently remembered similar items. There is otherwise no risk-aversion or time-discounting, etc. This makes sense of a lot of data. Details:

We present a theoretical account of the origin of the shapes of utility, probability weighting, and temporal discounting functions. In an experimental test of the theory, we systematically change the shape of revealed utility, weighting, and discounting functions by manipulating the distribution of monies, probabilities, and delays in the choices used to elicit them. The data demonstrate that there is no stable mapping between attribute values and their subjective equivalents. Expected and discounted utility theories, and also their descendants such as prospect theory and hyperbolic discounting theory, simply assert stable mappings to describe choice data and offer no account of the instability we find. We explain where the shape of the mapping comes from and, in describing the mechanism by which people choose, explain why the shape depends on the distribution of gains, losses, risks, and delays in the environment. …

People behave as if the subjective value of an amount, risk, or delay is given by its rank position in the context created by other recently experienced amounts, risks, and delays. … To summarize the above studies, people behave as if the subjective value of an amount (or probability or delay) is determined, at least in part, by its rank position in the set of values currently in a person’s head. So, for example, $10 has a higher subjective value in the set $2, $5, $8, and $15 because it ranks 2nd, but has a lower subjective value in the set $2, $15, $19, and $25 because it ranks 4th. …

Rather than supporting a change in the shape of a utility, weighting, or discounting function, or a change in the primitives which people process, our data suggest that the whole enterprise of using stable functions to translate between objective and subjective values should be abandoned. … There is no method which gives, even with careful counterbalancing, the true level of risk aversion or the true shape of a utility function. In any given situation, one can observe choices and infer a shape or level of risk aversion. But as soon as the context changes—that is, as soon as the decision maker experiences any new amount—the measured shape or level of risk aversion will no longer apply. (more; ungated; also)
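The quoted $10 example can be sketched as a tiny function. This is a toy rank-as-value mapping for illustration, not the paper’s fitted model:

```python
def rank_value(amount, context):
    """Subjective value of `amount` as its linear rank position among the
    amounts currently in mind, scaled to [0, 1]."""
    items = sorted(set(context) | {amount})
    return items.index(amount) / (len(items) - 1)

# Same amount, different contexts, different subjective value:
hi = rank_value(10, [2, 5, 8, 15])    # 10 ranks near the top here
lo = rank_value(10, [2, 15, 19, 25])  # but near the bottom here
print(hi, lo)  # 0.75 0.25
```

The same $10 gets a high value in the first context and a low value in the second, purely from its rank.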


## Math: Useful & Over-Used

Noah Smith … on the role of math in economics … suggests that it’s mainly about doing hard stuff to prove that you’re smart. I share much of his cynicism about the profession, but I think he’s missing the main way (in my experience) that mathematical models are useful in economics: used properly, they help you think clearly, in a way that unaided words can’t. Take the centerpiece of my early career, the work on increasing returns and trade. The models … involved a fair bit of work to arrive at what sounds in retrospect like a fairly obvious point. … But this point was only obvious in retrospect. … I … went through a number of seminar experiences in which I had to bring an uncomprehending audience through until they saw the light.

I am convinced that most economath badly fails the cost-benefit test. … Out of the people interested in economics, 95% clearly have a comparative advantage in economic intuition, because they can’t understand mathematical economics at all. … Even the 5% gain most of their economic understanding via intuition. … Show a typical economist a theory article, and watch how he “reads” it: … If math is so enlightening, why do even the mathematically able routinely skip the math? … When mathematical economics contradicts common sense, there’s almost always mathematical sleight of hand at work – a sneaky assumption, a stilted formalization, or bad back-translation from economath to English. … Paul[‘s] … seminar audiences needed the economath because their economic intuition was atrophied from disuse. I can explain Paul’s models to intelligent laymen in a matter of minutes.

Krugman replies:

Yes, there’s a lot of excessive and/or misused math in economics; plus the habit of thinking only in terms of what you can model creates blind spots. … So yes, let’s critique the excessive math, and fight the tendency to equate hard math with quality. But in the course of various projects, I’ve seen quite a lot of what economics without math and models looks like — and it’s not good.

For most questions, the right answer has a simple intuitive explanation. The problem is: so do many wrong answers. Yes we also have intuitions for resolving conflicting intuitions, but we find it relatively easy to self-deceive about such things. Intuitions help people who do not think or argue in good faith to hold to conclusions that fit their ideology, and to not admit they were wrong.

People who instead argue using math are more often forced to admit when they were wrong, or that the best arguments they can muster only support weaker claims than those they made. Similarly, students who enter a field with mistaken intuitions often just do not learn better intuitions unless they are forced to learn to express related views in math. Yes, this typically comes at a huge cost, but it does often work.

We wouldn’t need as much to pay this cost if we were part of communities who argued in good faith. And students (like maybe Bryan) who enter a field with good intuitions may not need as much math to learn more good intuitions from teachers who have them. So for the purpose of drawing accurate and useful conclusions on economics, we could use less math if academics had better incentives for accuracy, such as via prediction markets. Similarly, we could use less math in teaching economics if we better selected students and teachers for good intuitions.

But in fact academic research and teaching put a low priority on accurate useful conclusions, relative to showing off, and math is very helpful for that purpose. So the math stays. In fact, I find it plausible, though hardly obvious, that moving to less math would increase useful accuracy even without better academic incentives or student selection. But groups who do this are likely to lose out in the contest to seem impressive.

A corollary is that if you personally just want to better understand some particular area of economics where you think your intuitions are roughly trustworthy, you are probably better off mostly skipping the math and instead reasoning intuitively. And that is exactly what I’ve found myself doing in my latest project to foresee the rough outlines of the social implications of brain emulations. But once you find your conclusions, then if you want to seem impressive, or to convince those with poor intuitions to accept your conclusions, you may need to put in more math.


## Inequality Math

Here is a distribution of aeolian sand grain sizes:

Here is a distribution of diamond sizes:

On a log-log scale like these, a power law is a straight line, while a lognormal distribution is a downward facing parabola. These distributions look like a lognormal in the middle with power law tails on either side.

Important social variables are distributed similarly, including the (people) size of firms:

and of cities:

In these two cases the upper tail follows Zipf’s law, with a slope very close to one, implying that each factor of two in size contains the same number of people. That is, there are just as many people in all the cities with 100,000 to 200,000 people as there are in all the cities with one million to two million people. (Since there are an infinite number of such ranges, this adds up to an infinite expected number of people in huge cities, but actual samples are finite.)

The double Pareto lognormal distribution models this via exponentially distributed lifetimes for a lognormal diffusion process. In a simple diffusion process, positions that start out concentrated at a point spread out into a normal distribution whose variance increases steadily with time. With a normal distribution over the point where this process started, and a constant chance in time of ending it, the distribution over ending positions is normal in the middle, but has fat exponential tails. And via a log transform, this becomes a lognormal with power-law tails.

This makes sense as a model of sizes for particles, firms, and cities when such things have widely (e.g., exponentially) varying lifetimes. Random collisions between grains chip off pieces, giving both a fluctuating drift in particle size and an exponential distribution of grain ages (since starting as a chip). Firms and cities also tend to start and die at somewhat constant rates, and to drift randomly in size.
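This process is easy to simulate. A sketch, with hypothetical drift and death-rate parameters chosen so that the predicted upper-tail power is one, matching the Zipf slope discussed above:

```python
import numpy as np

rng = np.random.default_rng(2)

# Log-size follows a Brownian motion with drift, observed at an exponentially
# distributed age T (a constant death rate): log S = mu*T + sigma*sqrt(T)*Z.
n = 200_000
mu, sigma, rate = 0.5, 1.0, 1.0
T = rng.exponential(scale=1.0 / rate, size=n)
log_size = mu * T + sigma * np.sqrt(T) * rng.normal(size=n)
size = np.exp(log_size)

# Estimate the upper-tail exponent from the top 1% of the sample: on a log-log
# rank-vs-size plot, a power law is a straight line.
top = np.sort(size)[-2000:]
ranks = np.arange(len(top), 0, -1)   # largest size gets rank 1
slope = np.polyfit(np.log(top), np.log(ranks), 1)[0]
print(slope)  # near -1 for these parameters, i.e. a Zipf-like tail
```

For these parameters the exponential tail of log-size has rate one, so the size distribution has a power-law tail with exponent near one, even though the middle of the distribution looks lognormal.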

In the math, a Zipf upper tail, with a power of near one, implies little local net growth of each item, so that size drift nearly counters birth and death rates. For example, if a typical thousand-person firm grows by 1% per year (with half growing slower and half growing faster than 1%), but has a 1% chance each year of dying (assuming no firms start at that size), it will keep the same expected number of employees. Such a firm has no local net growth.

Interestingly, individual wealth is distributed similarly. More on that in my next post.


## Fixing Election Markets

One year from now the US will elect a new president, almost surely either a Republican R or a Democrat D. If there are US voters for whom politics is about policy, such voters should want to estimate post-election outcomes y like GDP, unemployment, or war deaths, conditional on the winning party w = R or D. With reliable conditional estimates E[y|w] in hand, such voters could then support the party expected to produce the best outcomes.

Sufficiently active conditional prediction markets can produce conditional estimates E[y|w] that are well-informed and resistant to biases and manipulation. One option is to make bets on y that are called off if w is not true. Another is to trade assets like “Pays $y if w” for assets like “Pays $1 if w.” A basic problem with this whole approach, however, is that simple estimates E[y|w] may reflect correlation instead of causation.
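The second asset-pair construction implies a simple price ratio. A sketch with hypothetical prices (the numbers below are invented for illustration):

```python
def implied_conditional_estimate(price_y_if_w, price_1_if_w):
    """Recover E[y|w] from asset prices: "Pays $y if w" is worth E[y|w]*P(w),
    and "Pays $1 if w" is worth P(w), so their ratio is E[y|w]."""
    return price_y_if_w / price_1_if_w

# Hypothetical prices: "Pays $y if R" trades at $1.40, "Pays $1 if R" at $0.55.
e_y_given_R = implied_conditional_estimate(price_y_if_w=1.40, price_1_if_w=0.55)
print(e_y_given_R)  # about 2.545
```

Comparing such ratios across w = R and w = D gives the conditional comparison a policy-oriented voter wants, subject to the correlation-vs-causation caveat that follows.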

For example, imagine that voters prefer to elect Republicans when they see a war looming. In this case if y = war deaths then E[y|R] might be greater than E[y|D], even if Republicans actually cause fewer war deaths when they run a war. Wolfers and Zitzewitz discuss a similar problem in markets on which party nominees would win the election:

It is tempting to draw a causal interpretation from these results: that nominating John Edwards would have produced the highest Democratic vote share. …The decision market tells us that in the state of the world in which Edwards wins the nomination, he will also probably do well in the general election. This is not the same as saying that he will do well if, based on the decision market, Democrats nominate Edwards. (more)

However, this problem has a solution: conditional close-election markets — markets that estimate post-election outcomes conditional not only on which party wins, but also on the election being close. This variation not only allows a closer comparison between candidates’ causal effects on outcomes, but it is also more relevant to an outcome-oriented voter’s decision. After all, an election must be close in order for your vote to influence the election winner.

To show that conditional close markets estimate causality well, I’ll need to get technical. And use probability math. Which I do now; beware.
