Seeking Super Factors

In a factor analysis, one takes a large high-dimensional dataset and finds a low dimensional set of variables that can explain as much as possible of the total variation in that dataset. A big advantage of factor analysis is that it doesn’t require much theoretical knowledge about the nature of the variables in the data or their relations – factors are mostly determined directly by the data.
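
To make the idea concrete, here is a minimal sketch in Python, under loud assumptions. The data are synthetic (six made-up scores driven by one hidden trait), and the leading principal component of the correlation matrix stands in for a proper factor analysis, which would also model variable-specific noise. All names and numbers are illustrative, not taken from any real study.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlations between the columns of rows (a list of equal-length lists)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(k)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / n for b in range(k)] for a in range(k)]

def leading_component(corr, iterations=200):
    """Power iteration: returns (unit direction, eigenvalue) of the largest component."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * corr[a][b] * v[b] for a in range(k) for b in range(k))
    return v, eigenvalue

# Synthetic people: one hidden trait drives six observed scores (an assumption).
rng = random.Random(0)
people = []
for _ in range(500):
    trait = rng.gauss(0, 1)
    people.append([0.8 * trait + 0.6 * rng.gauss(0, 1) for _ in range(6)])

corr = correlation_matrix(people)
loadings, eigenvalue = leading_component(corr)
# The trace of a correlation matrix equals the number of variables.
share_explained = eigenvalue / len(corr)
print(f"first component explains {share_explained:.0%} of total variance")
```

With these made-up loadings the first component should capture about 70% of the variation in expectation (an eigenvalue of 1 + 5 × 0.64 = 4.2 out of 6), even though the code is never told that a hidden trait exists.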

Factor analysis has had some big successes in helping us to understand how humans differ. As many people know, intelligence is the main factor explaining variation in cognitive test performance, ideology is the main factor explaining variations in political positions, and personality types explain much of the variation in stable attitudes and temperament. These factors have allowed us to greatly advance our understanding of intelligence, ideology, and personality, even while remaining ignorant of their fundamental causes and natures.

However, people vary in far more ways than intelligence, ideology, and personality, and factor analyses have been applied to many of these other human feature categories. For example, there have been factor analyses of jobs, brands, faces, body shape, gait, accent, diet, clothing, writing style, leisure behavior, friendship networks, sleep habits, physical health, mortality, demography, national cultures, and zip codes.

As my last post on media genre factors showed, factors found in different feature categories are often substantially correlated with one another. This suggests that if we put together a huge super-dataset describing many individual people in as many ways as possible, a factor analysis of this dataset may find important new super-factors that span many of these feature domains. Such super-factors would be promising candidates to use in a wide range of social research and social policy.

Now it remains logically possible that these super-factors will end up being simple linear combinations of the factors that we have already found in each of these feature categories. Maybe we already know most of what there is to know about how humans vary. But I’d bet strongly and heavily against this. The rate at which we have been learning new things about how humans vary doesn’t remotely suggest we’ve run out of new big things to learn. Yes, merely knowing the super-factors isn’t the same as understanding their origins. But just as we’ve seen with factor analysis in more specific areas, knowing the main factors can be a big help.

So I’d guess that the super-factors found in a super dataset of human details will be revolutionary developments. We will afterward see uncovering them as a seminal milestone in our progress in understanding human variation. A Nobel prize worthy level of seminality. All it will take is lots of tedious work to collect a super dataset, and then do some straightforward number crunching. A quest awaits; who will rise to the challenge?

Failed Singularity Model

Noted Yale economist William Nordhaus has a new paper “Are We Approaching an Economic Singularity? Information Technology and the Future of Economic Growth”:

Assume that labor is constant, that all technological change is capital-augmenting at 10% per year, and that the elasticity of substitution between labor and information capital is 1.25. Figure 3 shows a typical simulation of the share of capital and the growth rates of output and wages.

In this model, capital slowly gets a larger share of total income and economic growth accelerates, even though the rate of innovation never changes. Nordhaus lists six empirical predictions for the sign of observed parameters, and finds that four of the six are rejected, as our best estimates have the opposite sign. And this doesn’t include the fact that our best estimates find the elasticity of substitution between labor and capital to be less than one. The two sign predictions that match the data suggest it would take a century or more before growth rates exceed 20% per year. Nordhaus says, “The conclusion is therefore that the growth Singularity is not near.”
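
The mechanism can be seen in a toy simulation. This is a sketch under stated assumptions, not Nordhaus’s model or calibration: a generic CES production function with constant labor, capital-augmenting progress at 10% per year, and an elasticity of substitution of 1.25 as in the quoted setup. The capital weight, saving rate, and depreciation rate are values I made up for illustration.

```python
def simulate(years=60, sigma=1.25, tech_growth=0.10,
             capital_weight=0.3, saving_rate=0.2, depreciation=0.05):
    """Return yearly (capital_share, output) for a toy CES economy with constant labor."""
    rho = (sigma - 1.0) / sigma  # CES exponent implied by the elasticity of substitution
    labor, capital, tech = 1.0, 1.0, 1.0
    path = []
    for _ in range(years + 1):
        k_term = capital_weight * (tech * capital) ** rho
        l_term = (1.0 - capital_weight) * labor ** rho
        output = (k_term + l_term) ** (1.0 / rho)
        # With competitive factor pricing, capital's income share is k_term / (k_term + l_term).
        path.append((k_term / (k_term + l_term), output))
        capital = (1.0 - depreciation) * capital + saving_rate * output
        tech *= 1.0 + tech_growth  # capital-augmenting progress at a constant rate
    return path

path = simulate()
shares = [share for share, _ in path]
growth = [path[t][1] / path[t - 1][1] - 1.0 for t in range(1, len(path))]
for year in (1, 20, 40, 60):
    print(f"year {year}: capital share {shares[year]:.2f}, growth {growth[year - 1]:.1%}")
```

Because the elasticity exceeds one, ever more effective capital steadily raises capital’s share, and output growth drifts upward even though `tech_growth` is fixed. How fast this happens depends heavily on the made-up parameters, so the timing here says nothing about Nordhaus’s century-scale estimate.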

Of course this is far from the only possible economic model of a singularity. But it sets a good standard for future efforts. Can anyone find a concrete simple economic model of singularity that better fits the data?

The Data We Need

Almost all research into human behavior focuses on particular behaviors. (Yes, not extremely particular, but also not extremely general.) For example, an academic journal article might focus on professional licensing of dentists, incentive contracts for teachers, how Walmart changes small towns, whether diabetes patients take their medicine, how much we spend on xmas presents, or if there are fewer modern wars between democracies. Academics become experts in such particular areas.

After people have read many articles on many particular kinds of human behavior, they often express opinions about larger aggregates of human behavior. They say that government policy tends to favor the rich, that people would be happier with less government, that the young don’t listen enough to the old, that supply and demand is a good first approximation, that people are more selfish than they claim, or that most people do most things with an eye to signaling. Yes, people often express opinions on these broader subjects before they read many articles, and their opinions change suspiciously little as a result of reading many articles. But even so, if asked to justify their more general views academics usually point to a sampling of particular articles.

Much of my intellectual life in the last decade has been spent in the mode of collecting many specific results, and trying to fit them into larger simpler pictures of human behavior. So both I and the academics I’m describing above in essence present ourselves as using the many results in academic papers about particular human behaviors as data to support our broader inferences about human behavior. But we do almost all of this informally, via our vague impressionistic memories of the gist of the many articles we’ve read, and our intuitions about how consistent various more general claims seem with those particulars.

Of course there is nothing especially wrong with intuitively matching data and theory; it is what we humans evolved to do, and we wouldn’t be such a successful species if we couldn’t at least do it tolerably well sometimes. It takes time and effort to turn complex experiences into precise sharable data sets, and to turn our theoretical intuitions into precise testable formal theories. Such efforts aren’t always worth the bother.

But most of these academic papers on particular human behaviors do in fact pay the bother to substantially formalize their data, their theories, or both. And if it is worth the bother to do this for all of these particular behaviors, it is hard to see why it isn’t worth the bother for the broader generalizations we make from them. Thus I propose: let’s create formal data sets where the data points are particular categories of human behavior.

To make my proposal clearer let’s for now restrict attention to explaining government regulatory policies. We could create a data set where the datums are particular kinds of products and services that governments now provide, subsidize, tax, advise, restrict, etc. For such datums we could start to collect features about them into a formal data set. Such features could say how long that sort of thing has been going on, how widely it is practiced around the world, how variable has been that practice over space and time, how familiar are ordinary people today with its details, what sort of justifications do people offer for it, what sort of emotional associations do people have with it, how much do we spend on it, and so on. We might also include anything we know about how such things correlate with age, gender, wealth, latitude, etc.

Generalizing to human behavior more broadly, we could collect a data set of particular behaviors, many of which seem puzzling at least to someone. I often post on this blog about puzzling behaviors. Each such category of behaviors could be one or more data points in this data set. And relevant features to code about those behaviors could be drawn from the features we tend to invoke when we try to explain those behaviors. Such as how common is that behavior, how much repeated experience do people have with it, how much do they get to see about the behavior of others, how strong are the emotional associations, how much would it make people look bad to admit to particular motives, and so on.

Now all this is of course much easier said than done. It is a lot of work to look up various papers and summarize their key results as entries in this data set, or just to look at real world behaviors and put them into simple categories. It is also work to think carefully about how to usefully divide up the space of actions and features. First efforts will no doubt get it wrong in part, and have to be partially redone. But this is the sort of work that usually goes into all the academic papers on particular behaviors. Yes it is work, but if those particular efforts are worth the bother, then this should be as well.

As a first cut, I’d suggest just picking some more limited category, such as perhaps government regulations, collecting some plausible data points, making some guesses about what useful features might be, and then just doing a quick survey of some social scientists where they each fill in the data table with their best guesses for data point features. If you ask enough people, you can average out a lot of individual noise, and at least have a data set about what social scientists think are features of items in this area. With this you could start to do some exploratory data analysis, and start to think about what theories might well account for the patterns you see.
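
As a rough sketch of that first cut, the Python below averages each cell of the table over whoever filled it in. The items, features, 1-to-5 scale, and scores are all hypothetical placeholders, and a plain mean is just the simplest choice of aggregate.

```python
from statistics import mean

def average_table(responses):
    """Average each (item, feature) cell over the raters who filled it in."""
    cells = {}
    for table in responses:
        for item, features in table.items():
            for feature, score in features.items():
                cells.setdefault((item, feature), []).append(score)
    return {cell: mean(scores) for cell, scores in cells.items()}

# Hypothetical guesses from three raters on a 1-5 scale; raters may skip cells.
responses = [
    {"taxi licensing": {"public familiarity": 2, "emotional charge": 1}},
    {"taxi licensing": {"public familiarity": 3, "emotional charge": 2}},
    {"taxi licensing": {"public familiarity": 4},
     "drug approval": {"emotional charge": 5}},
]

table = average_table(responses)
for (item, feature), score in sorted(table.items()):
    print(f"{item} / {feature}: {score}")
```

If rater errors are roughly independent, the noise in each averaged cell shrinks with the square root of the number of raters, which is why asking enough people helps.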

Now one obvious problem with my proposal is that while it looks time consuming and tedious, it isn’t obviously impressive. Researchers who specialize in particular areas will complain about your data entries related to their areas, and you won’t be able to satisfy them all. So you will end up with a chorus of critics saying your data is all wrong, and your efforts will look too low brow to cow them with your impressive tech. So I can see why this hasn’t been done much. Even so, I think this is the data set we need.

Imagine Libertopia

In this post I’ll talk primarily to people who, like me, lean libertarian. The rest of you can take a break.

Libertarians want to move more products and services from being provided directly by government, to being provided privately. And for those that are provided privately, libertarians want to weaken regulations. These changes would increase liberty.

Libertarians tend to offer arguments that are relatively abstract and theory-based. That is, they focus more on why more liberty is more moral, or why it should in theory give better outcomes. They focus less on showing that liberty has in practice worked out better. When libertarians do focus on data, they tend to be very broad, or randomly specific. That is, they talk about how West Germany is better than East Germany, or South Korea better than North Korea. Or they pick on very specific examples, like regulations limiting eyeglass ads, and leave audiences wondering how cherry-picked are such examples.

It seems to me that libertarians focus too much on trying to argue abstractly that liberty would be better, and not enough on just concretely describing how liberty would be different. Yes for you the abstract arguments seem best; they persuade you plenty, and they bring the most prestige in your circle. But typical libertarians today are a distinct personality type; most people are not like you. Most people just cannot be comfortable with a proposal for change if they cannot imagine it in some detail, and imagine that they’d like that detail. Such people don’t need more abstract arguments and examples; they instead need credible concrete descriptions.

True, people have sometimes written fiction set in libertarian settings. But such fiction doesn’t usually come with a careful analysis of why one should believe in its many details. Yes, part of the attraction of liberty is that it frees up people to innovate in ways that one can’t anticipate in advance. But that doesn’t mean that we can’t go a long way to better describe a world of more liberty.

On reflection, I realize that when I try to imagine more liberty, I mostly draw on a limited set of iconic comparisons, such as comparing airlines, trucks, and phones before and after US deregulation, or comparing public to private schools and mail in the US. Alas, we and our audiences should worry that we cherry-pick such examples to support conclusions we like.

We should be able to do much better than this. By now there are vast literatures discussing many industries in many places before and after regulation or deregulation, and describing specific times and places where certain products and services were provided directly by governments, or provided privately. From this vast literature we should be able to identify many concrete patterns and “stylized facts” about how government provision and heavy regulation tend to change products and services.

I recall these suggestions for typical features of industries with more liberty:

  1. Less “gold-plating” in materials and methods
  2. More product variety, including more low quality versions
  3. Faster innovation and product cycles
  4. Fewer guarantees to workers or customers
  5. Price, features vary more with customer features
  6. Workers have less school and seniority
  7. Less overhead spend on paperwork
  8. more?

Some people should work to extract patterns like these from our vast related literatures – I’ve looked, and there just aren’t many such summaries today. With such patterns in hand, we would be in a much better position to credibly describe how familiar products and services would concretely change if we were to provide them privately, or to regulate them less. And such credible concrete descriptions might allow many more people to become comfortable with endorsing such expansions of liberty.

This sort of project seems well within the abilities of the median grad student. It doesn’t require great creativity or technical skills. Instead, it just requires methodically surveying and summarizing related literatures. Perhaps some libertarian students should shy away from it in hopes of impressing via more difficult methods. But surely there must be other students for which this sort of project is a good match.

The Why-Policy-Wiki

“Normative as positive” (NAP) — explaining that the [education] policies actually chosen were chosen because they maximize an individualized social welfare function — fails as a useful general positive model of schooling. While NAP can perhaps accommodate the fact of some direct production of schooling by some governments, the reality is that (nearly) all governments produce education and that, by and large, this is their only support to education. Moreover, NAP fails not just in the large but also the small: there are six additional common facts about educational policies inconsistent with NAP. (more; HT Bryan Caplan)

That is Lant Pritchett, and I share his frustration. People usually explain their government’s policies via scenarios wherein such policies would help the world, or at least their local region. But when you point out details at odds with such simple stories, such people are usually uninterested in the subject. They switch to suggesting other scenarios or problems where policy might help, also with little interest in the details.

This evasive style, i.e., the habit of pointing to a diffuse space of possible scenarios and problems instead of particular ones, is a huge obstacle to critics. If you put a lot of time in critiquing one story, people just note that there are lots of other possible stories you didn’t critique.

This style helps people maintain idealist attitudes toward institutions they like. In contrast, people do the opposite for institutions they dislike, such as rival foreign governments or profit-making firms. In those cases, people prefer cynical explanations. For example, people say that firms advertise mainly to fool folks into buying products they don’t need. But the evasion remains; if you critique one cynical explanation they switch to others, avoiding discussing details about any one.

To solve this evasion problem, I propose we create a new kind of wiki that surveys opinions on policy explanations. In this new wiki readers could find items like “68% (162/238) of college graduates say the best explanation of government running schools, instead of subsidizing them, is because educated citizens can pay more taxes to benefit other citizens. 54% (7/13) of economics PhDs surveyed say it is to push propaganda.”

Here is how it would work. There would be three category hierarchies: of policies, of policy explanations, and of people with opinions on policy explanations. Each hierarchy would include a few very general categories near the top, and lots of much more specific categories toward the bottom.

Anyone could come to the wiki to contribute opinions on policy explanations. They would first give some demographic info on themselves, and that info would put them somewhere in the category hierarchy of people. They could then browse the category hierarchy of policies, picking a policy to explain. Finally, they could browse the category hierarchy of explanations, picking their favored explanation of that policy.

Users could start by being shown the most common explanation offered so far for similar policies by similar people, and then browsing away from that. Users could also expand the category hierarchies, to add more specific policies and explanations. For particular policy explanation pairs, users might add links to relevant theory, evidence, and arguments. Users might also upvote links added by others. This would help later readers search for well-voted evidence and theory close in the hierarchies to any given policy explanation.
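
Here is one way the core bookkeeping might look, as a minimal Python sketch. I assume each category is a path through its hierarchy, written as a tuple from general to specific; the category names and the four sample opinions are invented for illustration.

```python
from collections import Counter

def under(path, prefix):
    """True if a category path sits at or below the given prefix in its hierarchy."""
    return path[:len(prefix)] == prefix

def tally(opinions, people_prefix, policy_prefix):
    """Count favored explanations among matching people for matching policies."""
    counts = Counter(
        o["explanation"] for o in opinions
        if under(o["person"], people_prefix) and under(o["policy"], policy_prefix)
    )
    total = sum(counts.values())
    return [(explanation, n, total) for explanation, n in counts.most_common()]

SCHOOLS = ("education", "government-run schools")
TAXES = ("public benefit", "educated citizens pay more taxes")
PROPAGANDA = ("ulterior motive", "push propaganda")

# Hypothetical contributions: person, policy, and explanation are hierarchy paths.
opinions = [
    {"person": ("college graduate",), "policy": SCHOOLS, "explanation": TAXES},
    {"person": ("college graduate", "economics PhD"), "policy": SCHOOLS,
     "explanation": PROPAGANDA},
    {"person": ("college graduate",), "policy": SCHOOLS, "explanation": TAXES},
    {"person": ("no college degree",), "policy": SCHOOLS, "explanation": TAXES},
]

for explanation, n, total in tally(opinions, ("college graduate",), ("education",)):
    print(f"{n / total:.0%} ({n}/{total}) of college graduates say: {explanation[-1]}")
```

Matching on path prefixes is what lets a crude query (“college graduates” on “education”) and a detailed one (“economics PhDs” on “government-run schools”) run against the same pile of opinions.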

By using category hierarchies, a wide range of people could express a wide range of opinions. Experts could dive into details while those who can barely understand the most basic categories could gesture crudely in their favored directions. Given such a wiki, a critic could focus their efforts on the most popular explanations for a policy by their target audience, and avoid the usual quick evasion to other explanations. Prediction markets tied to this wiki could let people bet that particular explanations won’t hold up well to criticism, or that popular opinion on a topic will drift toward a certain sort of expert opinion.

Of course this wiki could and should also be used to explain common policies of firms, clubs, families, and even individuals. I expect some editorial work to be needed, to organize sensible category hierarchies. But if good editors start the system with good starting hierarchies, the continuing editorial work probably wouldn’t be prohibitive.

Why National Med?

People offer many noble rationales for public education, but the data suggest public school systems were adopted to create patriotic citizens for war. I suspect a similar data analysis could show why so many nations have recently adopted national medical systems:

Even as Americans debate … Obama’s healthcare law and its promise of guaranteed health coverage, … many far less affluent nations are moving in the opposite direction – to provide medical insurance to all citizens.

China … is on track to … cover more than 90 percent of the nation’s residents. … Two decades ago, many former communist countries … dismantled their universal health-care systems amid a drive to set up free-market economies. But popular demand for insurance protection has fueled an effort in nearly all these countries to rebuild their systems. Similar pressure is coming from the citizens of fast-growing nations in Asia and Latin America. …

Some countries have set up public systems like those in Great Britain and Canada. But many others are relying on a mix of government and commercial insurance, as in the United States. …

In countries such as India, politicians have learned that one of the surest ways to secure votes is to promise better access to health care. … The Thai system, set up a decade ago, has survived years of political upheaval and a military coup. “No party dares touch it.” …

Colombia’s universal system, set up in 1993, has cost more than twice what was expected. (Today’s Post, article by Levey, p. A11; link will go here when available)

My guess: for our distant ancestors, medicine was a way to show that they care about each other. So today there is a demand for medicine to be provided by units of organization toward which we, or they, want us to feel solidarity. But I’m not sure what are the most direct and proximate causes of such a need for solidarity.

Academic Blog Credit?

Martin Weller considers academic credit for blogs:

The answer to … whether new approaches such as blogging constitute scholarly activity, is an emphatic yes. Which leads us to a more problematic question: How should we recognize it? …

Tenure committees have increasingly come to rely upon journal-impact factors to act as a proxy for research quality. In short, we know what a good publication record looks like. But these criteria begin to creak and groan when we apply them to blogs and other online media. Simple metrics are subject to gaming, and because of the removal of the peer-review filter, may be meaningless anyway. I may have a YouTube clip of a skateboarding octopus with two million hits, but that doesn’t make it scholarly work.

It’s a difficult problem, but one that many institutions are beginning to come to terms with. Combining the rich data available online that can reveal a scholar’s impact with forms of peer assessment gives an indication of reputation. Universities know this is a game they need to play—that having a good online reputation is more important in recruiting students than a glossy prospectus. And groups that sponsor research are after good online impact as well as presentations at conferences and journal papers.

… I’ve found that since becoming a blogger, I publish fewer journal articles, so it has had a “negative” impact on that aspect of my academic life. However, it has led to so many other unpredictable benefits—such as the establishment of a global peer network that helps me stay up to date with my topic, increased research collaboration, and more invitations to give talks—that it’s been worth the trade-off. (more)

Yes universities care about getting good and much “press”, but they are not willing to tenure professors merely for getting good press. The self-concept of professors only lets them give at most a minor weight to press, and sometimes the weight is negative.

The key difference is between getting attention vs. making impressive original intellectual contributions. Being cited by major news media, or having so many blog readers, can credential you as getting attention. But so far only journal articles, Ph.D. theses, and certain books and conference papers are accepted as credentials for impressive original intellectual contributions. For these, high quality experts are seen to judge the intellectual contribution.

Yes blog posts can contain impressive original intellectual contributions. Newspaper columns can contain them as well. So can speeches. Even spontaneous party conversations can contain them. The problem is, we don’t have systems set up for experts to evaluate these things in such terms. And if an intellectual contribution isn’t credentialed as such by academic experts, then it basically doesn’t exist as far as academia is concerned.

So either blogs will continue to be seen mainly as a way to get “press” attention, or some folks will develop a system of expert evaluation of the intellectual contribution of blog posts. And as with academic journals, the main obstacle to doing that is: getting sufficiently prestigious academics to spend enough time doing their evaluations.

Now it turns out that many prestigious academics already read a lot of blog posts. So one approach would be to create a special “review” section where only prestigious academics can enter quick reviews of blog posts they read. Perhaps these reviews would be anonymous to the blog author and readers, and a more centralized part of the system would weigh their prestige, and degree of topical expertise, to compute a post evaluation.
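
The weighing step could be as simple as a weighted average. The sketch below is one guess at it, not a worked-out design: I assume each review carries a rating plus prestige and expertise scores between 0 and 1, and that a review’s weight is the product of the two. The field names and numbers are hypothetical.

```python
def post_score(reviews):
    """Weighted mean of ratings; weight = prestige * topical expertise (an assumption)."""
    weights = [r["prestige"] * r["expertise"] for r in reviews]
    total = sum(weights)
    if total == 0:
        return None  # no usable reviews, so no evaluation
    return sum(w * r["rating"] for w, r in zip(weights, reviews)) / total

# Hypothetical reviews of one blog post, rated on a 1-5 scale.
reviews = [
    {"rating": 4, "prestige": 1.0, "expertise": 1.0},  # senior scholar, on topic
    {"rating": 2, "prestige": 0.5, "expertise": 0.5},  # junior scholar, off topic
]
print(post_score(reviews))
```

A product weight means a reviewer needs both standing and relevant expertise to move the score much; other rules, such as a sum or a threshold, would be just as easy to swap in.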

But even with lots of new whizzy software support, it isn’t obvious you’d get enough reviews to make it work. People write reviews of journal articles in part because they hope doing so will favorably dispose editors toward their later submissions. People who write reviews of blog posts couldn’t have that motivation.

Big Changes Test Econ

For students motivated by grades, ways to teach are limited by ways to test. No matter how insightful your lectures, grade-focused students will only attend to what it takes to pass your tests. So better ways to test give better ways to teach.

Economics is usually taught as a series of abstract concepts, and such concepts are usually tested in the abstract as well. Alas, this encourages neglect of how to apply abstract concepts to concrete situations.

Yes, one can instead ask concrete questions, about what would happen in the world if a specific social change were made. For example: what would happen if we discovered a huge new oil field? Problem is, there are many ways to guess at specific consequences without using abstract economic theory. People come to economics with lots of complex intuitions about how the social world works, or how it should work, so if you ask them specific questions they tend to use such intuitions instead of abstract concepts.

For example, lots of questions about changes within the usual range of experience can be answered merely by projecting observed trends, or by making analogies to similar situations. Yes, these are reasonable ways to guess at social consequences, but they can get in the way of assimilating economic concepts. Yes, tests can reward using abstract concepts, but even so it can be hard to get students into that habit.

A related problem is that small changes seem to have limited consequences. One might notice a few immediate consequences, but indirect consequences seem to quickly fade into “pretty much no effect” on more distant parts of society.

So to teach (and test) students to really apply economic concepts, it helps to consider concrete changes well outside of their usual range of experience. For changes that are big and dramatic enough, students can see the inadequacy of analogy and trend projection, and so are willing to look to abstract theory. And big changes more obviously have distant indirect effects.

My post yesterday on Tube Earth Econ was an example. If I ask you to estimate the social consequences of your planet having a very different shape, it is hard to make analogies to similar past events. That will push you more to go back to basic concepts and work from there.

I apply this concept in my masters level microeconomics course. One quarter of the grade is for this assignment:

Big Change Paper – Imagine a single big change to our economy, and then use microeconomics to describe the consequences of that change, including whether those changes are good or bad. You might, for example, imagine how things would be changed if people became immortal, if Star Trek style “transporters” were available, if very reliable lie detectors were available, or if no one needed sleep.

We talk about the consequences of similar big changes in class, to give students examples to follow in their papers. Such discussions and assignments are especially fun for people like me who enjoy science fiction and the drama of thinking about big dramatic social changes.

Consulate Care

Here’s another idea for medical reform: consulate care. Let countries like Sweden, France, etc. with approved national health care systems have bigger consulates, and open them up to paying customers for medical services. For example, you could sign up for Swedish Care, and when needed you’d go to their consulate to get medical care as if you were living in Sweden.

Now we might not approve consulate care for say North Korea or Uganda, but surely most developed nations are good enough. We don’t issue travel warnings suggesting people not travel to Sweden, for fear of getting sick there. So why not let folks travel to a Sweden nearby for their medical care?

Since most other nations spend far less than the US on medicine, consulate care should be a lot cheaper. And since those other nations seem to suffer no net health loss from their cheaper care, consulate care should be no less healthy.

Continuous Cooperation

In a prisoner’s dilemma, two sides have an incentive to defect, even though mutual defection is worse for both sides than mutual cooperation. It is well known that in theory and in reality people cooperate more when they expect to interact over more repetitions, and when they care more about the future.

It is hard to make people live longer, or care more about the future. It can be just as helpful, however, and often much easier, to make people interact more frequently. In the limit of continuous interaction, people should cooperate the most. My once co-author Ryan Oprea has a paper with Daniel Friedman in the latest AER, showing this:

We study [lab experiment] prisoners’ dilemmas played in continuous time with flow payoffs accumulated over 60 seconds. In most cases, the median rate of mutual cooperation is about 90%. Control sessions with repeated matchings over 8 subperiods achieve less than half as much cooperation, and cooperation rates approach zero in one-shot control sessions.

They introduce some new theory to explain details of this behavior:

Inspired by a strand of existing theoretical literature, we postulated a particular class of epsilon equilibria and derived formulas predicting how cooperation rates respond to adjustment lags and to payoff parameters. These predictions accounted well for the Continuous, Grid-8 and (trivially) One-Shot data. They also nicely explained a set of second-round data from Grid-n sessions, which varied the number of subperiods from 2 to 60. Thus the formulas correctly predict defection in one shot games, cooperation in continuous time and intermediate results on the path between the two. The underlying intuition is simple. When your opponent can react very quickly, defecting from mutual cooperation is likely to earn you the temptation payoff only briefly and may cost you the cooperation payoff for the rest of the period.
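
That intuition can be put into a back-of-envelope calculation. To be clear, this is not the paper’s epsilon-equilibrium formula: it is a simplified sketch that assumes your opponent copies your defection exactly one subperiod later and that both sides then defect for the rest of the period. The payoff values (temptation 5, cooperation 3, mutual defection 1) are made up.

```python
def defection_gain(subperiods, remaining, temptation=5.0, reward=3.0, punishment=1.0):
    """Gain from defecting with `remaining` subperiods left, versus cooperating throughout.

    Assumes the other player copies the defection one subperiod later and
    both then defect until the period ends. Payoffs are flow rates per period.
    """
    if not 1 <= remaining <= subperiods:
        raise ValueError("remaining must be between 1 and subperiods")
    step = 1.0 / subperiods  # length of one subperiod, i.e. the reaction lag
    brief_bonus = (temptation - reward) * step
    later_loss = (reward - punishment) * (remaining - 1) * step
    return brief_bonus - later_loss

# Columns: subperiods, gain from defecting at the start, gain from defecting at the end.
for n in (1, 8, 60):
    print(n, round(defection_gain(n, n), 3), round(defection_gain(n, 1), 3))
```

With one subperiod, defecting simply pays. With many, defecting early loses most of the cooperation payoff, and the only profitable defection is at the very end, where the gain shrinks toward zero as reactions get faster. That shrinking last-moment gain is the sort of small temptation an epsilon equilibrium lets players ignore.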

So do online firms cooperate more when they can vary their prices more frequently? What rapidly-changeable actions would help nations to cooperate more?

