Seeking Super Factors

In a factor analysis, one takes a large high-dimensional dataset and finds a low-dimensional set of variables that can explain as much as possible of the total variation in that dataset. A big advantage of factor analysis is that it doesn’t require much theoretical knowledge about the nature of the variables in the data or their relations – factors are mostly determined directly by the data.
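To make this concrete, here is a minimal sketch on made-up data. Everything in it is an assumption for illustration: the ten variables, the two hidden factors, the loading of 0.9, and the sample size are all invented. It also uses an eigendecomposition of the correlation matrix (principal components), which is a simple stand-in for a full factor analysis rather than the same procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people = 5000

# Two hypothetical hidden factors, independent, each with unit variance.
latent = rng.standard_normal((n_people, 2))

# Hypothetical loadings: variables 0-4 depend on factor 0, variables 5-9 on factor 1.
loadings = np.zeros((2, 10))
loadings[0, :5] = 0.9
loadings[1, 5:] = 0.9

# Add independent noise so each observed variable has variance close to 1.
noise_sd = np.sqrt(1 - 0.9**2)
observed = latent @ loadings + noise_sd * rng.standard_normal((n_people, 10))

# Eigendecomposition of the correlation matrix (PCA as a stand-in for factor analysis).
corr = np.corrcoef(observed, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted largest first

# Share of total variation captured by the two largest components.
top2_share = eigenvalues[:2].sum() / eigenvalues.sum()
print("eigenvalues:", np.round(eigenvalues, 2))
print("share explained by top 2:", round(float(top2_share), 3))
```

In this toy setup two components carry most of the variation in ten variables, and nothing about the hidden structure was told to the procedure in advance; it was read off the correlations.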

Factor analysis has had some big successes in helping us to understand how humans differ. As many people know, intelligence is the main factor explaining variation in cognitive test performance, ideology is the main factor explaining variations in political positions, and personality types explain much of the variation in stable attitudes and temperament. These factors have allowed us to greatly advance our understanding of intelligence, ideology, and personality, even while remaining ignorant of their fundamental causes and natures.

However, people vary in far more ways than intelligence, ideology, and personality, and factor analyses have been applied to many of these other human feature categories. For example, there have been factor analyses of jobs, brands, faces, body shape, gait, accent, diet, clothing, writing style, leisure behavior, friendship networks, sleep habits, physical health, mortality, demography, national cultures, and zip codes.

As my last post on media genre factors showed, factors found in different feature categories are often substantially correlated with one another. This suggests that if we put together a huge super-dataset describing many individual people in as many ways as possible, a factor analysis of this dataset may find important new super-factors that span many of these feature domains. Such super-factors would be promising candidates to use in a wide range of social research and social policy.
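A small simulation can illustrate why correlated category factors point toward cross-category super-factors. This is only a sketch under invented assumptions: two hypothetical feature categories of five variables each, one factor per category, a correlation of 0.5 between the two category factors, and principal components of the correlation matrix standing in for a full factor analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_people = 5000
rho = 0.5  # assumed correlation between the two category factors

# Build two unit-variance category factors with correlation rho.
factor_a = rng.standard_normal(n_people)
factor_b = rho * factor_a + np.sqrt(1 - rho**2) * rng.standard_normal(n_people)
category_factors = np.column_stack([factor_a, factor_b])

# Hypothetical loadings: features 0-4 belong to category A, features 5-9 to category B.
loadings = np.zeros((2, 10))
loadings[0, :5] = 0.9
loadings[1, 5:] = 0.9

noise_sd = np.sqrt(1 - 0.9**2)
features = category_factors @ loadings + noise_sd * rng.standard_normal((n_people, 10))

# Analyze the combined "super-dataset" as a whole.
corr = np.corrcoef(features, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)  # eigenvalues in ascending order

# Leading component, with its sign fixed so the loadings sum to a positive number.
top = eigenvectors[:, -1]
top = top * np.sign(top.sum())
print("leading component loadings:", np.round(top, 2))
```

In this toy case the leading component of the combined data loads on every feature in both categories, so it is not the factor of either category alone. Whether real super-factors would be more than such blends of known factors is exactly the open question.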

Now it remains logically possible that these super-factors will end up being simple linear combinations of the factors that we have already found in each of these feature categories. Maybe we already know most of what there is to know about how humans vary. But I’d bet strongly and heavily against this. The rate at which we have been learning new things about how humans vary doesn’t remotely suggest we’ve run out of new big things to learn. Yes, merely knowing the super-factors isn’t the same as understanding their origins. But just as we’ve seen with factor analysis in more specific areas, knowing the main factors can be a big help.

So I’d guess that the super-factors found in a super dataset of human details will be revolutionary developments. We will afterward see uncovering them as a seminal milestone in our progress in understanding human variation. A Nobel prize worthy level of seminality. All it will take is lots of tedious work to collect a super dataset, and then do some straightforward number crunching. A quest awaits; who will rise to the challenge?

  • sflicht

    It sounds like you might have a prior about how an as-yet-unrecognized super-factor might covary with intelligence, ideology, and personality type. If so, what is that prior?

    • I just know that the fact that factors are orthogonal implies that if category factors are correlated, overall factors will include contributions from multiple categories.

  • stevesailer

    “A Nobel prize worthy level of seminality, or more.”

    I mean, just think of all the Nobels that have been given out to researchers on the general factor of intelligence.

    • Surely the general factor of intelligence was worthy of a Nobel.

      • The general factor of ideology explains why the Nobel prize hasn’t been awarded to the discoverer of the general factor of intelligence.

      • Surely the general factor of intelligence was worthy of a Nobel.

        Then, perhaps, Democritus should receive the Nobel for atomic theory.

        Spearman’s g was an interesting idea, and I’m personally inclined to think it is true (for the record), but Spearman didn’t prove it, even if he thought he did. As late as the 1970s, Horn — as had his mentor R.B. Cattell, arguably the greatest IQ researcher of recent times — denied the theory of g, maintaining that there were two g-like factors, which although correlated were not explainable by a higher-order factor.

        The reigning Cattell-Horn-Carroll IQ model accepts g, but de-emphasizes it relative to group factors at the second stratum. (I’m inclined to think it emphasizes g too little.) Carroll hesitated before accepting g, being unsure about whether the massive amount of evidence he reviewed justified it.

  • zarzuelazen

    I already know them Robin. In fact I now understand the super-factors not only for humans, but for *all* minds in general. Intelligence is constructed of 3 super-factors – the super-factors are the math/logic skills that directly correlate to the following 3 outputs:
    (1) predicates
    (2) probability distributions and
    (3) categories ,
    Personality is constructed of 3 super-factors – the super-factors are the values/ styles that directly correlate to the following 3 outputs:
    (1) goals
    (2) decisions
    (3) signs/signals
    Finally, ideology is constructed of 3 super-factors – the super-factors are the values /styles that directly correlate to the following 3 outputs
    (1) memes
    (2) projects
    (3) narratives
    Please direct all enquiries from the Nobel prize committee to

    • arch1

      Hmm. I think I did see a mysterious note from Stockholm asking whether the above list is complete (it may have included a followup, will let you know if I can unearth it …:-)

  • HM

    My impression is that factor analysis has been used less in the economics literature. For example, I haven’t seen factor decompositions in the trade literature to analyze sources of comparative advantage, and not much in the institutions literature to find a few dimensions among the large number of institutional measures.

    Is this your impression as well, and if not, what are your favorite references?

  • Simon Jester

    So what happens when you do this analysis and realize you’ve rediscovered race?

    • NameWithheld

      That’s taboo.


    In Western Europe social scientists and statistical agencies have increasingly noticed “highest level of completed education” as a super factor. They basically paint a picture of two separate worlds (vastly different rates of health levels, life expectancy, media preferences, dating preferences, etc…) within society: everyone who at least got a bachelor’s degree at a college/university (note that college tends to be demarcated a bit differently from the US though) and everyone who didn’t, a roughly 50/50 divide in the near future. Of course this isn’t the whole story: separating people for 4 or more of their formative years will inevitably create two separate worlds, no matter what you base the separation on, but I think it qualifies as a super factor that would turn up through unsupervised learning on a super dataset. It is indeed intriguing to think what other super factors might turn up. Although I really hope the people who would interpret these super factors would realize that there are a lot of self-fulfilling prophecies among them.

    • A deeper data analysis might support this guess, but I want to see that analysis.
