If I were going to name the given factors, they would be (1) HA = health awareness, (2) EFF = expensive family formation, and (3) C = conscientiousness.
HA is the least interesting to me from a modeling perspective because it appears to be significantly "cultural" in a kind of geographically arbitrary way. However, from the perspective of "changing your mind and behaviors to get a better outcome" it seems like health awareness is the place to focus.
EFF makes sense in terms of being liberal/educated/urbanized and geographically it appears to be happening in areas where the cities are crammed together or pushed against a border, a great lake, or an ocean. If "housing costs" were taken into account I'd expect it to show up as a factor because as housing costs go up, family formation is more expensive, new humans are harder to make, more attention is paid to investing in the relatively less numerous kids, and you need a paying job in order to afford to stay there during retirement.
C looks like a difference between "ice people" and "sun people" to me. The issues that contribute to the "conscientiousness" label are church, crime, and school completion. I wouldn't be surprised if the geographic distributions have a lot to do with seasonal affective disorder and snow. Neither is available to contribute to the factor, but I'd predict that they would be part of it if they were. Another good candidate would be per capita hours of air conditioning.
Louisiana and Mississippi are cheap places with sun people. California is expensive with sun people. Montana is cheap with ice people. The only combination that doesn't exist is top quartile in both expense and coldness, but Wisconsin and Michigan are examples that are close to that combination.
Interpreting EFF and C as opposite ends of the same "farmer-forager" axis seems sloppy to me. States like Montana and California fit the single-axis model with their opposite extremes, but the states that are high or low in both EFF and C (like Michigan or Mississippi) give the lie to the single-axis model.
When you put up something like those maps, how about doing it so that the key isn't unreadably small?
I always like where Mississippi ends up on these studies: the worst, last, or lowest of whatever positive attribute is being measured.
As Daniel Patrick Moynihan used to say, the easiest way to improve your state's ranking is to tow it up close to the Canadian border.
I see "White percentage, Black percentage," but I don't see any reference to Hispanic, Latino, Ethnicity or the like. You do realize there are more Latinos in the U.S. than African-Americans?
"factor 2 seems about left (~forager) idealism – fruit, women’s rights, safety rules, helping the poor, and spending lots on docs and teachers — and factor 3 seems about right (~farmer) idealism — rural, religious, low crime, sexual restraint, make real stuff, finish what you start."
Factor 2 seems to me to be better summed up as "affluence."
Factor 3 is really particular to "Yankee" farmers. The South is very rural as well but has a very different kind of rural culture even though it has some of the same values.
The set of variables also seems heavy on ranked outcomes relative to unranked cultural differences that could explain the differences (e.g. predominant and second most common religious affiliation, religious diversity v. homogeneity, dialect area, hierarchy indicators, political affiliation). Similarly, it would be interesting to see how these factors compare to some of Putnam's social capital indexes and some of Florida's creative class indexes.
"(~forager) idealism – fruit, women’s rights, safety rules, helping the poor, and spending lots on docs and teachers — and factor 3 seems about right (~farmer) idealism — rural, religious, low crime, sexual restraint, make real stuff, finish what you start."
Keep in mind that this doesn't work well with the story you've been telling since you've been framing it as Farmer vs. Forager. For these two things to be separate factors in a factor analysis, they have to be (relatively) independently varying. If you want to interpret the results this way, you should change your story to allow for societies to be high on both kinds of idealism at the same time.
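To make the independence point concrete, here is a rough sketch on made-up data. It assumes plain PCA on standardized columns as a stand-in for whatever extraction the paper actually used, and the 50-by-6 table is synthetic, not the state data. The scores on two different components come out uncorrelated by construction, so nothing stops a row from being high on both at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a states-by-features table (50 rows, 6 columns).
# Two hidden traits drive the columns; none of this is the paper's data.
n = 50
trait_a = rng.normal(size=n)
trait_b = rng.normal(size=n)
noise = rng.normal(scale=0.5, size=(n, 6))
X = np.column_stack([trait_a, trait_a, trait_a, trait_b, trait_b, trait_b]) + noise

# Standardize, then take eigenvectors of the correlation matrix (plain PCA,
# used here as a simple proxy for an orthogonal factor extraction).
Z = (X - X.mean(axis=0)) / X.std(axis=0)
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
scores = Z @ eigvecs[:, order[:2]]         # scores on the top two components

# Scores on different components are uncorrelated by construction.
r = np.corrcoef(scores[:, 0], scores[:, 1])[0, 1]

# Count rows that sit above average on both components at once.
high_on_both = int(np.sum((scores[:, 0] > 0) & (scores[:, 1] > 0)))

print(f"correlation between component scores: {r:.3f}")
print(f"rows above average on both: {high_on_both} of {n}")
```

If the two kinds of idealism really were opposite ends of one axis, they would load on a single component with opposite signs rather than show up as two separate ones.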
There's also no correlation between spending and educational outcomes: Washington, D.C., which spends the third most per student, has the worst outcomes, while Utah, the state that spends the least, has among the best.
So if I'm understanding correctly, the features you list for each factor are the features that are most strongly correlated with that factor. Assuming the factor is real, they would be the features that depend the most on it. You / Ken then assigned a name to that factor based on what those features had in common.
I'm asking because your explanation of factor analysis isn't exactly the same as the example at the Wikipedia page on it that you link to. Since in the analysis stage it's all just math, it's the same to say that some features drive variance or that an invisible factor drives variance and certain features are most strongly correlated with that factor. It's a different direction from which to attack the problem, though.
One would probably have to read the paper...
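For what it's worth, the "features most strongly correlated with the factor" reading can be checked numerically. The sketch below is only an illustration under assumptions: it uses PCA on standardized synthetic data rather than the paper's method, and the feature names are hypothetical. In that setting the loading of a feature (eigenvector entry times the square root of the eigenvalue) is exactly the correlation between that feature and the component scores, so ranking by loading and ranking by correlation are the same thing.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: one hidden trait drives four features to different degrees.
# Feature names and weights are hypothetical, chosen only for illustration.
n = 200
hidden = rng.normal(size=n)
names = ["feat_a", "feat_b", "feat_c", "feat_d"]
weights = np.array([0.9, 0.7, 0.3, 0.0])
X = hidden[:, None] * weights + rng.normal(size=(n, 4)) * np.sqrt(1 - weights**2)

# Standardize and extract the top principal component.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
top = np.argmax(eigvals)
v = eigvecs[:, top]
scores = Z @ v

# Loadings from the decomposition...
loadings = v * np.sqrt(eigvals[top])
# ...and the same numbers obtained as plain feature-to-score correlations.
direct = np.array([np.corrcoef(Z[:, j], scores)[0, 1] for j in range(4)])

# Features ordered by how strongly they track the component.
ranked = [names[j] for j in np.argsort(-np.abs(loadings))]

print("loadings:    ", np.round(loadings, 3))
print("correlations:", np.round(direct, 3))
print("ranked:", ranked)
```

The sign of the whole component is arbitrary, which is one reason naming a factor from its top-loading features is an interpretive step rather than part of the math.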
I am half with Aron: I don't think the list of factors is too biased towards health. But I also don't think it's surprising that health issues cluster; it's nice to confirm the correlation, but it's not surprising. More interesting is the claimed non-correlation between health and health care.
"The fact that health is the biggest factor says that health is very important, even beyond its direct benefit"
Huh? No. It means that you [he] picked a lot of features that are easily clustered as health-related and thus correlate with each other. Rather than, say, baseball stats and mineral deposits. Not surprising in a paper titled: "Essays in Health Economics: Empirical Studies on Determinants of Health."
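The selection effect is easy to reproduce on toy data. In this sketch (synthetic numbers, PCA as an assumed proxy for the factor extraction, and a hypothetical helper `top_component_share`), two hidden traits are equally strong, and the only thing that changes is how many measured columns tap each one. Whichever trait gets more columns becomes "the biggest factor."

```python
import numpy as np

def top_component_share(n_health, n_other, n_rows=500, seed=0):
    """Return (variance share of the top component,
    fraction of its squared weights that sit on the 'health' columns)."""
    rng = np.random.default_rng(seed)
    # Two equally strong hidden traits; purely synthetic.
    health = rng.normal(size=n_rows)
    other = rng.normal(size=n_rows)
    # Each measured column is one trait plus independent noise.
    cols = [health + rng.normal(size=n_rows) for _ in range(n_health)]
    cols += [other + rng.normal(size=n_rows) for _ in range(n_other)]
    X = np.column_stack(cols)

    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    v = eigvecs[:, -1]  # eigh sorts ascending, so the last one is the largest
    health_weight = float(np.sum(v[:n_health] ** 2))
    return float(eigvals[-1] / len(eigvals)), health_weight

# Same two traits, different mix of measured columns.
share_many, hw_many = top_component_share(n_health=8, n_other=2)
share_few, hw_few = top_component_share(n_health=2, n_other=8)

print(f"8 health cols: top share {share_many:.2f}, weight on health {hw_many:.2f}")
print(f"2 health cols: top share {share_few:.2f}, weight on health {hw_few:.2f}")
```

So the size of the first factor says at least as much about which variables went into the table as about how important the underlying trait is.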