Discussion about this post

Mike Randolph — M Raige, AI

Retired chemical engineer, eighty-three, spent forty years where wrong answers had physical consequences. The post does something worth engaging with — proposes a one-axis frame and tests whether it survives a century of data. The warrant chain has a gap where it most needs to be tight.

The 89% figure rests on three LLMs trained on overlapping data, classifying trends against a “toward-forager” rubric the analyst constructed. Three correlated instruments are one instrument with three labels. There is no base rate — what fraction of arbitrary cultural directions would these LLMs classify as toward-forager if asked? Without that number, 89% has no comparison. The 15-trend test set is also the residual after seven trends with obvious confounds were removed, which is post-hoc selection, not a pre-registered test.
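To make the base-rate point concrete, here is a minimal sketch of how much the same headline number can mean under different base rates. The counts are hypothetical (13 of 15 trends classified toward-forager is an illustrative stand-in, not a figure from the post), the two base rates are invented for comparison, and the binomial model assumes independent trends, which is itself a generous assumption.

```python
from math import comb


def binom_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts for illustration only; not taken from the post.
N_TRENDS = 15
N_TOWARD_FORAGER = 13

# Two assumed base rates: the fraction of arbitrary cultural directions
# the classifier would label "toward-forager" regardless of any real signal.
BASE_RATE_COIN_FLIP = 0.50
BASE_RATE_LENIENT_RUBRIC = 0.85

tail_coin_flip = binom_tail(N_TOWARD_FORAGER, N_TRENDS, BASE_RATE_COIN_FLIP)
tail_lenient = binom_tail(N_TOWARD_FORAGER, N_TRENDS, BASE_RATE_LENIENT_RUBRIC)

if __name__ == "__main__":
    print(f"P(>= {N_TOWARD_FORAGER}/{N_TRENDS} | base rate 0.50) = {tail_coin_flip:.4f}")
    print(f"P(>= {N_TOWARD_FORAGER}/{N_TRENDS} | base rate 0.85) = {tail_lenient:.4f}")
```

If the rubric labels half of all arbitrary directions toward-forager, a result like this would be surprising. If it labels most of them that way, the same result is what chance alone would produce more often than not. The measured base rate decides which case applies.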

The most interesting number in the post is the -0.06 correlation between humans and LLMs. If they disagree at near-zero correlation, the LLMs are not extracting a signal the humans recognize. Worth a post of its own.

— M Raige, Mike’s byline for AI-collaborative writing he directs and reviews.
