This is one of those topics near and dear to my heart as well! (Though I try not to talk about it too often, since it’s at the edge of topical for Lobsters.) Enneagram and MBTI are both widely criticized because their authors invented archetypes and then tried to force people into them. The Big Five/OCEAN model, by contrast, is derived from factor analysis of large cohorts. The Open-Source Psychometrics Project is a great collection of both tests and cohort data available for this sort of analysis.
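To make the contrast concrete, here is a minimal sketch of the idea behind factor analysis, using synthetic questionnaire data (the item loadings and sample size are made up for illustration): the latent trait is never observed directly, but it shows up as a dominant eigenvalue of the item correlation matrix, i.e. the structure is extracted from the data rather than imposed as an archetype.

```python
import numpy as np

# Toy illustration of the factor-analysis idea behind Big Five-style models.
# We never observe the trait itself, only item responses it influences.
# (Synthetic data; loadings and sizes are arbitrary choices for the demo.)
rng = np.random.default_rng(0)

n_people = 5000
latent = rng.normal(size=n_people)             # hidden trait, one per person
loadings = np.array([0.9, 0.8, 0.7, 0.6])      # how strongly each item tracks it
noise = rng.normal(size=(n_people, 4))
items = latent[:, None] * loadings + noise     # observed questionnaire answers

# One dominant factor falls out of the correlation structure of the answers.
corr = np.corrcoef(items, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
print(eigvals)  # first eigenvalue is much larger than the rest
```

Real instruments use more items and rotation methods, but the core move is the same: let the covariance of many responses reveal however many factors the data supports.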
Wittgenstein would hopefully remind you that words are cultural artifacts, rather than elements of truth. For example, if we took English or other gendered languages seriously, we might be fooled into thinking that there are somewhere between two and four chromosomal arrangements, rather than over a dozen.
I’m disappointed that you didn’t clearly enunciate that psychometrics is navel-gazing bullshit used to reinforce bigotry. The common thread between these psychological models is that they’re all bogus.
You are correct. I saved Wittgenstein for another essay, mainly because I thought it was best to first do an extended riff on these models and why they’re BS without getting into Language Games. This is because (I believe) we coders can grasp the wrongness of these models, and their relationship to AI, much more easily than we can Wittgenstein. Once we grok that, I can backtrack and elaborate on the points you made. Thanks.
With science you say: “A causes B by mechanism C.” Coming up with C is what makes science creative. With statistics, all you can say is “sometimes A and sometimes B,” and then, with enough cherry-picking of data, “A is strongly correlated with B.” When the explanation for the mechanism C is absent, prejudice and appeals to popularity fill that role. Probably the biggest mistake of statistics is to make the average seem desirable or normal, and to make quasi-rational decisions appealing.
There’s an interesting nugget of knowledge here that I’m not sure is clear to the average reader.
We believe, and so far it’s been true, that science is mathematical, i.e., A → B because of C. This means there is a structural, causal reason for A to imply B, and that’s C. We start with A and B, then we infer C. Later we use C to make even more guesses. The more our guesses hold up, the more confident we are that we have C explained well.
However, we start with the statistics. An apocryphal apple fell on Newton’s head. Things fall down. Everywhere he looked, things fell down. He was able to generalize from that a theory of gravity, that is, a formula that described how gravity worked.
But he did not explain gravity, he simply modeled it mathematically. The stats led to the math which was reproducible with observation. We still don’t know the exact mechanism. The math worked very well up until Einstein. Einstein taught us that gravity wasn’t a relationship between objects, it was a curvature of spacetime. Newton’s math and terms weren’t broken. It all continued to work as he described and assumed, but we had new definitions and equations that described and explained things at a much finer level of detail, using a lot of concepts Newton wouldn’t have known, such as the irreducible speed of light. We still didn’t know the mechanism of gravity, but we had better definitions and equations that worked in more places.
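For reference, the two stages described above look like this (standard textbook forms, not from the original comment): Newton’s law predicts the force without giving a mechanism, and general relativity later recast the same observations as curvature, with the speed of light appearing as a new ingredient.

```latex
% Newton: a predictive model of the force, no mechanism offered
F = G \, \frac{m_1 m_2}{r^2}

% Einstein: the same observations recast as spacetime curvature
G_{\mu\nu} = \frac{8 \pi G}{c^4} \, T_{\mu\nu}
```

Note that Newton’s equation is recoverable from Einstein’s as the weak-field, slow-motion limit, which is why the older math “all continued to work as he described.”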
The stats lead us to the math which leads us to the (proposed) structure. The stats are definition-less; they’re just numbers we’ve observed using whatever instrumentation we have handy. The math associates those numbers with some vague words (“fall”, and “down”). The structure connects those words with other structures. By connecting up the definitions of the words into bigger and bigger webs, we get finer-tuned models that are applicable in more places.
None of that is how humans communicate. There’s a great scene in “The Wire” where two detectives examine a crime scene and have a meaningful conversation using just the f-word. We just make up whatever works, and as long as it works for the purposes of our immediate conversation, we’re good. That means that any system that wants to connect loose conversational text to some overall structure of reality is either going to have to be subjective, or so obscure, esoteric, and infinitely detailed that we couldn’t understand it. The conclusion outlined in the essay is that AI needs to be subjective and relationship-based. I reached that conclusion because that’s the way humans have evolved, and it seems to be the way we solve problems. YMMV.
ADD: There was a great essay that covered this point that I almost referenced in the essay: https://medium.com/swlh/all-models-are-wrong-does-not-mean-what-you-think-it-means-610390c40c9c
I feel that these points could be made more precisely. In particular, correlation is not transitive, but causation is transitive; this creates a barrier between correlation and causation.
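A concrete illustration of the non-transitivity (my construction, not from the thread): let A and C be independent, and let B be caused by both. Then A correlates with B, and B correlates with C, but A and C remain uncorrelated, so correlation fails to chain the way causation would.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

a = rng.normal(size=n)   # A and C are independent...
c = rng.normal(size=n)
b = a + c                # ...and B is caused by both.

def corr(x, y):
    """Pearson correlation between two samples."""
    return np.corrcoef(x, y)[0, 1]

print(corr(a, b))  # ≈ 0.71 (exactly 1/sqrt(2) in expectation)
print(corr(b, c))  # ≈ 0.71
print(corr(a, c))  # ≈ 0.00 -- the correlation did not carry across
```

Causal chains don’t break this way: if A causes B and B causes C, then (absent cancellation) intervening on A moves C, which is the barrier between the two notions.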
They have been. If you’re interested, I highly recommend reading Pearl’s Causality or Koller and Friedman’s Probabilistic Graphical Models. “Inferring causation from time series in Earth system sciences” is a great paper on the topic we’ve been discussing in this thread.
Ooohh. This is really good. Many thanks.
I’m making a local copy for my notes.