I have a piece in Slate today about the classification of personality disorders in the new DSM, and the NRC graduate school rankings. OK, they don’t really let me mention finite metric spaces in Slate. But that’s what’s going on behind the lines, and it’s a problem I’ve been wrestling with. Let’s say you have a finite metric space M; that is, a finite set of points with assigned distances between them. Now there’s a whole world of algorithms (multidimensional scaling and its many cousins) to embed M in a Euclidean space of some reasonably small dimension without messing up the metric too much. And there’s a whole world of hierarchical clustering algorithms that embed M in the set of leaves of a tree.
But I don’t really know a principled way to decide which one of these things to do.
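Just to make the two options concrete, here’s a little sketch of both applied to the same toy metric space, using scikit-learn’s MDS and SciPy’s hierarchical clustering (the four-point metric and all the parameter choices here are my own illustration, not anything canonical):

```python
import numpy as np
from sklearn.manifold import MDS
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

# A toy finite metric space: 4 points with a given distance matrix.
D = np.array([
    [0.0, 2.0, 4.0, 4.0],
    [2.0, 0.0, 4.0, 4.0],
    [4.0, 4.0, 0.0, 2.0],
    [4.0, 4.0, 2.0, 0.0],
])

# Option 1: embed M in the Euclidean plane by multidimensional scaling.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
X = mds.fit_transform(D)  # one point in R^2 per element of M

# Option 2: embed M in the leaves of a tree by hierarchical clustering;
# the cophenetic distances are the tree metric the dendrogram imposes on M.
Z = linkage(squareform(D), method="average")
coph_corr, coph_dists = cophenet(Z, squareform(D))

# Each embedding distorts D in its own way; "stress" measures the first
# distortion, cophenetic correlation the second -- but the two numbers
# aren't on a common scale, which is part of the problem.
print("MDS stress:", mds.stress_)
print("cophenetic correlation:", coph_corr)
```

Both runs give you some measure of how badly the metric got messed up, but the measures aren’t comparable, which is exactly why picking between the plane and the tree feels unprincipled.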
Stuff there wasn’t room for in the piece: I should have mentioned Ian Hacking’s book Mad Travelers, which gives a very rich humanistic account of the process by which categories of mental illness are generated. And when I talked about the difficulty of crushing a finite metric down to one dimension, I should have linked to Cosma Shalizi’s “g, a statistical myth.”