I just had the extremely enjoyable experience of giving the Norbert Wiener lectures at Tufts. I’m not sure my talks lived up to the awesomeness of this poster:
Andrew Bridy, Lalit Jain, Ben Recht and I spent the weekend in Cambridge at Music Hack Day, organized by the Echo Nest and sponsored by just about every company you can think of that cares about both music and technology. We hacked in a somewhat different spirit than most of the folks there; for us, the Million Song Dataset isn’t a tool for app-building, but a playground where we can test ideas about massive networks and information retrieval.
(Re app-building: Bohemian Rhapsicord. Chrome-only.)
We’ve actually been playing with the MSD for a few weeks, and I’ll probably post some of those results later, but let’s start with what we did this weekend. We wanted to see what aspects of the rules of melody we could find in the dataset. Which notes like to follow which other notes? Which chords like to follow which other chords? If you took piano lessons as a kid you already know the answers to these questions. Which is kind of the point! When you start to dig into a giant dataset, the first thing you’d better do is check that it can tell you the things you already know.
We quickly found out that getting a handle on the melodies wasn’t so easy. The song files in the MSD aren’t transcribed from scores, and they don’t have notes: there’s pitch data, but it’s in the form of chromata; these keep good track of how the energy of a song segment is distributed across frequency bands, but they don’t necessarily correspond well to notes. (For instance, what does the chroma of a drum hit sound like?) We found that only about 2% of the songs in the sample had chromata that were “clean” enough to let us infer notes.
But here’s the good thing about a million — 2% of a million is still a lot! Actually, to save time, we only analyzed about 100,000 songs — but that still gave us a couple of thousand songs’ worth of chroma to work with. We threw out all the songs Echo Nest thought were in minor keys, and transposed everything to C. Then we put all the bigrams, or pairs of successive notes, in a big bag, and computed the frequency of each one in the sample. And this is what we saw:
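The pipeline described above — transpose to C, collect successive-note pairs, count — can be sketched in a few lines. This is not our actual hack-day code; it's a minimal illustration that assumes the notes have already been extracted from the chroma as pitch classes 0–11, with the Echo Nest key estimate given as a pitch class (0 = C):

```python
from collections import Counter

def transpose_to_c(notes, key):
    """Shift every pitch class so the song's estimated key maps to C (0)."""
    return [(n - key) % 12 for n in notes]

def bigram_counts(songs):
    """Count pairs of successive notes across all songs.

    `songs` is a list of (notes, key) pairs: notes are pitch classes
    0-11, key is the song's key estimate as a pitch class (0 = C).
    """
    counts = Counter()
    for notes, key in songs:
        transposed = transpose_to_c(notes, key)
        counts.update(zip(transposed, transposed[1:]))
    return counts

# toy example: one song already in C, one in G that gets transposed down
songs = [([0, 4, 7, 0], 0), ([7, 11, 0], 7)]
counts = bigram_counts(songs)
```

Dividing each count by the total gives the empirical frequency of each bigram in the sample.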
Pretty nice, right? The size of the circle represents the frequency of the note. C (the tonic) and G (the dominant) are the most common notes, just as they should be. And the notes that are actually in the C-major scale are noticeably more frequent than those that aren’t. The arrow from note x to note y represents the probability that the note following an x will be y; the thicker and redder the arrow, the greater the transition probability. These, too, look just as they should. The biggest red arrow is the one from B to C, which is because a major seventh (correction from commenter: a leading tone) really wants to resolve to the tonic. And the strong “Louie Louie” clique joining C, F, and G is plain to see.
Once you have these numbers, you can start to play around. Lalit wrote a program that generated notes by random-walking along the graph above: the resulting “song” sounds kind of OK! You can hear it at the end of our 2-minute presentation:
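A random walk of this kind is easy to sketch: normalize the bigram counts into per-note transition probabilities, then repeatedly sample the next note from the current note's distribution. Again, this is a toy reconstruction of the idea, not Lalit's actual program:

```python
import random
from collections import defaultdict

def transition_matrix(counts):
    """Normalize bigram counts into per-note transition probabilities."""
    totals = defaultdict(int)
    for (a, _), c in counts.items():
        totals[a] += c
    probs = defaultdict(dict)
    for (a, b), c in counts.items():
        probs[a][b] = c / totals[a]
    return probs

def random_melody(probs, start=0, length=16, seed=None):
    """Random-walk along the transition graph, starting at the tonic."""
    rng = random.Random(seed)
    melody = [start]
    for _ in range(length - 1):
        nxt = probs[melody[-1]]
        melody.append(rng.choices(list(nxt), weights=list(nxt.values()))[0])
    return melody

# toy "Louie Louie" clique: C (0), F (5), and G (7) chasing each other
counts = {(0, 5): 4, (5, 7): 4, (7, 0): 4, (7, 7): 1}
probs = transition_matrix(counts)
melody = random_melody(probs, length=12, seed=1)
```

A walk trained on the real transition matrix spends most of its time on the scale tones, which is why the output sounds "kind of OK."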
Once you have this computation, you can do all kinds of fun things. For example, which songs in the database have the most “unusual” melodies from the point of view of this transition matrix? It turns out that many of the top scorers are indeed songs whose key Echo Nest has misclassified, or which are in keys (like blues scale) that Echo Nest doesn’t recognize. There’s also a lot of stuff like this:
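One natural way to make "unusual" precise — an assumption on my part, not necessarily the scoring we used — is the average log-probability of a melody's transitions under the bigram model, with a small floor probability for transitions never seen in the sample:

```python
import math

def melody_score(notes, probs, floor=1e-6):
    """Average log-probability of a melody's transitions under the
    bigram model; the lower the score, the more 'unusual' the melody.
    Transitions absent from the model get a small floor probability.
    """
    if len(notes) < 2:
        return 0.0
    logp = sum(math.log(probs.get(a, {}).get(b, floor))
               for a, b in zip(notes, notes[1:]))
    return logp / (len(notes) - 1)

# under a toy model that loves C -> G -> C, a chromatic run scores far lower
probs = {0: {7: 0.9, 1: 0.1}, 7: {0: 1.0}}
tame = melody_score([0, 7, 0, 7], probs)
weird = melody_score([0, 1, 2, 3], probs)
```

Songs in a misclassified key score badly for exactly the reason you'd expect: after transposing by the wrong amount, every transition lands somewhere the model considers improbable.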
Not exactly “Louie Louie.” Low scorers often sound like this Spiritualized song, with big dynamic shifts but not much tonal straying from the old I-IV-V (and in this case, I think it’s mostly the big red I-V).
A relevant paper: “Clustering beat-chroma patterns in a large music database,” by Thierry Bertin-Mahieux, Ron Weiss, and Daniel Ellis.
Here I am talking linear algebra with Vladimir Viro, who built the amazing Music N-gram Viewer.
Note our team slogan, a bit hard to read on a slant: “DO THE STUPIDEST THING FIRST.”
I had the good luck to be in New York on Friday when David Kazhdan gave an unscheduled lecture at NYU about categorification and representations of finite groups. For people like me, who spend most of our days dismally uncategorified, the talk was a beautiful advertisement for categorification.
Actually, the first twenty minutes of the talk were apparently a beautiful advertisement for the Langlands program, but I got lost coming from the train and missed these. As a result, I don’t know whether the results described below are due to Kazhdan, Kazhdan + collaborators, or someone else entirely. And I missed some definitions — but I think I can transmit Kazhdan’s point even without knowing them. You be the judge.
It went something like this:
Let G be a reductive split group over a finite field k and B a Borel. Then C[G(k)/B(k)] is a representation of G(k), each of whose irreducible constituents is a unipotent representation of G(k). (Note: the definition of “unipotent representation” is one that I missed but it comes from Deligne-Lusztig theory.)
When G = GL_n, all unipotent representations of G appear in C[G(k)/B(k)], so this procedure gives a very clean classification of unipotent representations — they are precisely the constituents of C[G(k)/B(k)]. Equivalently, they are the direct summands of the center of the Hecke algebra C[B(k)\G(k)/B(k)]. For more general G (e.g. Sp_6, E_8) this isn’t the case. Some unipotent representations are missing from C[G(k)/B(k)]!
Where are they?
One category-level up, naturally.
(see what I did there?)
OK, so: instead of C[B(k)\G(k)/B(k)], which is the algebra of B(k)-invariant functions on G(k)/B(k), let’s consider H, the category of B-invariant perverse l-adic sheaves on G/B. (Update: Ben Webster explained that I didn’t need to say “derived” here — Kazhdan was literally talking about the abelian category of perverse sheaves.) This is supposed to be an algebra (H is for “Hecke”) and indeed we have a convolution, which makes H into a monoidal category.
Now all we have to do is compute the center of the category H. And what we should mean by this is the Drinfeld center Z(H). Just as the center of an algebra has more structure than the algebra structure — it is a commutative algebra! — the Drinfeld center of a monoidal category has more structure than a monoidal category — it is a braided monoidal category. Its Grothendieck group K_0(Z(H)) (if you like, its decategorification) is just a plain old commutative algebra.
Now you might think that if you categorify C[B(k)\G(k)/B(k)], and then take the (Drinfeld) center, and then decategorify, you would get back the center of C[B(k)\G(k)/B(k)].
But you don’t! You get something bigger — and the bigger algebra breaks up into direct summands which are naturally identified with the whole set of unipotent representations of G(k).
How can we get irreducible characters of G(k) out of Z(H)? This is the function-sheaf correspondence – for each object F of Z(H), and each point x of G(k), you get a number by evaluating the trace of Frobenius on the stalk of F at x. This evidently yields a map from the Grothendieck group K_0(Z(H)) to characters of G(k).
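In symbols — a sketch of the usual function-sheaf dictionary, written as an alternating sum over stalk cohomology since the objects in question are complexes:

```latex
\chi_F(x) \;=\; \sum_i (-1)^i \,\mathrm{Tr}\!\left(\mathrm{Frob}_x,\ \mathcal{H}^i(F)_x\right),
\qquad x \in G(k),
```

so each object F of Z(H) yields a function \chi_F on G(k), and the assignment F \mapsto \chi_F factors through K_0(Z(H)).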
To sum up: the natural representation C[G(k)/B(k)] sometimes sees the whole unipotent representation theory of G(k), but sometimes doesn’t. When it doesn’t, it’s somewhat confusing to understand which representations it misses, and why. But in Kazhdan’s view this is an artifact of working in the Grothendieck group of the thing instead of the thing itself, the monoidal category H, which, from its higher categorical perch, sees everything.
(I feel like the recent paper of Ben-Zvi, Francis and Nadler must have something to do with this post — experts?)
This heuristic has served me well at Au Pied de Cochon in Montreal, and at Le Cochon Dingue in Quebec. Wednesday night it was a winner again in New Orleans. Since some of my readers may still have a New Orleans dinner or two ahead of them, let me recommend Cochon — no more than a 15-minute walk from your special session. If you’ve just got a lunch left, their afternoon deli Cochon Butcher is also supposed to be good (and is also covered by my heuristic).
At the American Museum of Natural History with CJ and AB, I learned that the best modern thinking gives Tyrannosaurus Rex just two fingers on each hand, not three.
It must have been very demeaning to be killed and eaten by a creature making the ironic quotation gesture.
At least that’s how I felt as a kid, when every visit here included a trip to Green Acres or Magic Carpet. Miniature golf courses are just bigger and better and awesomer here than anywhere else. Today I took CJ to Golf n Stuff, where they not only have a solid golf course but go-carts, a batting cage, and bumper boats. You might think bumper boats would just be a slower, less fun version of bumper cars. But you would quickly change your tune when I told you that bumper boats have water cannons mounted on them.