MALBEC: Jerry Zhu, Michael Coen, how to say snake in gibbon

Jerry Zhu will give the last MALBEC seminar of the year tomorrow (Wednesday) afternoon, at 4pm, in Van Vleck B102:

Jerry Zhu (UW, computer sciences)

HAMLET (Human, Animal, and Machine Learning: Experiment and Theory)

Machine learning studies the principles governing all learning systems. Human beings and animals are learning systems too, and can be explored using the same mathematical tools.  This approach has been fruitful in the last few decades with standard tools such as reinforcement learning, artificial neural networks, and non-parametric Bayesian statistics.  We take the approach one step further with some of the latest tools in machine learning, and uncover new quantitative findings.  In this talk, I will present three examples:

(1) Human semi-supervised learning. Consider a child learning animal names.  Dad occasionally points to an animal and says “Dog!” (labeled data). But mostly the child observes the world by herself without explicit feedback (unlabeled data).  We show that humans learn from both labeled and unlabeled data, and that a simple Gaussian mixture model trained using the EM algorithm provides a nice fit to human behavior.

(2) Human active learning.  The child may ask “What’s that?”, i.e., actively selecting items whose labels to query.  We show that humans are able to perform good active learning, achieving the fast exponential error convergence predicted by machine learning theory.  In contrast, when passively given i.i.d. training data, humans learn much more slowly (polynomial convergence), also as predicted by learning theory.

(3) Monkey online learning.  Rhesus monkeys can learn a “target concept” in the form of a certain shape or color.  What if the target concept keeps changing?  The adversarial online learning model provides a polynomial mistake bound.  Although the monkeys perform worse than the theory predicts, anecdotal evidence suggests that they follow the concepts better than some graduate students.

Finally, I will speculate on a few lessons learned in order to create better machine learning algorithms.
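
To make example (1) concrete, here is a toy sketch, mine and not Zhu’s actual experimental model, of semi-supervised learning with a two-component Gaussian mixture fit by EM: a few labeled points have their component memberships clamped, while the hundreds of unlabeled points get soft assignments that still shape the fit. All the data here are synthetic.

```python
# Toy semi-supervised Gaussian mixture fit by EM (illustration only,
# not Zhu's actual model). A few labeled points have their component
# memberships clamped; the unlabeled points get soft assignments.
import numpy as np

rng = np.random.default_rng(0)

x_lab = np.array([-2.1, -1.8, 2.0, 2.3])   # labeled data ("Dog!")
y_lab = np.array([0, 0, 1, 1])             # the labels Dad supplied
x_unl = np.concatenate([rng.normal(-2, 1, 200),
                        rng.normal(2, 1, 200)])  # unlabeled observations
x = np.concatenate([x_lab, x_unl])

mu, sigma, w = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])

def gauss(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(50):
    # E-step: soft responsibilities for every point...
    r = w * np.stack([gauss(x, mu[k], sigma[k]) for k in range(2)], axis=1)
    r /= r.sum(axis=1, keepdims=True)
    # ...except the labeled points, whose memberships are clamped.
    r[:len(x_lab)] = np.eye(2)[y_lab]
    # M-step: re-estimate means, spreads, and mixing weights.
    n = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / n
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n)
    w = n / n.sum()

print(mu, sigma, w)  # should recover components near -2 and 2
```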

In the third MALBEC lecture, Michael Coen talked about his work on clustering; he asked me afterwards whether I thought the talk was “mathy enough” for the audience, which was funny, because I thought it was 100% math from start to finish!  Here’s a cartoon of the main idea.  When presented with a giant set of data points, one of the first things you might want to do is cluster it:  that is, partition the points into some disjoint collection of subsets, each one of which consists of points which resemble their clustermates more than they do the points in the other clusters.  You might, for instance, want to identify clusters among U.S. legislators, or images, or gene expression patterns. As is so often the case, Cosma Shalizi supplies a good, succinct introduction to the topic from a statistician’s perspective.
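
If you have never actually run one of these, the basic step is nearly a one-liner. Here is a minimal sketch using scikit-learn’s k-means on synthetic 2-D blobs (legislators, images, and gene expression patterns would of course live in far more dimensions, and k-means is just one clustering algorithm among many):

```python
# Minimal clustering sketch: k-means on three synthetic 2-D blobs.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Three blobs standing in for "points that resemble their clustermates".
points = np.vstack([rng.normal(c, 0.5, size=(100, 2))
                    for c in [(0, 0), (4, 0), (2, 3)]])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(points)
print(labels[:10])  # the cluster index assigned to each point
```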

How do you know when your clustering algorithm is good?  Sometimes there’s a natural way to evaluate; if your algorithm for clustering legislators reliably separates Democrats from Republicans, you’re probably doing something right.  But with other data, you might not have any pre-existing classification that helps you gauge the reasonableness of your clustering.  Let’s say, for instance, you have lots of short recordings of a gibbon; maybe you think that rather than being scattered structurelessly around the space of 1-second sound clips, they fall into a small finite set of clusters, which you would naturally be tempted to call phonemes. You can run a clustering algorithm on the clips, and you’ll get an answer.  But is it meaningful?  It’s hard to tell without a population of clips which are classified in advance.  Unfortunately, there’s no corpus of written gibbon texts which you can ask gibbons to read aloud.  So you have to do something else.

The trick, as Coen observes, is to replace the difficult and not-very-well-defined question “Is clustering X good?” with the much more tractable question “Are clusterings X and Y similar?”  Coen presented a really nice, principled way of answering this latter question, which allows him to do something like the following:  given your set of audio clips, apply your clustering algorithm separately to two random samples of 50% of the data points.  These two samples will overlap in around 25% of the data.  Now you can use Coen’s measure to compare the two clusterings induced on this 25% subsample.  If you do this a lot, and you always get two clusterings which are almost exactly the same in Coen’s sense, that’s a good sign that your clustering algorithm is actually capturing real features of the data.
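
Here is a sketch of that stability check. Coen’s own similarity measure is precisely the part that’s unpublished, so the standard adjusted Rand index stands in for it below; the 50%/50% subsampling recipe is the one just described.

```python
# Stability check for a clustering algorithm, following the recipe above.
# The adjusted Rand index is a stand-in for Coen's (unpublished) measure.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def clustering_stability(points, k, trials=20, seed=2):
    rng = np.random.default_rng(seed)
    n, scores = len(points), []
    for _ in range(trials):
        # Two independent random samples of half the data each...
        a = rng.choice(n, size=n // 2, replace=False)
        b = rng.choice(n, size=n // 2, replace=False)
        # ...which overlap in roughly a quarter of the points.
        common = np.intersect1d(a, b)
        la = KMeans(n_clusters=k, n_init=10).fit_predict(points[a])
        lb = KMeans(n_clusters=k, n_init=10).fit_predict(points[b])
        # Restrict both clusterings to the shared points and compare.
        pos_a = {idx: i for i, idx in enumerate(a)}
        pos_b = {idx: i for i, idx in enumerate(b)}
        scores.append(adjusted_rand_score(
            [la[pos_a[i]] for i in common],
            [lb[pos_b[i]] for i in common]))
    return float(np.mean(scores))  # near 1.0: the clusters look real
```

On the well-separated blobs from the earlier sketch, clustering_stability(points, 3) should come out near 1; on structureless noise it should bounce around near 0.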

So it turns out that gibbon utterances really do seem to be organized into phonemes.  (A cursory google search suggests that this contradicts conventional wisdom about primate vocalization — can any primatologists weigh in?)  Once you have this finding, and the ability to classify the sound clips, you can do some remarkable things:  for instance, you can look at what combinations of phonemes gibbons emit when a snake comes by.  It turns out that the vocalization elicited by a snake isn’t a consistent combination of phonemes, as it would be in a human language.  Rather, you can write down a finite state automaton, any one of whose outputs seems to be a legitimate gibbon word for “snake”!
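
Since the real automaton stays under wraps until the paper is out, here is a purely illustrative toy, with invented phoneme labels, of what “a finite state automaton, any one of whose outputs is a legitimate word” means: one small machine accepting infinitely many distinct phoneme strings, every one of them “snake”.

```python
# A toy finite state automaton (invented, NOT Coen's): every accepted
# phoneme string counts as the same "word". Here "a", then any number
# of "b"s, then "c" is a legitimate "snake".
SNAKE_FSA = {          # state -> {phoneme: next state}
    0: {"a": 1},
    1: {"b": 1, "c": 2},
    2: {},             # accepting state
}

def is_snake_call(phonemes, accept=frozenset({2})):
    state = 0
    for p in phonemes:
        if p not in SNAKE_FSA[state]:
            return False
        state = SNAKE_FSA[state][p]
    return state in accept

print(is_snake_call("abc"), is_snake_call("abbbbc"), is_snake_call("cab"))
# True True False -- "abc", "abbc", "abbbc", ... are all "snake"
```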

Coen had a picture of the automaton on a slide, which is truly cool, but which he is keeping to himself until the paper’s published.  I promise to tell you exactly how to say “snake” in gibbon in a later post.


8 thoughts on “MALBEC: Jerry Zhu, Michael Coen, how to say snake in gibbon”

  1. Eric Hsu says:

    Wow, what a cool lecture series! I wish I could attend them. Maybe I’ll try to put something similar together in SF…

  2. Cosma says:

    I don’t suppose there’s a paper out already from Coen on this? It sounds like a very neat example, but unfortunately I can’t find anything online.

  3. JSE says:

    I think the relevant papers are “Comparing Clusterings” (w/ Ansari) for the general method, and “On the Universality of Phonology in Primates” (w/ Raimy, Dassow, Clarke) for the gibbon lexicon; both are listed as “in preparation” so we’ll just have to wait.

  4. Cosma says:

    Hopefully they’ll be out by September when I’ll be teaching clustering again…

  5. Do you know if the collection of gibbon words for snake varies by population? Were the samples taken from some particular population? (I’m thinking of the comparable human situation; it doesn’t really make sense to talk about saying “snake” in human.)

  6. JSE says:

    I asked Michael this same question. The original dataset is from a single wild population in Thailand; there is work in progress to answer the question you ask.

  7. […] Among the latter: Jordan’s in Slate on the flu: why is it so hard to figure out how many people “die of” the flu every year? A few days ago he learned how to say snake in gibbon. […]

  8. Z says:

    Surely the collection of gibbon words must vary. The alternative would seem to imply that gibbons have a hardwired knowledge of snakes. Though not impossible (primates do show a strong instinctual aversion to snakes), this is going very far in the direction of Fodor’s style of semantics.

    Interestingly, I think both what (some linguists think) we know about human languages and this experiment support the hypothesis that the “speech production” (for lack of a better word) of gibbons related to snakes might not be “words” in the sense of lexical items but syntactic structures. For instance, babies recognize basic syntactic structures much earlier than they recognize lexical content (concretely, they are able to distinguish “lexical-free” words such as “of” from “lexical” words such as “cat” as early as 2 to 3 months old, whereas they will not understand the meaning of “cat” until maybe a year later). One could imagine that primates have developed a communication system based on acquired reactions to some syntactic structures: in a given gibbon tribe, whenever a gibbon produces a syntactic structure of a certain type, it means a snake is coming. Other gibbons recognize the syntactic structure with a given hardwired parser (hence the finite automaton property), much in the same way that human beings know in an eye-blink that in the sentence “He thinks Matthew Emerton wears invisible yellow horns”, “he” is not “Matthew Emerton”, even though the semantics of the sentence are quite unclear (an interesting point is that one can show easily enough that this property cannot be captured by a finite automaton, so this would seem to distinguish gibbons and humans in that respect).

    From (what I know of) what we know of language acquisition, this kind of communication would require much less brain power, and hence could evolve much more easily, than the apparently simpler system of labeling objects of the external world with phonemes (“snake” refers to snake).
