What kinds of theorems can we fingerprint?

I kind of love this Notices article, “Fingerprint Databases for Theorems,” by Sara Billey and Bridget Tenner.  They make the excellent point that the OEIS is a really amazing database, not only of integer sequences, but of theorems; if you find some sequence of integers appearing in a computation you’re doing, the OEIS not only helps you guess further terms, but tells you what else is known about that sequence, and where to find it.

Billey and Tenner ask a very natural question I’d never thought of — how much of the rest of mathematical practice can be catalogued in this way?  It is a very hard problem, of course, to develop a full machine-readable database of theorems, which would allow you to ask whether whatever you were trying to prove about the Laurent phenomenon for cluster algebras (say) was already somewhere in the literature.  But Billey and Tenner argue convincingly that the perfect shouldn’t be the enemy of the good here — rather than fantasizing about ultra-intelligent oracles, why not catalog the things we can catalog?  Integer sequences are one such thing.  So are permutation avoidance patterns, which are recorded in a database Tenner maintains.

A fingerprint need not determine the object up to isomorphism.  Billey and Tenner give the example of finite graphs:  you can certainly store a big list of graphs, but checking whether some graph you encounter in nature is isomorphic to something on the list is intractably hard.  On the other hand, storing data like a list of vertex degrees might be good enough in practice to determine whether the graph you’re studying is one with theorems already attached to it.

Readers:  what are good examples of fingerprintable mathematical objects?  Representations of symmetric groups?  (Of course one can in some sense list them all, but the same is true for integer sequences — the point would be to catalog those which actually arose somewhere, and say where.)  Transitive permutation groups?  Hyperbolic 3-manifolds?

Tagged ,

8 thoughts on “What kinds of theorems can we fingerprint?”

1. Hyperbolic 3-manifolds are pretty easy to fingerprint, or as the CS kids say, “hash”. (Hashing is key to fast array-like data structures where the indices are not integers or even of any consistent type, e.g. Python’s dictionaries). Marc Culler and I use this trick in SnapPy to quickly determine if a given manifold is in a database of some 200,000 such.

The main geometric component of our hash is (an approximation to) the manifold’s volume, since there are only finitely many manifolds of any given volume. The interesting topological component is the first homology of all 2 and 3 fold covers; these often have decent sized torsion parts which are essentially random and hence great for distinguishing manifolds.

One can also try to hash finitely presented groups by looking at its finite-index subgroups, though this is only a complete invariant of its profinite completion, not the group itself.

2. Jon Awbrey says:

hypotheses non fingerprinto —

But I’m captivated by the fingerprints of finite partial functions ${f : \mathbb{N}^+ \to \mathbb{N}^+}$

3. Daniel Hast says:

The databases of admissible tuples built up during the “bounded gaps between primes” Polymath project come to mind. Of course, since the data of an admissible tuple is just a finite set of integers, it’s easy to fingerprint. Also, it’s the narrower tuples that are of interest for certain applications (like Zhang’s bound on gaps between primes), so we also have a nice criterion for which ones are interesting to catalog.

4. byesac says:

This is a cool idea. I saw a similar presentation at AMS San Diego.

5. gowers says:

One could probably produce a nice database of (at least some) theorems in extremal graph theory — ones of the form, “If G has property P then the largest possible value of f(G) is at least A/at most B.” If you made a good choice of 20 properties P and 20 parameters f and for each one wrote a short paragraph about what was known (which in many cases would be something like, “It’s trivial to see that we can take f(G)=C and that this is best possible,”) then I think you would already have a useful database. But if it was turned into a reader-editable database with nice drop-down menus for choosing the properties and parameters, I think it would be a wonderful resource, particularly for people who are thinking about extremal graph problems but are not au fait with all the literature.

Probably there are many similar “localized” databases one could imagine. (Another obvious one would be to do with subsets of finite Abelian groups.)

6. gowers says:

I said something slightly unclear above: I meant that for each *pair* (P,f) you write a short paragraph — so 400 short paragraphs in all if there are 20 of each.

7. Michael Giudici says:

All groups of order up to 2000 have been determined and all but those of order 1024 are available in GAP and Magma. Magma also has all transitive groups of degree at most 32 and all primitive groups of degree less than 4096.

8. S. Carnahan says:

Gerald Höhn hosts the database of vertex operator algebras and modular tensor categories. Given the large gap between what we expect to be true and what we know to be true in these particular fields, it might be good to add a lot of “folklore conjecture” data in addition to the usual references to what is proved.