Mathematicians becoming data scientists: Should you? How to?

I was talking the other day with a former student at UW, Sarah Rich, who’s done degrees in both math and CS and then went off to Twitter.  I asked her:  so what would you say to a math Ph.D. student who was wondering whether they would like being a data scientist in the tech industry?  How would you know whether you might find that kind of work enjoyable?  And if you did decide to pursue it, what’s the strategy for making yourself a good job candidate?

Sarah exceeded my expectations by miles and wrote the following extremely informative and thorough tip sheet, which she’s given me permission to share.  Take it away, Sarah!



Continue reading

Tagged , , ,

AB for President

AB was talking about being President this morning.

Me:  I think you could be a really good candidate; you’re funny, and you get along with almost everybody.

AB:  And I have great hair!

She gets it.

Braid monodromy and the dual curve

Nick Salter gave a great seminar here about this paper; hmm, maybe I should blog about that paper, which is really interesting, but I wanted to make a smaller point here.  Let C be a smooth curve in P^2 of degree n. The lines in P^2 are parametrized by the dual P^2; let U be the open subscheme of the dual P^2 parametrizing those lines which are not tangent to C; in other words, U is the complement of the dual curve C*.  For each point u of U, write L_u for the corresponding line in P^2.

This gives you a fibration X -> U where the fiber over a point u in U is L_u – (L_u intersect C).  Since L_u isn’t tangent to C, this fiber is a line with n distinct points removed.  So the fibration gives you an (outer) action of pi_1(U) on the fundamental group of the fiber preserving the puncture classes; in other words, we have a homomorphism

\pi_1(U) \rightarrow B_n

where B_n is the n-strand braid group.

When you restrict to a line L* in U (i.e. a pencil of lines through a point in the original P^2) you get a map from a free group to B_n; this is the braid monodromy of the curve C, as defined by Moishezon.  But somehow it feels more canonical to consider the whole representation of pi_1(U).  Here’s one place I see it:  Proposition 2.4 of this survey by Libgober shows that if C is a rational nodal curve, then pi_1(U) maps isomorphically to B_n.  (OK, C isn’t smooth, so I’d have to be slightly more careful about what I mean by U.)


Tagged , ,

Mark Metcalf

Have you ever heard of this guy?  I hadn’t.  Or thought I hadn’t.  But: he was Niedermeyer in Animal House

and the dad in Twisted Sister’s “We’re Not Gonna Take It” video

and the Master, the Big Bad of Buffy the Vampire Slayer season 1.

That’s a hell of a career! Plus: he lived in suburban Milwaukee until three years ago! And he used to go out with Glenn Close and Carrie Fisher! OK. Now I’ve heard of Mark Metcalf and so have you.

Tagged , , ,

Curry apple chicken

I didn’t have time to do a real shop and had nothing for Friday night dinner so I bought some chicken and a bag of apples and made something that came out surprisingly well; I hereby record it.


2-3 lb boneless chicken breasts

5 apples, cubed

some scallions

some vegetable oil, whatever kind, doesn’t matter, I used olive

1 tbsp ground coriander

1 tbsp ground cumin

1/2 tsp turmeric

1 tsp salt

however much minced garlic you’re into

1/2-1 tsp garam masala

some crushed tomatoes but you could use actual tomatoes if it weren’t the middle of winter


Get oil hot.  Throw apples and scallions in.  Stir and cook 5 mins until apples soft.  Clear off some pan space and put coriander, cumin, turmeric, salt in the oil, let it cook 30 sec – 1 min, then throw in all the chicken, which by the way you cut into chunks, saute it all up until it’s cooked through.  Put the minced garlic in and let that cook for a minute.  Then put in however much tomato you need to combine with everything else in the pan and make a sauce.  (Probably less than you think, you don’t want soup.)  Turn heat down to warm and mix in garam masala.  You could just eat it like this or you could have been making some kind of starch in parallel.  I made quinoa.  CJ liked this, AB did not.

I took the spice proportions from a Madhur Jaffrey recipe but this is in no way meant as actual Indian food, obviously.  I guess I was just thinking about how when I was a kid you would totally get a “curry chicken salad” which was shredded chicken with curry powder, mayonnaise, and chunked up apple, and I sort of wanted a hot mayonnaiseless version of that.  Also, when I was in grad school learning to cook from Usenet with David Carlton, we used to make a salad with broiled chicken and curry mayonnaise and grapes.  I think it was this.  Does that sound right, David?   Yes, that recipe calls for 2 cups of mayonnaise.  It was a different time.  I feel like we would make this and then put it on top of like 2 pounds of rotini and have food for days.



Tagged , ,

Difference sets missing a Hamming sphere

I tend to think of the Croot-Lev-Pach method (as used, for instance, in the cap set problem) as having to do with n-tensors, where n is bigger than 2.  But you can actually also use it in the case of 2-tensors, i.e. matrices, to say things that (as far as I can see) are not totally trivial.

Write m_d for the number of squarefree monomials in x_1, .. x_n of degree at most d; that is,

m_d = 1 + {n \choose 1} + {n \choose 2} + \ldots + {n \choose d}.

Claim:  Let P be a polynomial of degree d in F_2[x_1, .. x_n] such that P(0) = 1.  Write S for the set of nonzero vectors x such that P(x) = 1.  Let A be a subset of F_2^n such that no two elements of A have difference lying in S.  Then |A| < 2m_{d/2}.

Proof:  Write M for the A x A matrix whose (a,b) entry is P(a-b).  By the Croot-Lev-Pach lemma, this matrix has rank at most 2m_{d/2}.  By hypothesis on A, M is the identity matrix, so its rank is |A|.

Remark: I could have said “sum” instead of “difference” since we’re in F_2 but for larger finite fields you really want difference.

The most standard context in which you look for large subsets of F_2^n with restricted difference sets is that of error correcting codes, where you ask that no two distinct elements of A have difference with Hamming weight (that is, number of 1 entries) at most k.

It would be cool if the Croot-Lev-Pach lemma gave great new bounds on error-correcting codes, but I don’t think it’s to be.  You would need to find a polynomial P which vanishes on all nonzero vectors of weight larger than k, but which doesn’t vanish at 0. Moreover, you already know that the balls of size k/2 around the points of A are disjoint, which gives you the “volume bound”

|A| < 2^n / m_{k/2}.

I think that’ll be hard to beat.

If you just take a random polynomial P, the support of P will take up about half of F_2^n; so it’s not very surprising that a set whose difference misses that support has to be small!

Here’s something fun you can do, though.  Let s_i be the i-th symmetric function on x_1, … x_n.  Then

s_i(x) = {wt(x) \choose i}

where wt(x) denotes Hamming weight.  Recall also that the binomial coefficient

{k \choose 2^a}

is odd precisely when the a’th binary digit of k is 1.



is a polynomial of degree 2^b-1 which vanishes on x unless the last b digits of wt(x) are 0; that is, it vanishes unless wt(x) is a multiple of 2^b.  Thus we get:

Fact:  Let A be a subset of F_2^n such that the difference of two nonzero elements in A never has weight a multiple of 2^b.  Then

|A| \leq 2m_{2^{b-1} - 1}.

Note that this is pretty close to sharp!  Because if we take A to be the set of vectors of  weight at most 2^{b-1} – 1, then A clearly has the desired property, and already that’s half as big as the upper bound above.  (What’s more, you can throw in all the vectors of weight 2^{b-1} whose first coordinate is 1; no two of these sum to something of weight 2^b.  The Erdös-Ko-Rado theorem says you can do no better with those weight 2^{b-1} vectors.)

Is there an easier way to prove this?

When b=1, this just says that a set with no differences of even Hamming weight has size at most 2; that’s clear, because two vectors whose Hamming weight has the same parity differ by a vector of even weight.  Even for b=2 this isn’t totally obvious to me.  The result says that a subset of F_2^n with no differences of weight divisible by 4 has size at most 2+2n.  On the other hand, you can get 1+2n by taking 0, all weight-1 vectors, and all weight-2 vectors with first coordinate 1.  So what’s the real answer, is it 1+2n or 2+2n?

Write H(n,k) for the size of the largest subset of F_2^n having no two vectors differing by a vector of Hamming weight exactly k.  Then if 2^b is the largest power of 2 less than n, we have shown above that

m_{2^{b-1} - 1 } \leq H(n,2^b) \leq 2m_{2^{b-1} - 1}.

On the other hand, if k is odd, then H(n,k) = 2^{n-1}; we can just take A to be the set of all even-weight vectors!  So perhaps H(n,k) actually depends on k in some modestly interesting 2-adic way.

The sharpness argument above can be used to show that H(4m,2m) is as least

2(1 + 4m + {4m \choose 2} + \ldots + {4m \choose m-1} + {4m-1 \choose m-1}). (*)

I was talking to Nigel Boston about this — he did some computations which make it looks like H(4m,2m) is exactly equal to (*) for m=1,2,3.  Could that be true for general m?

(You could also ask about sets with no difference of weight a multiple of k; not sure which is the more interesting question…)

Update:  Gil Kalai points out to me that much of this is very close to and indeed in some parts a special case of the Frankl-Wilson theorem…  I will investigate further and report back!

Tagged , , ,

Vacuum cleaner on sale

I was explaining the “regular price” scam to CJ the other day. A store sells a vacuum cleaner for $79.95. One day, they put up a sign saying “SALE! Regular price, $109.95; now MARKED DOWN to $79.95.” The point is to create an imaginary past that never was, a past where vacuum cleaners cost $109.95, a difficult past from which the store has generously granted you respite.
This is what Trump’s team is doing. They’re trying to create an imaginary past in which the last 5 years of life in America was characterized by ubiquitous street crime, unchecked terrorism, and mass unemployment. So that life in America in 2017 and 2018 will seem comparatively placid, safe, and prosperous. Look how much I saved you on this goddamn vacuum cleaner. You’re welcome.


Holocaust Remembrance Day

I was always skeptical of the U.S. Holocaust Memorial Museum.  I understand why there’s a Holocaust museum in Israel, and I understand why there’s one in Berlin, and I understand why there are memorials in the places where the slaughter was carried out.  But why in Washington D.C.?

Then I went there, I got it.  The reason there’s a Holocaust Museum in America is because it’s a Holocaust Museum that’s about America.  The museum tells the story, as all Holocaust museums must, of the German slaughter of the Jews, and of gay people, disabled people, Roma and Slavs besides.  But the focus is on what happened here in America, and what didn’t happen.  The point of the museum is to ask a simple question:  what is the responsibility of a rich, safe, comfortable country to the human beings being hunted down and killed outside its borders?

OK, I’ll admit it, that’s not a simple question.  But it’s a question worth thinking about today.


Tagged ,

Bad Bayesian

Couldn’t find my phone yesterday morning.  I definitely remembered having it in the car on the way home from the kids’ swim lesson, so I knew I hadn’t left it.  “Find my iPhone” told me the phone was on the blacktop of the elementary school, about 1000 feet from my house.  What?  Why?  Then a few minutes later the location updated to the driveway of a bank, closer to my house but in the other direction.  So I went over to the bank and looked around in the driveway, even peering into the garbage shed and seeing if my phone was in their dumpster.

But why did I do that?  It was terrible reason.  There was no chain of events leaving my phone at the bank, or at the school, which wasn’t incredibly a prior unlikely.  I should have reasoned:  “The insistence of Find my iPhone that my phone is at the bank drastically increases the probability my phone is at the bank, but that probability started out so tiny that it remains tiny, and the highest-expected-utility use of my time is to keep looking around my house and my car until I find it.”

Anyway, it was in the basement.




In which South Dakota can’t actually do that

Voters in South Dakota approved a sweeping new government ethics law by referendum in November.  The South Dakota state legislature just overruled them.

Can they do that?

Yes and no.  A law passed by referendum is, in the end, a law; and laws can be repealed by the legislature.  A lot of states have protections against this practice, which is called “legislative tampering.”  South Dakota is one of 12 states that doesn’t.  So if you think the people can band together and pass a law by ballot measure there, you’re only sort of right; if the people’s will goes against the will of the legislative majority, as in this case, right out the window goes the popular vote.

But the legislature did more.  The bill, HB 1069, finishes off with the following language:

Section 35. Whereas, this Act is necessary for the support of the state government and its existing public institutions, an emergency is hereby declared to exist, and this Act shall be in
full force and effect from and after its passage and approval.

The declaration of an “emergency” does two things:  it makes the bill active immediately upon passage, thus preventing the ethics commission from coming into existence even temporarily, and it prevents the people of South Dakota from launching a new referendum to veto the repeal and restore the ethics law.

I don’t think they can do this.

Under the South Dakota Constitution, some kinds of laws are subject to veto by popular referendum and some are not.  Those laws protected from referendum are categorized as follows:

However, the people expressly reserve to themselves the right to propose measures, which shall be submitted to a vote of the electors of the state, and also the right to require that any laws which the Legislature may have enacted shall be submitted to a vote of the electors of the state before going into effect, except such laws as may be necessary for the immediate preservation of the public peace, health or safety, support of the state government and its existing public institutions.

The legislature doesn’t have the power to protect a law from referendum just by declaring an emergency.  It has to fall under one of the two protected categories delineated in Section 1 above.  This was laid out in Lindstrom v. Goetz (1951):

 Only those laws which are not subject to the referendum, according to § 1 are subject to the emergency clause authorized by § 22. State ex rel. Richards v. Whisman, 36 S.D. 260, 154 N.W. 707, L.R.A. 1917B, 1. The same rule applies to municipalities. City of Colome v. Von Seggern Bros. Ludden, 56 S.D. 390, 228 N.W. 800. Whether a law or ordinance is subject to the referendum is a judicial question. If it is found that the law is not subject to the referendum, the legislative declaration of an emergency is conclusive. If it be found that the law is subject to the referendum, the declaration of an emergency is void, for then no emergency could exist.

The Legislature, in HB 1069,  invoked the second category of protection.  That’s what I don’t think they can do.  The interpretation of what counts as “necessary for … support of state government” in South Dakota has traditionally been pretty broad, encompassing laws designed to enhance or even redistribute state revenue.  But it’s not unlimited.  Check this out, from “Restrictions on Initiative and Referendum Powers in South Dakota,” Lowe, Chip J., 28 S.D. L. Rev. 53 (1982-1983), p.61:


I don’t think you can say, with a straight face, that eliminating the independent ethics commission is an appropriation bill or a taxing measure.  So their claim dies here:  they can overrule the referendum, but they can’t prevent the public from overruling them right back.

Tagged , , ,
%d bloggers like this: