Mathematicians becoming data scientists: Should you? How to?

I was talking the other day with a former student at UW, Sarah Rich, who did degrees in both math and CS and then went off to Twitter.  I asked her:  so what would you say to a math Ph.D. student who was wondering whether they would like being a data scientist in the tech industry?  How would you know whether you might find that kind of work enjoyable?  And if you did decide to pursue it, what’s the strategy for making yourself a good job candidate?

Sarah exceeded my expectations by miles and wrote the following extremely informative and thorough tip sheet, which she’s given me permission to share.  Take it away, Sarah!



This is the thousandth post.

I was going to use this space to give you some statistics, maybe make a Wordle, etc., but I couldn’t figure out how to get WordPress to give me the relevant statistics.

So let’s just say I’ve written a lot of stuff on this blog.  No way is the mean post less than 200 words long, so let’s say close to a thousand print pages.  And I’m really, really glad.  I know lots of people think the blog is dead and we’re all to fling aphorisms at each other on Twitter and Facebook instead.  I love aphorism-flinging, but, for me, blogging sits in a kind of perfect sweet spot; “published” enough that I feel someone’s out there reading, informal enough that I don’t mind making mistakes, short enough that I can bang out a post without compromising a workday, long enough that I can shape an argument that’s not just an aphorism.  Writing this blog, and reading other people’s blogs, has enriched my published writing and my mathematics too.  And I think in some small way it’s been useful to others — the blog has been cited at least 4 times on the arXiv!  That’s more than plenty of my papers.

I don’t care if the blog is dead — if you’re on the fence about starting one, I say you should do it.

A few notes:

  • My most popular post, by a mile, was my post alerting the community to Mochizuki’s claimed proof of ABC, which was linked to by several big sites like Hacker News.  It’s been viewed over 50,000 times.  The next most popular was a post about a hiring controversy in math that I won’t link to because the matter is long settled to everyone’s satisfaction.   Next was a post sharing an anonymous account of treatment at a halfway house which is believed to be by David Foster Wallace.  In fact, of the 10 most popular posts, 7 are about math, 2 are about David Foster Wallace, and the remaining one is Is There Life After Potty Power? which, based on my search logs and the comments, gets a lot of views from people who, after hundreds of viewings, have developed a romantic attachment to the star of a toilet-training video.  
  • From this you should get the basic idea — people like the math posts a lot and the literature posts a fair amount.  And nobody cares about the Orioles at all.
  • When I was considering starting this blog, I asked David Carlton, who’s been doing it much longer, what the secret was to keeping up a blog and not letting it die out.  “Low standards,” he told me.  What he meant:  to blog you have to be willing to write things that are inarticulate, or not fully-thought-through, or which still have pieces missing; otherwise blog entries (like some math papers!) end up languishing, invisible and unfinished, forever.  I think it would be better for math if those messy and partial ideas were more public than they are, and I think one way for this to happen is for more mathematicians to blog.  And to have low standards.

David Lynch/parenting protip

Don’t watch “Inland Empire” while holding your baby.  Your baby won’t mind, but if you watch a David Lynch movie for ten minutes and then look down at your baby, your baby’s face will freak you out.

Tips for giving talks

Ravi suggested that I should give a stable bloghome to this short .pdf of tips for giving math talks, which I wrote a few years ago for our graduate student conference in number theory.  It’s aimed at people giving their very first seminar talks.  Readers, please add in comments the advice I forgot to put on the tip sheet!

Update: A reader helpfully points out that I basically already wrote this post, less than a year ago, and linked to the same tip sheet.  Sorry!  I have a little baby!  I’m sleepy and I forget things!  Anyway, the link in the old post was dead so at least the repost serves the purpose of making the tip sheet stably available.

What to do in talks

Jason Starr, in comments, makes the excellent point that listing good things to do in talks helps society more than listing bad things.  Here’s his list:

  • Tell a joke.
  • Answer good questions from the audience.
  • Give a simple example before giving a difficult one.
  • Explain some history.
  • Explain why some famous problem is hard.
  • Break the chalk so it doesn’t squeak.
  • Repeat a soft-spoken audience member’s question / remark so everybody hears it.

Good stuff!  In the same spirit, here’s a tip sheet I wrote a few years ago for grad students giving talks at the Graduate Student Conference in Number Theory.  The first tip on the list is “Tell a story.” and I stand by that placement.

Contribute your own must-do’s in comments!

What not to do in talks

Fearing Tavern’s comment on the previous post links to his exhaustive list of “things not to do in a conference talk”:

  • use clichéd expressions
  • include unnecessary equations
  • read paragraphs from slides
  • animate unnecessarily (vis-a-vis powerpoint)
  • get caught lying
  • read formulas out loud
  • ignore your time requirement
  • cry (or sound as though you will)
  • assume nontrivial background knowledge
  • mispronounce people’s names
  • use a separate laptop from previous speaker (often causes technical difficulties)
  • forget to conclude
  • change notation
  • use .AVI movies (or anything else specifically Windows)
  • pander to famous audience members
  • be afraid to ask for clarification on audience member questions

I think I’ve done seven of these, though I won’t reveal which.  What about you?

Sex advice from mathematicians

Nerve.com finally consults the real experts.

