Tag Archives: polls

Who does Public Policy Polling think is challenging Scott Walker?

We got a PPP robopoll today.  First of all, I want to note that the recorded voice on the phone was a middle-aged man with the worst case of vocal fry I’ve ever heard.

Anyway.

Much of the poll was of the form “If Republican Scott Walker runs for re-election against Democrat X, who would you support?”  And here are the Democrats they listed:

  • Peter Barca
  • Jon Erpenbach
  • Russ Feingold
  • Steve Kagen
  • Ron Kind
  • Mahlon Mitchell

Are these really the main Democratic contenders?

The poll went on to ask whether I had a favorable or unfavorable opinion of each of the following strange foursome:

  • Sen. Joseph McCarthy
  • Bret Bielema
  • Hillary Clinton
  • Paul Ryan

I’d kind of like to see the crosstabs on that, actually!


It’s a recall, not an omen

Already time to take back, or at least complicate, the nice things I said about the Times’s Wisconsin coverage.  Today above the fold:

Broadly, the results will be held up as an omen for the presidential race in the fall, specifically for President Obama’s chances of capturing this Midwestern battleground — one that he easily won in 2008 but that Republicans nearly swept in the midterm elections of 2010…

A Marquette Law School telephone poll of 600 likely voters, conducted last week, found Mr. Walker leading 52 percent to 45 percent; the poll’s margin of sampling error was plus or minus 4 percentage points for each candidate.

I suppose I can’t deny that the results “will be held up as” an omen for November’s election by some people.  But those people will be wrong, and the Times should say so.  At the very least they should avoid giving the impression that the recall vote is likely to be predictive of the presidential vote, an assertion for which they give no evidence, not even a quote in support.

I’m just going to repeat what I said in the last post.  Wisconsin is split half and half between Republicans and Democrats.  In nationally favorable Democratic environments (2008) the state votes Democratic.  In nationally favorable Republican environments (2010) the state votes Republican.  At this moment, there’s no national partisan wave, and you can expect Wisconsin elections to be close.  But incumbency is an advantage.  So Walker is winning, and so is Obama.  As the Times reports, the Marquette poll has Walker up 7.  What the Times doesn’t report is that the very same poll has Obama beating Romney by 8.

I guess the recall might be an omen after all: if Walker actually wins by 7, it means there’s no massive shift to the GOP going on in this state, and if you’re a broadly popular incumbent President whose hometown is within a half-day’s drive of most of Wisconsin’s population, your prospects here are pretty good.

Arguing against myself:  2006 was also a great year for Democrats nationally, and incumbent Democratic governor Jim Doyle beat Mark Green by only 7.


Should confidence intervals take interpoll variation into account?

An interesting fact I learned from Charles Franklin’s talk is that the variance among polls taken on the same day concerning the same election is about 50% greater than what you’d expect from sampling error alone; that’s because different polls use different sampling methodologies, different question phrasings, different likely-voter weightings, etc.

But when polls report a margin of error, they’re using a 95% confidence interval based on the sampling variance alone.  Should they instead be reporting larger error bars, based on what we know empirically about the extent to which poll results vary from the ground truth?
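To put rough numbers on the question, here is a minimal sketch. It assumes a simple random sample and reads the “about 50% greater” figure as a variance ratio of 1.5; the function names are made up for illustration, and the sample size of 600 is borrowed from the Marquette poll quoted above.

```python
import math

def sampling_margin(n, p=0.5, z=1.96):
    """95% margin of error from sampling variance alone (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / n)

def inflated_margin(n, variance_ratio=1.5, p=0.5, z=1.96):
    """Margin of error if the total variance is variance_ratio times the sampling variance."""
    # Standard deviations scale with the square root of the variance.
    return sampling_margin(n, p, z) * math.sqrt(variance_ratio)

n = 600  # assumed sample size
reported = 100 * sampling_margin(n)
widened = 100 * inflated_margin(n)
print(f"sampling error only: +/- {reported:.1f} points")
print(f"with interpoll variation: +/- {widened:.1f} points")
```

Under those assumptions, a 600-person poll that reports plus or minus 4.0 points would report about plus or minus 4.9 points instead.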


Raw polling data as playground

This is a picture of the American electorate!

More precisely, this is a scatterplot I just made using the dataset recently released by PPP, a major political polling firm.  (They’re the outfit that did the “is your state hot or not” poll I blogged about last week.)  PPP has made available the raw responses from 46 polls with 1000 responses each, conducted more or less weekly over the course of 2011.  Here’s the whole thing as a .zip file.

Analyzing data sets like this is in some sense not hard.  But there’s a learning curve.  Little things, like:  you have to know that the .csv format is beautifully portable and universal: it’s the ASCII of data.  You have to know how to get your .csv file into your math package of choice (in my case, Python, but I think I could easily have done this in R or MATLAB as well) and you have to know where to get a PCA package, if it’s not already installed.  And you have to know how to output a new .csv file and make a graphic from it when you’re done.  (As you can see, I haven’t quite mastered this last part, and have presented you with a cruddy Excel scatterplot.)  In total, this probably took me about three hours to do, and now that I have a data-to-picture path I understand how to use, I think I could do it again in about 30 minutes.  It’s fun and I highly recommend it.  There’s a lot of data out there.
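For anyone who hasn’t made the .csv round trip before, it can look roughly like the sketch below. This uses only Python’s standard csv module. The column names and the tiny inline table are invented stand-ins (not the PPP file), and the helper names are hypothetical. With a real file you would pass open("somefile.csv", newline="") instead of the in-memory text.

```python
import csv
import io

def read_numeric_csv(handle):
    """Read a header row plus rows of integer-coded answers as floats."""
    reader = csv.reader(handle)
    header = next(reader)
    rows = [[float(cell) for cell in row] for row in reader]
    return header, rows

def write_csv(handle, header, rows):
    """Write a header and rows back out in .csv form."""
    writer = csv.writer(handle)
    writer.writerow(header)
    writer.writerows(rows)

# Made-up stand-in for a poll file: one row per respondent.
sample = io.StringIO("Q1,Q2,Q8\n1,1,1\n2,2,2\n1,2,1\n")
header, rows = read_numeric_csv(sample)

# Keep two columns and write them out, ready for a plotting tool.
out = io.StringIO()
write_csv(out, ["x", "y"], [(r[0], r[2]) for r in rows])
print(out.getvalue())
```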

So what is this picture?  The scatterplot has 1000 points, one for each person polled in the December 15, 2011 PPP survey.  The respondents answered a bunch of questions, mostly about politics:

Q1: Do you have a favorable or unfavorable opinion of Barack Obama?
Q2: Do you approve or disapprove of Barack Obama’s job performance?
Q3: Do you think Barack Obama is too liberal, too conservative, or about right?
Q4: Do you approve or disapprove of the job Harry Reid is doing?
Q5: Do you approve or disapprove of the job Mitch McConnell is doing?
Q6: Do you have a favorable or unfavorable opinion of the Democratic Party?
Q7: Do you have a favorable or unfavorable opinion of the Republican Party?
Q8: Generally speaking, if there was an election today, would you vote to reelect Barack Obama, or would you vote for his Republican opponent?
Q9: Are you very excited, somewhat excited, or not at all excited about voting in the 2012 elections?
Q10: If passed into law one version of immigration reform that people have discussed would secure the border and crack down on employers who hire illegal immigrants. It would also require illegal immigrants to register for legal immigration status, pay back taxes, and learn English in order to be eligible for U.S. citizenship. Do you favor or oppose Congress passing this version of immigration reform?
Q11: Have you heard about the $10,000 bet Mitt Romney challenged Rick Perry to in last week’s Republican Presidential debate?
Q12: (Asked only of those who say ‘yes’ to Q11:) Did Romney’s bet make you more or less likely to vote for him next year, or did it not make a difference either way?
Q13: Do you believe that there’s a “War on Christmas” or not?
Q14: Do you consider yourself to be a liberal, moderate, or conservative?
Q15: Do you consider yourself to be a supporter of the Tea Party or not?
Q16: Are you or is anyone in your household a member of a labor union?
Q17: If you are a woman, press 1. If a man, press 2.
Q18: If you are a Democrat, press 1. If a Republican, press 2. If you are an independent or a member of another party, press 3.
Q19: If you are Hispanic, press 1. If white, press 2. If African American, press 3. If Asian, press 4. If you are an American Indian, press 5. If other, press 6.
Q20: (Asked only of people who say American Indian on Q19:) Are you enrolled in a federally recognized tribe?
Q21: If you are 18 to 29 years old, press 1. If 30 to 45, press 2. If 46 to 65, press 3. If you are older than 65, press 4.
Q22: What part of the country do you live in NOW – the Northeast, the Midwest, the South, or the West?
Q23: What is your household’s annual income?

The answers to these questions, which are coded as integers, now give us 1000 points in R^23.  Our eyes are not good at looking at point clouds in 23-dimensional space.  So it’s useful to project down to R^2, that most bloggable of Euclidean spaces.  But how?  We could just look at two coordinates and see what we get.  But this requires careful choice.  Suppose I map the voters onto the plane via their answers to Q1 and Q2.  The problem is, almost everyone who has a favorable opinion of Barack Obama approves of his job performance, and vice versa.  Considering these two features is hardly better than considering only one feature.  Better would be to look at Q8 and Q21; these two variables are surely less correlated, and studying both together would give us good information on how support for Obama varies with age.  But still, we’re throwing out a lot.  Principal component analysis is a very popular quick-n-dirty method of dimension reduction; it finds the projection onto R^2 (or a Euclidean space of any desired dimension) which best captures the variance in the original dataset.  In particular, the two axes in the PCA projection have correlation zero with each other.
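Here is a dependency-free sketch of what a PCA package does under the hood: standardize the columns, form the covariance matrix, and pull out its two leading eigenvectors by power iteration. This is an illustration, not the code behind the plot. The data are synthetic (five columns driven by two hidden traits), and every function name is made up. In practice you would call a library routine instead.

```python
import math
import random

def standardize(rows):
    """Center every column and rescale it to unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Covariance matrix of rows whose columns are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]

def top_eigenpair(matrix, iterations=1000):
    """Power iteration: dominant eigenvalue and its unit eigenvector."""
    d = len(matrix)
    v = [1.0 + 0.1 * i for i in range(d)]  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * mv[i] for i in range(d)), v

def pca_2d(rows):
    """Return the two leading axes and each row's coordinates on them."""
    z = standardize(rows)
    cov = covariance(z)
    d = len(cov)
    lam1, axis1 = top_eigenpair(cov)
    # Deflate: subtract the first component so the second becomes dominant.
    deflated = [[cov[i][j] - lam1 * axis1[i] * axis1[j] for j in range(d)]
                for i in range(d)]
    _, axis2 = top_eigenpair(deflated)
    points = [(sum(a * x for a, x in zip(axis1, r)),
               sum(a * x for a, x in zip(axis2, r))) for r in z]
    return axis1, axis2, points

# Synthetic stand-in for coded answers (NOT the PPP data).
random.seed(0)
rows = []
for _ in range(300):
    lean = random.gauss(0, 1)     # hidden left/right position
    engaged = random.gauss(0, 1)  # hidden engagement level
    rows.append([
        lean + random.gauss(0, 0.3),     # plays the role of Q1
        lean + random.gauss(0, 0.3),     # nearly redundant with the column above
        -lean + random.gauss(0, 0.5),    # an item that runs the opposite way
        engaged + random.gauss(0, 0.5),  # an engagement-type item
        engaged + random.gauss(0, 0.5),  # a second engagement-type item
    ])

axis1, axis2, points = pca_2d(rows)
print("axis 1 weights:", [round(a, 2) for a in axis1])
print("axis 2 weights:", [round(a, 2) for a in axis2])
```

The two weight vectors printed at the end are the things you stare at when you try to give each axis a meaning. The overall sign of each vector is arbitrary, so only the relative signs of the weights matter.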

A projection from R^23 to R^2 can be expressed by two vectors, each one of which is some linear combination of the original 23 variables.  The hope is always that, when you stare at the entries of these vectors, the corresponding axis has some “meaning” that jumps out at you.  And that’s just what happens here.

The horizontal axis is “left vs. right.”  It assigns positive weight to approving of Obama, identifying as a liberal, and approving of the Democratic Party, and negative weight to supporting the Tea Party and believing in a “War on Christmas.”  It would be very weird if any analysis of this kind of polling data didn’t pull out political affiliation as the dominant determinant of poll answers.

The second axis is “low-information voter vs. high-information voter,” I think.  It assigns a negative value to all answers of the form “don’t know / won’t answer,” and positive value to saying you are “very excited to vote” and having heard about Mitt Romney’s $10,000 bet.  (Remember that?)

And now the picture already tells you something interesting.  These two variables are uncorrelated, by definition, but they are not unrelated.  The voters split roughly into two clusters, the Democrats and the Republicans.  But the plot is “heart-shaped” — the farther you go into the low-information voters, the less polarization there is between the two parties, until in the lower third of the graph it is hard to tell there are two parties at all.  This phenomenon is not surprising — but I think it’s pretty cool that it pops right out of a completely automatic process.

(I am less sure about the third-strongest axis, which I didn’t include in the plot.  High scorers here, like low scorers on axis 2, tend to give a lot of “don’t know” answers, except when asked about Harry Reid and Mitch McConnell, whom they dislike.  They are more likely to say they’re “not at all excited to vote” and more likely to be independents.  So I think one might call this the “to hell with all those crooks” axis.)

A few technical notes:  I removed questions, like “region of residence,” that didn’t really map on a linear scale, and others, like “income,” that not everyone answered.  I normalized all the columns to have equal variance.  I made new 0-1-valued columns to record “don’t know” answers.  Yes, I know that many people consider it bad news to run PCA on binary variables, but I decided that since I was just trying to draw pictures and not infer anything, it would be OK.


America has spoken: Wisconsin is better than Minnesota or Illinois

So says an immensely enjoyable PPP poll, which collected approval/disapproval numbers for all 50 states.   Thanks to Steve Burt for pointing this out to me.  Here’s the full data set with crosstabs.  Great stuff here!  Young people are much more anti-Florida than is the nation as a whole.  Nevada is rejected by “very liberals” and “very conservatives” but applauded by the middle.  Everyone, whatever their politics, slightly dislikes New Jersey.

Seems an appropriate time to listen to John Linnell’s “Oregon (Is Bad)” from the ultra-classic State Songs LP.  Though Americans in fact believe that Oregon is good, by a margin of 43-14.

 


No good news for Wisconsin Democrats in the first Marquette Law Poll

My colleague Charles Franklin is running a year-long project at Marquette Law School to poll the heck out of Wisconsin in what will surely be a very interesting political environment.  The first poll is out, and it can’t be making Wisconsin Democrats very happy.  Full results here.  All potential recall challengers trail the Governor, though not by much, and the public is either positive or neutral about the most visible parts of Walker’s legislative plan (higher fees for state workers, voter ID, curtailing of collective bargaining).  Majorities think that Walker’s program will increase jobs in the state and will leave Wisconsin “better off in the long run.”  Cutting funding to public schools and BadgerCare, on the other hand, is deeply unpopular, and presumably those issues will play a big role in the recall campaign.  The Governor has access to a titanic amount of money from out of state, and will make sure people here don’t miss out on hearing his point of view.  His opponents may rise in the polls as they gain statewide name recognition, but it’s hard to see in the numbers a huge “anybody but Walker” sentiment.  On top of all that, Tommy Thompson, the only really popular Republican in the state, is going to be back on the campaign trail running for Senate.

The election is a long way away, but Democrats have to be seen as starting from behind.

My guess is that they have a better chance of capturing the State Senate (though I’m told that if Van Wanggaard is tossed, his Democratic replacement has less than a year before being redistricted into an election they’re almost sure to lose).  I wonder when Marquette starts polling the senate races?

(Note:  I was surprised to see that 43% of Marquette’s sample identified as “independent” — but it turns out that 40% of all Americans now give their party ID as independent, the highest proportion Gallup has ever recorded.)

Despite the title, I should include the one piece of good news for Democrats: the President remains popular here and seems at the moment to be well ahead of any potential opponent.


Scott Walker: not toast

Much was made of the WPR/St. Norbert poll released last week, in which 58% of respondents said they’d vote for Scott Walker’s opponent if a recall comes to pass, with only 38% saying they’d vote to keep the Governor in office.  Worth noting the numbers below the top line, though:  in the sample of 482 voters, 34% reported voting for JoAnne Kloppenburg in April’s Supreme Court election, against 27% who said they voted for Prosser.  In fact, those votes were evenly split.  So it’s way, way, way too soon to say that Walker’s behind in a potential recall election, especially with Wisconsin D’s still in search of a candidate.

(Another interesting result from that poll:  people in Wisconsin apparently really like electing their Supreme Court, and in fact would prefer that the prospective justice’s party affiliation be listed on the ballot!)

 

 


The Year in Mathematical Ideas

I have a short piece about Tim Gowers’ Polymath project in the 2009 edition of the New York Times Year in Ideas feature.

In January, Timothy Gowers, a professor of mathematics at Cambridge and a holder of the Fields Medal, math’s highest honor, decided to see if the comment section of his blog could prove a theorem he could not.

It’s been years since we’ve been New York Times subscribers; looking at Sunday’s paper I was struck by how much math was in it. In the Year in Ideas section, besides my piece, there’s one about using random walks to identify species critical to the survival of an ecosystem, another about the differential equations governing zombie diffusion, and a third about Nate Silver’s detective work on the fishy final digits of poll results.  (I blogged about DigitGate a few months back.)  Elsewhere in the paper, John Allen Paulos writes about the expected value of early breast cancer screening, and the Book Review takes on Perfect Rigor, Masha Gessen’s new biography of Perelman.  Personally, I think Gessen missed a huge commercial opportunity by not titling the book He’s Just Not That Into Yau.


Strategic Vision done in by the digits?

Nate Silver at 538 looks at the trailing digits of about 5000 poll results from secretive polling outfit Strategic Vision, finds a badly non-uniform distribution, and says this strongly suggests that SV is making up numbers.  I’m a fan of Nate’s stuff, both sabermetric and electoral, but I’m not so sure he’s right on this.

Nate’s argument is similar to that of Beber and Scacco’s article on the fraudulence of Iran’s election returns.  Humans are bad at picking “random” numbers; so the last digits of human-chosen (i.e. fake) numbers will look less uniform than truly random digits would.
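The kind of check at issue can be sketched as a chi-square test of the trailing digits against a uniform distribution. This is an illustration only, not Nate’s or Beber and Scacco’s actual procedure. The two lists of “poll results” are invented, and the function name is hypothetical.

```python
from collections import Counter

# 5% critical value of the chi-square distribution with 9 degrees of freedom.
CRITICAL_95_DF9 = 16.919

def trailing_digit_chi_square(numbers):
    """Chi-square statistic comparing last-digit counts to a uniform split."""
    counts = Counter(abs(n) % 10 for n in numbers)
    expected = len(numbers) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected
               for d in range(10))

# Invented data: one list with perfectly even digits, one heavy on 7s and 8s.
even_digits = list(range(40, 60)) * 5
heavy_on_7_and_8 = [47, 48, 47, 48, 57, 38] * 20 + list(range(40, 50)) * 3

for name, data in [("even", even_digits), ("heavy on 7s and 8s", heavy_on_7_and_8)]:
    stat = trailing_digit_chi_square(data)
    verdict = "looks non-uniform" if stat > CRITICAL_95_DF9 else "consistent with uniform"
    print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

Note what the test does and does not say: a large statistic tells you the digits are not uniform, and nothing about why.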

There are at least three ways Nate’s case is weaker than Beber and Scacco’s.

  1. In the Iranian numbers, there were too many numbers ending in 7 and too few ending in 0, consistent with the empirical finding that people trying to generate random numbers tend to disfavor “round” numbers like those ending in 0 and 5.  The digits from Strategic Vision have a lot of 7s, but even more 8s, and the 0s and 5s are approximately where they should be.
  2. It’s not so clear to me that the “right” distribution for these digits is uniform.  Lots of 7s and 8s, few 1s; maybe that’s because in close polls with a small proportion of undecideds, you’ll see a lot of 48-47 results and not so many 51-41s.  I don’t really know what the expected distribution of the digits is — but the fact that I don’t know is a big clothespin between my nose and any assertion of a fishy smell.
  3. And of course my prior for “major US polling firm invents data out of whole cloth” is way lower than my prior for the Iranian federal government doing the same thing.  Strategic Vision could run up exactly the same numbers that Beber and Scacco found, and you’d still be correct to trust them more than the Iran election bureau.  Unless your priors are very different from mine.

So I wouldn’t say, as Nate does, that the numbers compiled at 538  “suggest, perhaps strongly, the possibility of fraud.”

Update (27 Sep): More from Nate on the Strategic Vision digits.  Here he directly compares the digits from Strategic Vision to digits gathered by the same protocol from Quinnipiac.  To my eye, they certainly look different.  I think this strengthens his case.  If he ran the same procedure for five other national pollsters, and the other five all looked like Quinnipiac, I think we’d be in the position of saying “There is good evidence that there’s a methodological difference between SV and other pollsters which has an effect on the distribution of terminal digits.”  But it’s a long way from there to “The methodological difference is that SV makes stuff up.”

On the other hand, Nate remarks that the deviation of the Quinnipiac digits from uniformity is consistent with Benford’s Law.  This is a terrible thing to remark.  Benford’s Law applies to the leading digit, not the last one.  The fact that Nate would even bring it up in this context makes me feel a little shaky about the rest of his computations.
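For reference, Benford’s Law says the leading digit d shows up with probability log10(1 + 1/d). A short sketch of those probabilities (the function name is made up) shows how lopsided the leading-digit distribution is, which is a statement about first digits only:

```python
import math

def benford_leading_digit(d):
    """Benford's Law: probability that the leading digit equals d (1 through 9)."""
    return math.log10(1 + 1 / d)

leading = {d: benford_leading_digit(d) for d in range(1, 10)}
for d, p in leading.items():
    print(d, round(p, 3))
```

A leading 1 turns up about 30% of the time and a leading 9 less than 5% of the time.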

Also, there’s a good post about this on Pollster by Mark Blumenthal, whose priors about polling firms are far more reliable than mine.


Reader survey: which of your beliefs will your descendants vehemently reject?

The other day the New York Times ran a selection of 1968 poll data on the op/ed page.  In April of that year, 31% of Americans agreed that “Martin Luther King, Jr. brought his assassination on himself.”

This makes me wonder which beliefs, currently held by 30% or more of the U.S. population, will be universally considered absurd or even despicable by Americans of 2048.  So, readers — nominate such beliefs in the comments.  But to make it interesting, the belief has to be one which you presently hold.

Here’s mine:  “People should strive to keep the details of their personal lives from becoming publicly available.”

(For more antique polling nuggets, see my previous post on Public Opinion 1935-1946.)
