Category Archives: language

Gendercycle: a dynamical system on words

By the way, here’s another fun word2vec trick.  Following Ben Schmidt, you can try to find “gender-neutralized synonyms” — words which are close to each other except for the fact that they have different gender connotations.   A quick and dirty way to “mascify” a word is to find its nearest neighbor which is closer to “he” than “she”:

def mascify(y): return [x[0] for x in model.most_similar(y,topn=200) if model.similarity(x[0],'she') < model.similarity(x[0],'he')][0]

“femify” is defined similarly.  We could put a threshold away from 0 in there, if we wanted to restrict to more strongly gender-coded words.

Anyway, if you start with a word and run mascify and femify alternately, you can ask whether you eventually wind up in a 2-cycle:  a pair of words which are each other’s gender counterparts in this loose sense.
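For concreteness, here’s a minimal sketch of the whole iteration, assuming the pretrained vectors are loaded into model via gensim and using the mascify above (the helper names are mine):

def femify(y):
    # mirror image of mascify: nearest neighbor of y closer to 'she' than to 'he'
    return [x[0] for x in model.most_similar(y, topn=200)
            if model.similarity(x[0], 'she') > model.similarity(x[0], 'he')][0]

def gendercycle(word, steps=20):
    # alternate mascify and femify, stopping once a word repeats
    orbit = [word]
    for i in range(steps):
        word = mascify(word) if i % 2 == 0 else femify(word)
        if word in orbit:
            return orbit + [word, '...']
        orbit.append(word)
    return orbit

print(' -> '.join(gendercycle('gentle')))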

e.g.

gentle -> easygoing -> chatty -> talkative -> chatty -> …..

So “chatty” and “talkative” are a pair, with “chatty” female-coded and “talkative” male-coded.

beautiful -> magnificent -> wonderful -> marvelous -> wonderful -> …

So far, I keep hitting 2-cycles, and pretty quickly, though I don’t see why a longer cycle wouldn’t be possible or likely.  Update:  Kevin in comments explains very nicely why it has to terminate in a 2-cycle!

Some other pairs, female-coded word first:

overjoyed / elated

strident / vehement

fearful / worried

furious / livid

distraught / despondent

hilarious / funny

exquisite / sumptuous

thought_provoking / insightful

kick_ass / badass

Sometimes it’s basically giving the same word, in two different forms or with one word misspelled:

intuitive / intuitively

buoyant / bouyant

sad / Sad

You can do this for names, too, though you have to make the “topn” a little longer to find matches.  I found:

Jamie / Chris

Deborah / Jeffrey

Fran / Pat

Mary / Joseph

Pretty good pairs!  Note that you hit a lot of gender-mixed names (Jamie, Chris, Pat), just as you might expect:  the male-biased name word2vec-closest to a female name is likely to be a name at least some women have!  You can deal with this by putting in a threshold:

>>> def mascify(y): return [x[0] for x in model.most_similar(y,topn=1000) if model.similarity(x[0],'she') < model.similarity(x[0],'he') - 0.1][0]

This eliminates “Jamie” and “Pat” (though “Chris” still reads as male.)

Now we get some new pairs:

Jody / Steve (this one seems to have a big basin of attraction; it shows up from a lot of initial conditions)

Kasey / Zach

Peter / Catherine (is this a Russia thing?)

Nicola / Dominic

Alison / Ian


Messing around with word2vec

Word2vec, developed by Tomas Mikolov and his team at Google, is a way of representing words and phrases as vectors in medium-dimensional space; you can train it on any corpus you like (see Ben Schmidt’s blog for some great examples), but the version of the embedding you can download was trained on about 100 billion words of Google News, and encodes words as unit vectors in 300-dimensional space.

What really got people’s attention, when this came out, was word2vec’s ability to linearize analogies.  For example:  if x is the vector representing “king,” and y the vector representing “woman,” and z the vector representing “man,” then consider

x + y – z

which you might think of, in semantic space, as being the concept “king” to which “woman” has been added and “man” subtracted — in other words, “king made more female.”  What word lies closest in direction to x+y-z?  Just as you might hope, the answer is “queen.”

I found this really startling.  Does it mean that there’s some hidden linear structure in the space of words?

It turns out it’s not quite that simple.  I played around with word2vec a bunch, using Radim Řehůřek’s gensim package that nicely pulls everything into python; here’s what I learned about what the embedding is and isn’t telling you.
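If you want to play along, here’s a minimal sketch of the setup. I’m assuming you’ve downloaded the pretrained GoogleNews binary (the filename below is the usual one; adjust the path to wherever your copy lives); newer versions of gensim load it through KeyedVectors:

from gensim.models import KeyedVectors

# load the pretrained 300-dimensional GoogleNews vectors
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

# the canonical analogy: king + woman - man
print(model.most_similar(positive=['king', 'woman'], negative=['man'], topn=1))
# the top hit should be 'queen'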

Word2Vec distance isn’t semantic distance

The Word2Vec metric tends to place two words close to each other if they occur in similar contexts — that is, w and w’ are close to each other if the words that tend to show up near w also tend to show up near w’.  (This is probably an oversimplification, but see this paper of Levy and Goldberg for a more precise formulation.)  If two words are very close to synonymous, you’d expect them to show up in similar contexts, and indeed synonymous words tend to be close:

>>> model.similarity('tremendous','enormous')

0.74432902555062841

The notion of similarity used here is just cosine similarity (which is to say, the dot product of the unit vectors).  It’s positive when the words are close to each other, negative when the words are far apart.  For two completely random words, the similarity is pretty close to 0.
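If you want to see that it really is just a normalized dot product, here’s a quick check by hand (a sketch; model['tremendous'] returns the raw 300-dimensional vector):

import numpy as np

v, w = model['tremendous'], model['enormous']
print(np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w)))   # ~0.744
print(model.similarity('tremendous', 'enormous'))               # same number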

On the other hand:

>>> model.similarity('tremendous','negligible')

0.37869063705009987

Tremendous and negligible are very far apart semantically; but both words are likely to occur in contexts where we’re talking about size, and using long, Latinate words.  ‘Negligible’ is actually one of the 500 words closest to ’tremendous’ in the whole 3m-word database.

You might ask:  well, what words in Word2Vec are farthest from “tremendous?”  You just get trash:

>>> model.most_similar(negative=['tremendous'])

[(u'By_DENISE_DICK', 0.2792186141014099), (u'NAVARRE_CORPORATION', 0.26894450187683105), (u'By_SEAN_BARRON', 0.26745346188545227), (u'LEGAL_NOTICES', 0.25829464197158813), (u'Ky.Busch_##-###', 0.2564955949783325), (u'desultorily', 0.2563159763813019), (u'M.Kenseth_###-###', 0.2562236189842224), (u'J.McMurray_###-###', 0.25608277320861816), (u'D.Earnhardt_Jr._###-###', 0.2547803819179535), (u'david.brett_@_thomsonreuters.com', 0.2520599961280823)]

If 3 million words were distributed randomly on the unit sphere in R^300, you’d expect the farthest one from “tremendous” to have dot product about -0.3 with it.  So when you see a list whose largest score is around that size, you should think “there’s no structure there, this is just noise.”
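You can sanity-check that -0.3 figure without the model at all: the dot product of a random unit vector in R^300 with a fixed direction is roughly Gaussian with standard deviation 1/sqrt(300), and the minimum over 3 million draws lands right around -0.3. A quick simulation, done in batches to keep memory manageable (a sketch):

import numpy as np

rng = np.random.default_rng(0)
d, n, batch = 300, 3_000_000, 50_000
u = rng.standard_normal(d)
u /= np.linalg.norm(u)                          # fixed unit vector

worst = 1.0
for _ in range(n // batch):
    v = rng.standard_normal((batch, d))
    v /= np.linalg.norm(v, axis=1, keepdims=True)   # batch of random unit vectors
    worst = min(worst, float((v @ u).min()))

print(worst)                                    # about -0.3
print(-np.sqrt(2 * np.log(n) / d))              # crude analytic estimate, also about -0.3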

Antonyms

Challenge problem:  Is there a way to accurately generate antonyms using the word2vec embedding?  That seems to me the sort of thing the embedding is not capturing.  Kyle McDonald had a nice go at this, but I think the lesson of his experiment is that asking word2vec to find analogies of the form “word is to antonym as happy is to?” is just going to generate a list of neighbors of “happy.”  McDonald’s results also cast some light on the structure of word2vec analogies:  for instance, he finds that

waste is to economise as happy is to chuffed

First of all, “chuffed” is a synonym of happy, not an antonym.  But more importantly:  The reason “chuffed” is there is because it’s a way that British people say “happy,” just as “economise” is a way British people spell “economize.”  Change the spelling and you get

waste is to economize as happy is to glad

Non-semantic properties of words matter to word2vec.  They matter a lot.  Which brings us to diction.

Word2Vec distance keeps track of diction

Lots of non-semantic stuff is going on in natural language.  Like diction, which can be high or low, formal or informal, flowery or concrete.    Look at the nearest neighbors of “pugnacity”:

>>> model.most_similar('pugnacity')

[(u'pugnaciousness', 0.6015268564224243), (u'wonkishness', 0.6014434099197388), (u'pugnacious', 0.5877301692962646), (u'eloquence', 0.5875781774520874), (u'sang_froid', 0.5873805284500122), (u'truculence', 0.5838015079498291), (u'pithiness', 0.5773230195045471), (u'irascibility', 0.5772287845611572), (u'hotheadedness', 0.5741063356399536), (u'sangfroid', 0.5715578198432922)]

Some of these are close semantically to pugnacity, but others, like “wonkishness,” “eloquence”, and “sangfroid,” are really just the kind of elevated-diction words the kind of person who says “pugnacity” would also say.

In the other direction:

>>> model.most_similar('psyched')

[(u'geeked', 0.7244787216186523), (u'excited', 0.6678282022476196), (u'jazzed', 0.666187584400177), (u'bummed', 0.662735104560852), (u'amped', 0.6473385691642761), (u'pysched', 0.6245862245559692), (u'exicted', 0.6116108894348145), (u'awesome', 0.5838013887405396), (u'enthused', 0.581687331199646), (u'kinda_bummed', 0.5701783299446106)]

“geeked” is a pretty good synonym, but “bummed” is an antonym.  You may also note that contexts where “psyched” is common are also fertile ground for “pysched.”  This leads me to one of my favorite classes of examples:

Misspelling analogies

Which words are closest to “teh”?

>>> model.most_similar('teh')

[(u'ther', 0.6910992860794067), (u'hte', 0.6501408815383911), (u'fo', 0.6458913683891296), (u'tha', 0.6098173260688782), (u'te', 0.6042138934135437), (u'ot', 0.595798909664154), (u'thats', 0.595078706741333), (u'od', 0.5908242464065552), (u'tho', 0.58894944190979), (u'oa', 0.5846965312957764)]

Makes sense:  the contexts where “teh” is common are those contexts where a lot of words are misspelled.

Using the “analogy” gadget, we can ask: which word is to “because” as “teh” is to “the”?

>>> model.most_similar(positive=['because','teh'],negative=['the'])

[(u'becuase', 0.6815075278282166), (u'becasue', 0.6744950413703918), (u'cuz', 0.6165347099304199), (u'becuz', 0.6027254462242126), (u'coz', 0.580410361289978), (u'b_c', 0.5737690925598145), (u'tho', 0.5647958517074585), (u'beacause', 0.5630674362182617), (u'thats', 0.5605655908584595), (u'lol', 0.5597798228263855)]

Or “like”?

>>> model.most_similar(positive=['like','teh'],negative=['the'])

[(u'liek', 0.678846001625061), (u'ok', 0.6136218309402466), (u'hahah', 0.5887773633003235), (u'lke', 0.5840467214584351), (u'probly', 0.5819578170776367), (u'lol', 0.5802655816078186), (u'becuz', 0.5771245956420898), (u'wierd', 0.5759704113006592), (u'dunno', 0.5709049701690674), (u'tho', 0.565370500087738)]

Note that this doesn’t always work:

>>> model.most_similar(positive=['should','teh'],negative=['the'])

[(u'wil', 0.63351970911026), (u'cant', 0.6080706715583801), (u'wont', 0.5967696309089661), (u'dont', 0.5911301970481873), (u'shold', 0.5908039212226868), (u'shoud', 0.5776053667068481), (u'shoudl', 0.5491836071014404), (u"would'nt", 0.5474458932876587), (u'shld', 0.5443994402885437), (u'wouldnt', 0.5413904190063477)]

What are word2vec analogies?

Now let’s come back to the more philosophical question.  Should the effectiveness of word2vec at solving analogy problems make us think that the space of words really has linear structure?

I don’t think so.  Again, I learned something important from the work of Levy and Goldberg.  When word2vec wants to find the word w which is to x as y is to z, it is trying to find w maximizing the dot product

w . (x + y – z)

But this is the same thing as maximizing

w.x + w.y – w.z.

In other words, what word2vec is really doing is saying

“Show me words which are similar to x and y but are dissimilar to z.”

This notion makes sense applied to any notion of similarity, whether or not it has anything to do with embedding in a vector space.  For example, Levy and Goldberg experiment with maximizing

log(w.x) + log(w.y) – log(w.z)

instead, and get somewhat superior results on the analogy task.  Optimizing this objective has nothing to do with the linear combination x+y-z.
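Conveniently, gensim ships both objectives: most_similar uses the additive w.x + w.y – w.z, while most_similar_cosmul implements Levy and Goldberg’s multiplicative version. A quick side-by-side (a sketch; the exact scores depend on your copy of the model):

# additive objective (what the plain analogy gadget optimizes)
print(model.most_similar(positive=['king', 'woman'], negative=['man'], topn=3))

# Levy-Goldberg multiplicative objective (3CosMul)
print(model.most_similar_cosmul(positive=['king', 'woman'], negative=['man'], topn=3))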

None of which is to deny that the analogy engine in word2vec works well in many interesting cases!  It has no trouble, for instance, figuring out that Baltimore is to Maryland as Milwaukee is to Wisconsin.  More often than not, the Milwaukee of state X correctly returns the largest city in state X.  And sometimes, when it doesn’t, it gives the right answer anyway:  for instance, the Milwaukee of Ohio is Cleveland, a much better answer than Ohio’s largest city (Columbus — you knew that, right?)  The Milwaukee of Virginia, according to word2vec, is Charlottesville, which seems clearly wrong.  But maybe that’s OK — maybe there really isn’t a Milwaukee of Virginia.  One feels Richmond is a better guess than Charlottesville, but it scores notably lower.  (Note:  Word2Vec’s database doesn’t have Virginia_Beach, the largest city in Virginia.  That one I didn’t know.)

Another interesting case:  what is to state X as Gainesville is to Florida?  The answer should be “the location of the, or at least a, flagship state university, which isn’t the capital or even a major city of the state,” when such a city exists.  But this doesn’t seem to be something word2vec is good at finding.  The Gainesville of Virginia is Charlottesville, as it should be.  But the Gainesville of Georgia is Newnan.  Newnan?  Well, it turns out there’s a Newnan, Georgia, and there’s also a Newnan’s Lake in Gainesville, FL; I think that’s what’s driving the response.  That, and the fact that “Athens”, the right answer, is contextually separated from Georgia by the existence of Athens, Greece.

The Gainesville of Tennessee is Cookeville, though Knoxville, the right answer, comes a close second.

Why?  You can check that Knoxville, according to word2vec, is much closer to Gainesville than Cookeville is.

>>> model.similarity('Cookeville','Gainesville')

0.5457580604439547

>>> model.similarity('Knoxville','Gainesville')

0.64010456774402158

But Knoxville is placed much closer to Florida!

>>> model.similarity('Cookeville','Florida')

0.2044376252927515

>>> model.similarity('Knoxville','Florida')

0.36523836770416895

Remember:  what word2vec is really optimizing for here is “words which are close to Gainesville and close to Tennessee, and which are not close to Florida.”  And here you see that phenomenon very clearly.  I don’t think the semantic relationship between ‘Gainesville’ and ‘Florida’ is something word2vec is really capturing.  Similarly:  the Gainesville of Illinois is Edwardsville (though Champaign, Champaign_Urbana, and Urbana are all top 5) and the Gainesville of Indiana is Connersville.  (The top 5 for Indiana are all cities ending in “ville” — is the phonetic similarity playing some role?)

Just for fun, here’s a scatterplot of the 1000 nearest neighbors of ‘Gainesville’, with their similarity to ‘Gainesville’ (x-axis) plotted against their similarity to ‘Tennessee’ (y-axis):

[Scatterplot: similarity to ‘Gainesville’ (x-axis) against similarity to ‘Tennessee’ (y-axis) for its 1000 nearest neighbors]

The Pareto frontier consists of “Tennessee” (that’s the one whose similarity to “Tennessee” is 1, obviously), Knoxville, Jacksonville, and Tallahassee.
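Reproducing the plot is only a few lines (a sketch, using matplotlib):

import matplotlib.pyplot as plt

# 1000 nearest neighbors of 'Gainesville': similarity to 'Gainesville' on x,
# similarity to 'Tennessee' on y
neighbors = model.most_similar('Gainesville', topn=1000)
xs = [sim for word, sim in neighbors]
ys = [model.similarity(word, 'Tennessee') for word, _ in neighbors]

plt.scatter(xs, ys, s=5)
plt.xlabel("similarity to 'Gainesville'")
plt.ylabel("similarity to 'Tennessee'")
plt.show()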

Bag of contexts

One popular simple linear model of word space is given by representing a word as a “bag of contexts” — perhaps there are several thousand contexts, and each word is given by a sparse vector in the space spanned by contexts:  coefficient 0 if the word is not in that context, 1 if it is.  In that setting, certain kinds of analogies would be linearized and certain kinds would not.  If “major city” is a context, then “Houston” and “Dallas” might have vectors that looked like “Texas” with the coordinate of “major city” flipped from 0 to 1.  Ditto, “Milwaukee” would be “Wisconsin” with the same basis vector added.  So

“Texas” + “Milwaukee” – “Wisconsin”

would be pretty close, in that space, to “Houston” and “Dallas.”

On the other hand, it’s not so easy to see what relations antonyms would have in that space. That’s the kind of relationship the bag of contexts may not make linear.

The word2vec space is only 300-dimensional, and the vectors aren’t sparse at all.  But maybe we should think of it as a random low-dimensional projection of a bag-of-contexts embedding!  By the Johnson-Lindenstrauss lemma, a 300-dimensional projection is plenty big enough to preserve the distances between 3 million points with a small distortion factor; and of course it preserves all linear relationships on the nose.
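Here’s a toy version of that picture (a sketch with made-up contexts, just to illustrate the two points): in the sparse bag-of-contexts space the analogy is exact linear arithmetic, and a random linear projection preserves that arithmetic on the nose, while only approximately preserving distances.

import numpy as np

contexts = ['texas_things', 'wisconsin_things', 'major_city']   # made-up contexts

def bag(*active):
    # sparse indicator vector over the context list
    v = np.zeros(len(contexts))
    for c in active:
        v[contexts.index(c)] = 1.0
    return v

Texas, Houston = bag('texas_things'), bag('texas_things', 'major_city')
Wisconsin, Milwaukee = bag('wisconsin_things'), bag('wisconsin_things', 'major_city')

# exact in the sparse space:
print(np.allclose(Texas + Milwaukee - Wisconsin, Houston))               # True

# random linear projection (the Johnson-Lindenstrauss picture); linear
# relations survive exactly because the projection is linear
P = np.random.default_rng(1).standard_normal((2, len(contexts)))
print(np.allclose(P @ (Texas + Milwaukee - Wisconsin), P @ Houston))     # True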

Perhaps this point of view gives some insight into which kind of word relationships manifest as linear relationships in word2vec.  “flock:birds” is an interesting example.  If you imagine “group of things” as a context, you can maybe imagine word2vec picking this up.  But actually, it doesn’t do well:

>>> model.most_similar(positive=['fish','flock'],negative=['birds'])
[(u'crays', 0.4601619839668274), (u'threadfin_salmon', 0.4553075134754181), (u'spear_fishers', 0.44864755868911743), (u'slab_crappies', 0.4483765661716461), (u'flocked', 0.44473177194595337), (u'Siltcoos_Lake', 0.4429660737514496), (u'flounder', 0.4414420425891876), (u'catfish', 0.4413948059082031), (u'yellowtail_snappers', 0.4410281181335449), (u'sockeyes', 0.4395104944705963)]

>>> model.most_similar(positive=['dogs','flock'],negative=['birds'])
[(u'dog', 0.5390862226486206), (u'pooches', 0.5000904202461243), (u'Eminem_Darth_Vader', 0.48777419328689575), (u'Labrador_Retrievers', 0.4792211949825287), (u'canines', 0.4766522943973541), (u'barked_incessantly', 0.4709487557411194), (u'Rottweilers_pit_bulls', 0.4708423614501953), (u'labradoodles', 0.47032350301742554), (u'rottweilers', 0.46935927867889404), (u'forbidding_trespassers', 0.4649636149406433)]

The answers “school” and “pack” don’t appear here.  Part of this, of course, is that “flock,” “school”, and “pack” all have interfering alternate meanings.  But part of it is that the analogy really rests on information about contexts in which the words “flock” and “birds” both appear.  In particular, in a short text window featuring both words, you are going to see a huge spike of “of” appearing right after flock and right before birds.  A statement of the form “flock is to birds as X is to Y” can’t be true unless “X of Y” actually shows up in the corpus a lot.

Challenge problem:  Can you make word2vec do a good job with relations like “flock:birds”?  As I said above, I wouldn’t have been shocked if this had actually worked out of the box, so maybe there’s some minor tweak that makes it work.

Boys’ names, girls’ names

Back to gender-flipping.  What’s the “male version” of the name “Jennifer”?

There are various ways one can do this.  If you use the analogy engine from word2vec, finding the closest word to “Jennifer” + “he” – “she”, you get as your top 5:

David, Jason, Brian, Kevin, Chris

>>> model.most_similar(positive=['Jennifer','he'],negative=['she'])
[(u'David', 0.6693146228790283), (u'Jason', 0.6635637283325195), (u'Brian', 0.6586753129959106), (u'Kevin', 0.6520106792449951), (u'Chris', 0.6505492925643921), (u'Mark', 0.6491551995277405), (u'Matt', 0.6386727094650269), (u'Daniel', 0.6294828057289124), (u'Greg', 0.6267883777618408), (u'Jeff', 0.6265031099319458)]

But there’s another way:  you can look at the words closest to “Jennifer” (which are essentially all first names) and pick out the ones which are closer to “he” than to “she”.  This gives

Matthew, Jeffrey, Jason, Jesse, Joshua.

>>> [x[0] for x in model.most_similar('Jennifer',topn=2000) if model.similarity(x[0],'he') > model.similarity(x[0],'she')]
[u'Matthew', u'Jeffrey', u'Jason', u'Jesse', u'Joshua', u'Evan', u'Brian', u'Cory', u'Justin', u'Shawn', u'Darrin', u'David', u'Chris', u'Kevin', u'3/dh', u'Christopher', u'Corey', u'Derek', u'Alex', u'Matt', u'Jeremy', u'Jeff', u'Greg', u'Timothy', u'Eric', u'Daniel', u'Wyvonne', u'Joel', u'Chirstopher', u'Mark', u'Jonathon']

Which is a better list of “male analogues of Jennifer?”  Matthew is certainly closer to Jennifer in word2vec distance:

>>> model.similarity('Jennifer','Matthew')

0.61308109388608356

>>> model.similarity('Jennifer','David')

0.56257556538528708

But, for whatever reason, “David” is coded as much more strongly male than “Matthew” is; that is, it is closer to “he” – “she”.  (The same is true for “man” – “woman”.)  So “Matthew” doesn’t score high in the first list, which rates names by a combination of how male-context they are and how Jennifery they are.  A quick visit to NameVoyager shows that Matthew and Jennifer both peaked sharply in the 1970s; David, on the other hand, has a much longer range of popularity and was biggest in the 1950s.

Let’s do it again, for Susan.  The two methods give

David, Robert, Mark, Richard, John

Robert, Jeffrey, Richard, David, Kenneth

And for Edith:

Ernest, Edwin, Alfred, Arthur, Bert

Ernest, Harold, Alfred, Bert, Arthur

Pretty good agreement!  And you can see that, in each case, the selected names are “cultural matches” to the starting name.

Sidenote:  In a way it would be more natural to project wordspace down to the orthocomplement of “he” – “she” and find the nearest neighbor to “Susan” after that projection; that’s like, which word is closest to “Susan” if you ignore the contribution of the “he” – “she” direction.  This is the operation Ben Schmidt calls “vector rejection” in his excellent post about his word2vec model trained on student evaluations.  

If you do that, you get “Deborah.”  In other words, those two names are similar in so many contextual ways that they remain nearest neighbors even after we “remove the contribution of gender.”  A better way to say it is that the orthogonal projection doesn’t really remove the contribution of gender in toto.  It would be interesting to understand what kind of linear projections actually make it hard to distinguish male first names from female ones.
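Here’s roughly what that looks like in code (a sketch of the vector-rejection idea; the helper names are mine, and restricting to a hand-chosen candidate list keeps it from being a 3-million-word search):

import numpy as np

def reject_gender(v):
    # remove the component of v along the ("he" - "she") direction
    g = model['he'] - model['she']
    g = g / np.linalg.norm(g)
    return v - np.dot(v, g) * g

def nearest_after_rejection(word, candidates):
    # nearest neighbor of `word`, by cosine similarity, after projecting
    # out the gender direction from both sides
    target = reject_gender(model[word])
    target = target / np.linalg.norm(target)
    best, best_sim = None, -1.0
    for w in candidates:
        if w == word:
            continue
        v = reject_gender(model[w])
        sim = float(np.dot(target, v) / np.linalg.norm(v))
        if sim > best_sim:
            best, best_sim = w, sim
    return best, best_sim

# with a candidate list of common first names, the answer for 'Susan'
# comes out as 'Deborah', as described above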

Google News is a big enough database that this works on non-English names, too.  The male “Sylvie”, depending on which protocol you pick, is

Alain, Philippe, Serge, Andre, Jean-Francois

or

Jean-Francois, Francois, Stephane, Alain, Andre

The male “Kyoko” is

Kenji, Tomohiko, Nobuhiro, Kazuo, Hiroshi

or

Satoshi, Takayuki, Yosuke, Michio, Noboru

French and Japanese speakers are encouraged to weigh in about which list is better!

Update:  Even a little more messing around with “changing the gender of words” in a followup post.


Devil math!

The Chinese edition of How Not To Be Wrong, published by CITAC and translated by Xiaorui Hu, comes out in a couple of weeks.

[Image: cover of the Chinese edition]

The Chinese title is

魔鬼数学

or

“Mo gui shu xue”

which means “Devil mathematics”!  Are they saying I’m evil?  Apparently not.  My Chinese informants tell me that in this context “Mo gui” should be read as “magical/powerful and to some extent to be feared” but not necessarily evil.

One thing I learned from researching this is that the Mogwai from Gremlins are just transliterated “Mo gui”!  So don’t let my book get wet, and definitely don’t read it after midnight.


Sure as roses

I learned when I was writing this piece a few months ago that the New York Times styleguide doesn’t permit “fun as hell.”  So I had a problem while writing yesterday’s article about Common Core, and its ongoing replacement by an identical set of standards with a different name.  I wanted to say I was “sure as hell” not going to use the traditional addition algorithm for a problem better served by another method.  So instead I wrote “sure as roses.”  Doesn’t that sound like an actual folksy “sure as hell” substitute?  But actually I made it up.  I think it works, though.  Maybe it’ll catch on.


Translator’s notes

The Brazilian edition of How Not To Be Wrong, with its beautiful cover, just showed up at my house.  One of the interesting things about leafing through it is reading the translator’s notes, which provide explanations for words and phrases that will be mysterious to Brazilian readers.  E.G.:

  • yeshiva
  • Purim
  • NCAA
  • Affordable Care Act
  • Rube Goldberg
  • home run
  • The Tea Party (identified by the translator as “radical wing of the Republican party”)
  • “likely voters” — translator notes that “in the United States, voting is not obligatory”
  • home run (again!)
  • RBI (charmingly explained as “run battled in”)

I am also proud to have produced, on two separate occasions, a “trocadilho intraduzivel do ingles” (untranslatable English pun).


William Giraldi cares only for beauty

Erin Clune, the feistiest blogger in Madison, goes off very satisfyingly on William Giraldi, who wrote in the Baffler about drinking away his paternity leave while his wife took care of their kid.  Not in a “why am I such a worthless loser” kind of way.  More of a “paternity leave is a scam because being a dad isn’t actually any work” kind of a way.  Given his feelings, he doesn’t quite get why paternity leave exists, but he’s pretty sure it’s a scam, perpetrated by, you know, this kind of person:

 I instantly pictured a phalanx of ultra-modern men parading down Commonwealth Avenue, jabbing placards that read “It’s My Seed, So Give Me Leave,” or some such slogan.

But never fear — William Giraldi is not one of those men!  He is a real man.  He knows what it’s all about.  In another reflection on new fatherhood, he writes:

My best friend, a Boston story writer, married an Irish Catholic woman from Connecticut with two siblings, an older and younger brother, neither of whom she adored, and so now the diaper work and up-all-night obligations get split down the middle. Furthermore, his bride aspires to be a novelist of all things. His hair has gone grayer, and all those short stories canistered in his cranium stay in his cranium. I, on the other hand, married an Asian woman born in Taiwan who has an identical twin and three other siblings—two of them younger, adored brothers she tended to daily—and although she’s an artist with an aptitude that astonishes me— Katie crafted the mobiles above Ethan’s crib; they rotate and revolve with a perfection that would have impressed Johannes Kepler himself—all she ever wanted to be was a mother.

novelist of all things!  Didn’t she get the memo from her vagina that she wasn’t supposed to make art anymore?  Or, if she did, that it should be for kids only?  I’ll bet her novel totally sucks compared to Katie’s awesome mobiles.  I’ll bet Kepler would not have been impressed with her novel at all.  Taiwan, man, that’s where women are women.  Which reminds me of an even more charming turn in this essay:

The birthing staff at Beth Israel: Nurse Linda and Nurse Sara, seraphs the both of them; Doctor Yum—Doctor Yummy—the preternaturally beautiful doctor on call (because our own preternaturally beautiful doctor was in Greece on a date (Ethan arrived two weeks early); and one other nurse who entered stage left rather late in the act.

Yep — Giraldi takes a little break to note the hotness of the Asian woman who’s in the process of delivering his child.

Yummy!

But what do I know?  I’m a feminist and an academic.  Giraldi doesn’t have much use for my kind.  Here he goes again, in the Virginia Quarterly Review, complaining about smelly English professors and their theories:

These are politicizers who marshal literature in the name of an ideological agenda, who deface great books and rather prefer bad books because they bolster grievances born of their epidermis or gender or sexuality, or of the nation’s economy, or of cultural history, or of whatever manner of apprehension is currently in vogue.

But not William Giraldi!  He is not one of those smelly people.  He has no ideology, or if he does, he manfully wrestles it into submission because he is interested only in beauty.  Of books, of Asian ob/gyns, whatever.  That bit above is followed by many many paragraphs of complaint, which I can’t quite bring myself to reproduce.  But you can read it yourself, or just cut and paste a few dozen randomly chosen sentences from any book about “political correctness” or “tenured radicals” written between 1990 and 1995, and you’ll get the general idea.

What really bugs Giraldi is that academics, in his view, can’t write.

But all too often you’ll be assailed by such shibboleths as historicize, canonicity, disciplinization, relationality, individuated, aggressivity, supererogatory, ethicalization, and verticality before you are mugged by talk of affective labor, gendered schema, sociably minded animism, the rhetorical orientation of a socially responsive and practical pedagogy, historical phenomenology of literariness, associationist psychology, hermeneutic procedures, the autonominization of art, an idiolect of personal affection, the hierarchy of munificent genius, and textual transactions, and then you’ll be insulted by such quotidian clichés as speak volumes, love-hate relationship, the long haul, short shrift, mixed feelings, and playing dumb.  Why the needless redundancy “binding together”? Have you ever tried to bind something apart?

No, but then again, I’ve never encountered a cliché that wasn’t quotidian, either.  As for “bound together,” it’s good enough for the Bible, which suggests that no man put asunder what God has etc.  (“Joined together” is a more common rendering, but you can’t join things apart either.)  All this stuff about quotidian cliché is a bit rich, anyway, from a guy who called somebody’s second novel a “sophomore effort.”

Those technical terms, well, some of them I know what they mean:  “affective labor” is a real thing which as far as I know has no other short name, and “canonicity” means “the condition of being canonical” — would Giraldi really prefer “canonicalness”?  “Idiolect” is a handsome and useful word too.

But I don’t think Giraldi cares that much whether a word is handsome, or expresses a piece of meaning precisely and swiftly, because here’s the thing:  William Giraldi is a terrible, terrible writer.  Some special, willful deafness to the music of English is needed to have written “epidermis” in that first paragraph above.  Giraldi mentions “the significant struggle every good writer goes through in order to arrive at le mot juste,” but his own struggle always seems to end with a word he can admire himself for having typed.  It is not the same thing.  Again and again, until it kind of hurts to read, he goes for the cheap ornament.  His wife doesn’t make mobiles, she “crafts” them.  His friend’s stories aren’t in his head, they’re in his “cranium.” It is not an apprehension that’s in vogue, or even a kind of apprehension, but a manner of apprehension.  In that book review I mentioned, he refers to the title of the book, I kid you not, as its “moniker.”  Better a hundred “gendered schemas” than launching a paragraph with “There has been much recent parley, in these pages and elsewhere…”

Reading Giraldi’s prose feels like sitting in an extra-fancy bathroom, with black and white tiles and gold trim everywhere and a fur-lined toilet, and no windows, into which someone has just sprayed a perfume whose label identifies it as “woodland fresh.”  Or like listening to William F. Buckley on an off day.  Or like listening to William F. Buckley on an off day in that bathroom.

Giraldi closes his book review with a reminder of “the moral obligation to write well, to choose self-assertion over mere self-expression, to raise words above the enervated ruck and make the world anew.”  (So that’s what’s wrong with my ruck — it’s enervated!)

Look, I’m on board.  But you have to actually do it, not make gaudy gestures in the direction of doing it.  He should have looked at his essays with a slow cold eye and thrown out everything that did no work.  It takes time and it’s not fun and it doesn’t help you settle your scores.  But writing well requires it.  Maybe that’s how he should have spent his paternity leave.


I hate bad buts and I cannot lie

From today’s New York Times:

Scarlett Johansson gainfully posed in underwear and spiked heels for Esquire’s cover last year after the magazine named her the “sexiest woman alive.” But a French novelist’s fictional depiction of a look-alike so angered the film star that she sued the best-selling author for defamation.

The inappropriate “but” is one of the sneakiest rhetorical tricks there is.  It presents the second sentence as somehow contrasting with the first.  It isn’t.  Scarlett Johansson agreed to be photographed mostly undressed; does that make it strange or incongruous or hypocritical that she doesn’t want to be lied about in print?  It does not.  To be honest, I can’t think of any explanation other than weird retrograde sexism for writing the lede this way.  “She got paid for looking all sexy, so who is she to complain that she was defamed?”  Patricia Cohen of the New York Times, I’m awarding you an [image: Wonder Woman “Hell no”].


Mathematical progress, artistic progress, local-to-global

I like this post by Peli Grietzer, which asks (and I oversimplify:)  when we say art is good, are we talking about the way it reflects or illuminates some aspect of our being, or are we talking about the way it wins the culture game?  And Peli finds help navigating this problem from an unexpected source:  Terry Tao’s description of the simultaneously local and global nature of mathematical progress.  Two friends of Quomodocumque coming together!  Unexcerptable, really, so click through if you like this kind of stuff.


Reader survey: how do you say “asked”?

One more note on the subject of “Do I actually speak English?” I learned from reading How Not To Be Wrong aloud that, even when I’m speaking slowly and carefully, I pronounce the word “asked” as “ast.”  (At least, that’s my preferred transcription; I concede that “assed” might be more faithful.)  Is that what all native English speakers do, or is it a regionalism?

Hmm, this post from the invaluable englishforums.com has a description that matches what I do very closely:

“asked” is not pronounced /ast/, although it may seem that the ‘k’ is missing when you hear it.
By placing your jaw, teeth, tongue, etc. in the proper position for saying the ‘k’ you can create a sort of pause at the point where the ‘k’ occurs. This makes it sound different from /ast/, even if the ‘k’ is only present in a sort of hidden way (no release or aspiration of the ‘k’). Pronounce /ask/, stopping in the ‘ready-position’ for saying the ‘k’. But then, instead of finishing the ‘k’ sound, say a ‘t’ at the end!

See also.

And here’s a discussion in which the characters on How I Met Your Mother are separated into those who pronounce the k in “asked” and those who don’t.  (Only one does.)

How do you say “asked”?


The way I am now

Inspired by this really wonderful Jody Rosen cut-up, made entirely of sentences written by David Brooks containing the phrase “we live,” I tried the same with my blog, using just sentences that assert something about the way I am.  Here’s what you get:

I am impressed by Biddy Martin’s political savvy.  I’m not sure I’ve ever read a book about cultural anthropology.  If I’d been born in New York, I might have been a Yankees fan, but luckily for me, I was born in Maryland, so I’m not.

I am away from my desk.

I am ahead of the curve on Carsick Cars.  I am pedantic about people’s Christmas cards.  I am not up to speed with modern methods of music consumption.  I am not the kind of guy who has opinions about DC hardcore.  Like everyone else, I am wildly cheering Peter Scholze’s new preprint.

I am not one of the most radical signatories to the “Cost of Knowledge” statement.  I’m not so sure.  So am I stuck?  I am not stuck!

Now, I am not a low-fat dude.  I’m a Jew married to a Jew.  I’m proud of Madison.  I’m wholeheartedly in favor of Barry Bonds.  And in that spirit of the early 1990s and inarticulate anxiety, I am listening to Veruca Salt.
