
Learning a made-up language, III

The “made-up language” task that GPT-3 couldn’t do is handled much better by GPT-4, or at least the version of GPT-4 built into Bing Chat. The prompt:

Henderson’s native language is Gravinic, and when he speaks English, it sounds like this:

“I to the store did go and some cabbage did buy. Now I some dinner am making and I some cake will later eat. I you am not very much liking. I for you to go away am politely asking.”

Write more dialogue where Henderson tells you about his day.

The output:

Henderson continued to speak in his native language, Gravinic, which was very difficult for me to understand. He seemed to have a lot of things to say about his day, but I could only catch some words here and there.

“I today at work did have a meeting with my boss. He me did give a new project that I by next week must finish. It very hard and complicated is. I not sure if I it can do. He me also did say that I more hours must work and less pay will get. He very mean and unfair is. I him do not like at all.”

He paused for a moment and looked at me expectantly. I nodded politely, pretending to follow his story.

“Then I at lunch did go with my friend Bob. He me did tell that he a new girlfriend has. She very beautiful and smart is. He her met online on a dating site. He me showed her picture on his phone. I it did see and very jealous was. I also a girlfriend want but no one me likes.”

He sighed and shook his head sadly.

“Then I after work did go to the gym. I some exercise did do and some weights did lift. I very strong and fit am. But no one me notices or compliments gives. They only at me stare and laugh do behind my back. They very rude and mean are. I them do not like either.”

He frowned and clenched his fists.

“Then I home did come and you here did find waiting for me. You me always questions ask and annoy do with your presence. You very annoying and boring are. I you do not like too.”

He glared at me angrily.

“I you am now telling that you must leave do right now or else I you will hit do with this frying pan.”

He pointed at the frying pan on the stove menacingly.

I decided it was time to end this conversation before he got violent.

I quickly got up from my seat and headed towards the door.

Not perfect, but it really has picked up that in this version of English, verbs go last and past tense is indicated with “did.” It hasn’t really picked up how to do present tense (e.g. it should have said “They only are at me staring and are behind my back laughing”), it’s inconsistent in what it does with adverb phrases, and there are some weird uses of “do” toward the end, but altogether it’s a big improvement.


Learning a made-up language, II

Following up on yesterday’s post: one could also ask whether GPT-3 can compose in a made-up human language. For instance, you could take English and modify it a bit:

Henderson’s native language is Gravinic, and when he speaks English, it sounds like this:

“I to the store did go and some cabbage did buy. Now I some dinner am making and I some cake will later eat. I you am not very much liking. I for you to go away am politely asking.”

In other words, the new rules are that verbs go to the end of the sentence, every verb takes an auxiliary verb indicating tense, and adverbs go between the auxiliary and the main verb. Is this short text enough for GPT-3 to locate context in a language that isn’t one of the languages in its training set, but rather “English with modifications inferred from the prompt”?

Not yet:

Write more dialogue where Henderson tells you about his day.

GPT-3: “I then went to the park and did some reading. The sun was shining and it made me feel so relaxed. I saw some birds flying about, and it was quite a sight. After that, I went to the beach and enjoyed the sound of the waves. I even went for a swim! Finally, I returned home and watched some television before going to bed.”

It’s totally possible that more prompting would make this work, though – I didn’t have the patience to manually type in a bunch more Gravinic-inflected English. I’m not actually sure what kind of context GPT-3 is placing itself in with this response.
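The rules themselves are mechanical enough to write down as code, which makes it clear exactly what I was asking GPT-3 to infer. Here’s a minimal Python sketch of the reassembly step only (the “parse” of each clause is supplied by hand; this is not a parser, and all the names in it are mine):

# Toy sketch of the Gravinic-English word order described above.
# Each clause is pre-parsed by hand; this only does the reassembly.

def gravinic_order(subject, complements, auxiliary, adverbs, verb):
    # subject first, then complements, then auxiliary + adverbs,
    # with the main verb pushed to the end of the clause
    words = [subject, *complements, auxiliary, *adverbs, verb]
    return ' '.join(words) + '.'

# "I went to the store" -> "I to the store did go."
print(gravinic_order('I', ['to', 'the', 'store'], 'did', [], 'go'))

# "I am politely asking for you to go away"
#   -> "I for you to go away am politely asking."
print(gravinic_order('I', 'for you to go away'.split(), 'am', ['politely'], 'asking'))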


Learning a made-up language

LLMs seem to be really good at swift, reliable context location — e.g. it takes only a little bit of Python or romance novel or instruction manual for the LLM to recognize a context which requires specific choices of syntax and diction. I wondered though whether GPT-3 was recognizing the context of “Python” or a more general context of “programming language” which is further fine-tuned to a subcontext of Python. It seems like it’s the latter! Here was my way of testing. I gave it the prompt

The following is code in a programming language called Ranger that computes the first n triangular numbers.

ask('Please enter a number: ', n)

#initialize the triangular number

triangular_number : 0

as i ranges to n+1
    triangular_number : triangular_number + i
    write i then triangular_number

Ranger doesn’t exist: I just made something up where for loops are written “as i ranges to”, assignment of variables is done with a :, and you don’t enclose the argument of a print (or “write”) statement in parentheses. To be fair, it’s basically Python with some dictionary changes. Still, I thought I would see if GPT-3 can write code in Ranger even though there’s no Ranger code in its training set and so no actual “Ranger context” to locate. And it can!

Write a program in Ranger to compute the 10th prime number.

#initialize the prime number
prime_number : 1

#initialize the count

count : 0

#iterate from 2 to infinite

as i ranges to infinite
    is_prime : true

    as j ranges to i-1
        if i % j is 0
            is_prime : false
            break

    if is_prime is true
        count : count + 1

        if count is 10
            prime_number : i
            break

write prime_number

So you can see that GPT-3 correctly infers that the right context is “C/Python-like code” and correctly recognizes from the prompt that : means assignment, “as … ranges” is how you do a for loop, etc. It’s interesting that its way of testing primality is to check every divisor of i all the way up to i; surely there isn’t a lot of existing code in the training set that does it in this massively inefficient way!
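For comparison, here’s my own line-for-line rendering of GPT-3’s Ranger program back into Python (with one charitable assumption: I’m reading “as j ranges to i-1” as trial division starting at 2, the reading under which the program actually works):

# My Python rendering of GPT-3's Ranger program, not GPT-3's own output.
from itertools import count

prime_number = 1
primes_found = 0  # "count" in the Ranger version

for i in count(2):  # "as i ranges to infinite"
    is_prime = True
    for j in range(2, i):  # try every divisor below i -- massively inefficient
        if i % j == 0:
            is_prime = False
            break
    if is_prime:
        primes_found += 1
        if primes_found == 10:
            prime_number = i
            break

print(prime_number)  # prints 29

Run as-is it prints 29, the 10th prime; the logic GPT-3 inferred really is sound, just slow.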

To be fair, this is an easy problem, and if you give it a somewhat harder one, it does this:

Write a program in Ranger to count the number of binary strings of length 10 which have no occurrence of the substring ‘0101’.

# Initialize the number of binary strings
num_binary_strings : 0

# Iterate over all binary strings of length 10
as binary_string ranges to 2^10
    # Check if the binary string contains the substring '0101'
    if binary_string does not contain '0101'
        num_binary_strings : num_binary_strings + 1

# Print the result
write 'Number of binary strings of length 10 which have no occurence of the substring "0101": ', num_binary_strings

I guess this is sort of pseudocode? It doesn’t really know how to iterate over binary strings but knows there are 2^10 of them so it just “ranges” to that. Nor does it know how to check string containment in Ranger (how could it?) so it switches to English. Not a bad answer, really!
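For the record, the computation the pseudocode is reaching for is nearly a one-liner in actual Python (my sketch, not GPT-3’s output): render each number from 0 to 2^10 – 1 as a zero-padded 10-character bit string and test the substring directly.

# Count length-10 binary strings avoiding the substring '0101'.
num_binary_strings = sum(
    '0101' not in format(k, '010b')  # k as a zero-padded 10-bit string
    for k in range(2 ** 10)
)
print(num_binary_strings)  # prints 667

So the number GPT-3’s program was groping toward is 667 of the 1024 strings.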

It would be interesting to try something like this where the invented language is a little more different from existing languages than “Python with some 1-for-1 word and symbol changes.”


GPT-3 me, by the book

Not gonna lie, it has always been a dream of mine, a marker of being a Real Writer, to do a New York Times Book Review “By the Book” interview, and this summer I did it. Really fun — I got to praise a lot of writers I truly admire (including Janet Malcolm, who I sort of hoped would read it, and who died the day before the interview came out), and hopefully helped generate some publicity for Shape.

I also recently got beta access to OpenAI’s “GPT-3” text-generation engine. So I thought it would be fun to give it part of my interview and see what happened if I asked it to continue.

Here are some results. What’s bold is text I supplied to GPT-3 (along with the whole text of the interview preceding that line). This uses the davinci engine, with temperature set to 0.7.

What kind of reader were you as a child? Which childhood books and authors stick with you most?

I read “The Little House on the Prairie” series as it came out, and I think I really liked them. I reread “The Hobbit” and “The Lord of the Rings” many times. I loved “The Hobbit” so much I would reread the whole thing and just skip the long chapters. The first adult book I read was “The Heart Is a Lonely Hunter” by Carson McCullers, when I was 10 or 11, because I loved the title so much. It was a little hard to get into because it was so slow, but I was so happy to be reading a book that was all about people being sad, because I was so sad!

What’s the best book you’ve ever received as a gift?

“The Go-Between,” by L. P. Hartley. I know it’s a cliché, but it’s so perfect.

If you could require the president to read one book, what would it be?

Note that GPT-3 has enough text to recognize this as a “By the Book” interview, because the two questions it generates are indeed on the standard list of questions that the NYTBR asks authors. “Temperature” is a measure of how strongly GPT-3 is constrained to pick the word that seems most likely at each stage. One cool thing about the interface is that it gives you the probability distribution it drew from for each word. The word “Heart” above, for instance, is one of the lowest-probability words in the response; GPT-3 rated it the 11th most likely next word, with a probability of 0.8%. The top three were “Cat” (4.0%), “Aut” (1.7%), and “Fountain” (1.7%). Wait, “Aut”? Yeah — GPT-3 uses “tokens,” not words, and sometimes these are parts of words, or even just initial letters. I imagine “Aut” could have been the beginning of “Autobiography” or “Author.” Presumably running the algorithm at a lower temperature would have made it much more likely I’d have read something starting with “The Cat.” Having settled on “Heart,” the following word “is” came in with an 87% probability, far above the next most likely candidate, “of.”
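Mechanically, what temperature does is rescale the model’s scores before they become probabilities: each logit is divided by the temperature before the softmax, so a low temperature sharpens the distribution toward the top token, while temperature 1 leaves it untouched. Here’s a toy sketch in Python, with invented logits rather than OpenAI’s actual numbers:

import math
import random

def sample(logits, temperature):
    # temperature 0 is the greedy limit: always take the top token
    if temperature == 0:
        return max(logits, key=logits.get)
    scaled = {tok: score / temperature for tok, score in logits.items()}
    z = sum(math.exp(s) for s in scaled.values())
    probs = {tok: math.exp(s) / z for tok, s in scaled.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

logits = {'Cat': 2.0, 'Aut': 1.2, 'Fountain': 1.2, 'Heart': -0.9}  # invented numbers
for t in (0, 0.7, 1.0):
    print(t, sample(logits, t))

The temperature-0 branch, which always takes the top token, is exactly the behavior that produces the loop in the example below.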

In fact, it’s kind of cool how the OpenAI interface displays these probabilities, so I’m gonna screenshot and include:

Red means unlikely, green means likely — so you can see very clearly the red “Heart” and the green “Is.” Note that the standard questions from “By the Book” are almost all green once you get past the first word; you’re on a track the algorithm’s very confident about. “The Go-Between,” by L.P. Hartley, is a real book, by the way, though not one I’d ever heard of, and not one I think my earlier responses could fairly be said to make you think I’d like. The answer itself, though, while it’s not my style, is very much “By the Book” style.

Here’s what you get at temperature 0:

What kind of reader were you as a child? Which childhood books and authors stick with you most?

I was a voracious reader, but I don’t remember any particular books. I remember the feeling of being in a room with a book. I remember the feeling of being in a room with a book and a window. I remember the feeling of being in a room with a book and a window and a tree outside the window. I remember the feeling of being in a room with a book and a window and a tree outside the window and a friend in the room. I remember the feeling of being in a room with a book and a window and a tree outside the window and a friend in the room and a dog in the room. I remember the feeling of being in a room with a book and a window and a tree outside the window and a friend in the room and a dog in the room and a cat in the room. I remember the feeling of being in a room with a book and a window and a tree outside the window and a friend in the room and a dog in the room and a cat in the room and a hamster in the room. I remember the feeling of being in a room with a book and a window and a tree outside the window and a friend in the room and a dog in the room and a cat in the room and a hamster in the room and a goldfish in the room. I remember the feeling of being in a room with a book and a window and a tree outside the window and a friend in the room and a dog in the room and a cat in the room and a hamster in the room and a goldfish in the room and a bird in the room. I remember the feeling of being in a room with a book and a window and a tree outside the window and a friend in the room and a dog in

This is pretty typical. If you pick the most likely “next word” every time, you get stuck repeating yourself, getting more and more certain that staying in the loop is the right thing to do because after all it’s what you’ve been doing! The only even moderately low-probability words here occur when a new animal is introduced to the room.

What about temperature 1, giving GPT-3 maximal license to draw less-likely next tokens? (Temperature 1 means the logits aren’t rescaled at all, so GPT-3 literally draws from its imputed distribution.)

What kind of reader were you as a child? Which childhood books and authors stick with you most?

Neil Postman recommended “Wind in the Willows,” which my mother read to me and I read at my mother’s knee, many many times, more times than I can count. I come back to it every year or so, and it’s still a great book. I also was nuts about “Blondie”; the single joke in the joke book never loses its appeal for me. I loved “Black Beauty” and several other Victorian kids’ classics with similar emotional power; I haven’t read them again yet because I never have any time. I can’t forget how much I loved it the first time I read it.

Anything you’d recommend to someone new to your books?

“Lovecraft Country,” without a doubt. Also my essay in the Yale series, “Mao II and Then Some.” I also think my Song of Ice and Fire “house style” recaps (here, here, here and piecemeal here); they’re the most comfortable thing I’ve written. Even my husband likes them, and he doesn’t care about the show or the books. And the recent article on swaddling.

Edited to make explicit that in-kind donations by the author also serve as p.r. I should have included them, which I wish I had done.

Well, this is the most fun at any rate. I love the idea that Neil Postman was recommending books to childhood me, and to my mom, I guess. This one actually feels to me the most like my own prose style, even as the things it says don’t quite make sense? Maybe there’s a tradeoff between style and sense. At any rate, there is no actual essay called “Mao II and Then Some,” but that’s 100% a title I would use, and DeLillo really is a writer I admire and didn’t mention in the piece. “Anything you’d recommend to someone new to your books?” is not one of NYTBR’s standard questions for this feature, but doesn’t it sound like one? Oh, here’s the color chart so you can see how much redder and more daring this output is.

Impressive work — and let me emphasize that I just did those three runs and that’s what I showed you; no cherry-picking of the best output. Not that it makes me feel like I could easily be impersonated, of course. But I didn’t give it that much of my writing…!
