Does Bard know how many times “e” appears in “ketchup”?
One of the things I am most enjoying about machine learning is how it illustrates, quite neatly, that engineers don’t know how people work. Take the large language models, for instance. I have been told that they will take my job, rendering me unnecessary; that they are intelligent; that they will plan the perfect itinerary for my trip to Paris, with highlights about bars and restaurants that are definitely accurate and complete.
Inspired by a tweet about mayonnaise, I have set out to do a fun experiment with Google’s Bard.
I am choosing to do this for two reasons. First, this kind of quiz is something you do with small children as you teach them to read. You get them to identify letters and the sounds they make. But second, I strongly suspect this common activity isn’t captured in whatever data Bard is pulling from because it’s not the kind of thing you write down.
This is obviously absurd, but it’s absurd because we can look at the word “ketchup” and plainly see the “e.” Bard can’t do that. It lives in a wholly closed world of training data.
This kind of gets at the problem with LLMs. Language is a very old human technology, but our intelligence preceded it. Like all social animals, we have to keep track of status relationships, which is why our brains are so big and weird. Language is a very useful tool — hello, I write for a living! — but it is not the same as knowledge. It floats on top of a bunch of other things we take for granted.
I often think about Rodney Brooks’ 1987 paper, “Intelligence Without Representation,” which is more relevant than ever. I’m not going to deny that language use and intelligence are connected, but intelligence precedes language. If you work with language in the absence of intelligence, as we see with LLMs, you get weird results. What’s going on with LLMs resembles Brooks’ image of a group of early researchers trying to build an airplane by focusing on the seats and windows.
I’m pretty sure he’s still right about that.
I understand the temptation to jump into trying to have a complex conversation with an LLM. A lot of people very badly want us to be able to build an intelligent computer. These fantasies appear often in science fiction, a genre widely read by nerds, and suggest a longing to know we are not alone in the universe. It’s the same impulse that drives our attempts to contact alien intelligence.
But pretending that LLMs can think is a fantasy. You can inquire about a subconscious, if you want, but you will get glurge. There’s nothing there. I mean, look at its attempts at ASCII art!
When you do something like this — a task your average five-year-old excels at and that a sophisticated LLM flunks — you begin to see how intelligence really works. Sure, there are people out there who believe LLMs have a consciousness, but those people strike me as being tragically undersocialized, unable to understand or appreciate precisely how brilliant ordinary people are.
Yes, Bard can produce glurge. In fact, like most chatbots, it excels at doing autocomplete for marketing copy. This is probably a reflection of how much ad copy appears in its training data. Bard and its engineers likely don’t view it this way, but what a devastating commentary that is on our day-to-day lives online.
Advertising is one thing. But being able to produce ad copy is not a sign of intelligence. There are a lot of things we don’t bother to write down because we don’t have to, and other things we know but can’t write down, like how to ride a bike. We take a lot of shortcuts in talking to each other because people largely work with the same baseline of information about the world. There’s a reason for that: we’re all in the world. A chatbot isn’t.
I’m sure someone will appear to tell me that the chatbots will improve and I am just being mean. First of all: it’s vaporware till it ships, babe. But second, we truly don’t know how smart we are or how we think. If there is one real use for chatbots, it’s illuminating the things about our own intelligence that we take for granted. Or, as someone wiser than me put it: the map is not the territory. Language is the map; knowledge is the territory.
There is a wide swath of things chatbots don’t know and can’t know. The truth is that it doesn’t take much effort to make an LLM flunk a Turing test as long as you are asking the right questions.