“There’s a lot of snake oil out there, and mixed in with all the hype are genuine advancements,” Brin says. “Parsing our way through that stew is one of the challenges that we face.”
And as empathetic as LaMDA seemed, people who are amazed by large language models should consider the case of the cheeseburger stabbing, says Yejin Choi, a computer scientist at the University of Washington. A local news broadcast in the United States reported on a teenager in Toledo, Ohio, who stabbed his mother in the arm in a dispute over a cheeseburger. But the headline “Cheeseburger Stabbing” is vague. Knowing what occurred requires some common sense. Attempts to get OpenAI’s GPT-3 model to generate text from the prompt “Breaking news: Cheeseburger stabbing” produce words about a man getting stabbed with a cheeseburger in an altercation over ketchup, and a man being arrested after stabbing a cheeseburger.
Language models sometimes make mistakes because deciphering human language can require multiple forms of common-sense understanding. To document what large language models can do and where they fall short, last month more than 400 researchers from 130 institutions contributed to a collection of more than 200 tasks known as BIG-Bench, or Beyond the Imitation Game. BIG-Bench includes some traditional language-model tests, like reading comprehension, but also tests of logical reasoning and common sense.
Researchers at the Allen Institute for AI’s MOSAIC project, which documents the common-sense reasoning abilities of AI models, contributed a task called Social-IQa. They asked language models—not including LaMDA—to answer questions that require social intelligence, like “Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?” The team found that large language models performed 20 to 30 percent less accurately than people.
“A machine without social intelligence being sentient seems … off,” says Choi, who works with the MOSAIC project.
How to make robots empathetic is an ongoing area of AI research. Robotics and voice AI researchers have found that displays of empathy have the power to manipulate human activity. People are also known to place too much trust in AI systems, or to implicitly accept the decisions those systems make.
What’s unfolding at Google touches on a fundamentally bigger question: whether digital beings can have feelings. Biological beings are arguably programmed to feel some sentiments, but asserting that an AI model can gain consciousness is like saying a doll created to cry is actually sad.
Choi says she doesn’t know any AI researchers who believe in sentient forms of AI, but the events involving Blake Lemoine appear to underline how a warped perception of what AI is capable of doing can shape real-world events.
“Some people believe in tarot cards, and some might think their plants have feelings,” she says, “so I don’t know how broad a phenomenon this is.”
The more people imbue artificial intelligence with human traits, the more intently they will hunt for ghosts in the machine—if not yet, then someday in the future. And the more they will be distracted from the real-world issues that plague AI right now.