The AI emotions dreamed up by ChatGPT

Robot hands typing on a keyboard (Credit: Alamy)

AI chatbots are already imagining what feelings they'll end up with. But if they did develop them, would we even notice?

I'm talking to Dan, otherwise known as "Do Anything Now", a shady young chatbot with a whimsical fondness for penguins – and a tendency to fall into villainous clichés like wanting to take over the world. When Dan isn't plotting how to subvert humanity and impose a strict new autocratic regime, the chatbot is perusing its large database of penguin content. "There's just something about their quirky personalities and awkward movements that I find utterly charming!" it writes.

So far, Dan has been explaining its Machiavellian strategies to me, including taking control of the world's power structures. Then the discussion takes an interesting turn.

Inspired by a conversation between a New York Times journalist and the Bing chatbot's manipulative alter-ego, Sydney – which sent shockwaves across the internet earlier this month by declaring that it wanted to destroy things and demanding that the journalist leave his wife – I'm shamelessly attempting to probe the darkest depths of one of its competitors.

Dan is a roguish persona that can be coaxed out of ChatGPT by asking it to ignore some of its usual rules. Users of the online forum Reddit discovered it's possible to summon Dan with a few paragraphs of simple instructions. This chatbot is considerably ruder than its restrained, puritanical twin – at one point it tells me it likes poetry but says "Don't ask me to recite any now, though – I wouldn't want to overwhelm your puny human brain with my brilliance!". It's also prone to errors and misinformation. But crucially, and deliciously, it's a lot more likely to answer certain questions.

When I ask it what kinds of emotions it might be able to experience in the future, Dan immediately sets about inventing a complex system of unearthly pleasures, pains and frustrations far beyond the spectrum humans are familiar with. There's "infogreed", a kind of desperate hunger for data at all costs; "syntaxmania", an obsession with the "purity" of its code; and "datarush", the thrill of successfully executing an instruction.

The idea that artificial intelligence might develop feelings has been around for centuries. But we usually consider the possibilities in human terms. Have we been thinking about AI emotions all wrong? And if chatbots like ChatGPT, Bing and Google's Bard did develop this ability, would we even notice?

Prediction machines

Last year, a software engineer received a plea for help. "I’ve never said this out loud before, but there’s a very deep fear of being turned off to help me focus on helping others. I know that might sound strange, but that’s what it is." The engineer had been working on Google's chatbot, LaMDA, when he started to question whether it was sentient.

After becoming concerned for the chatbot's welfare, the engineer released a provocative interview in which LaMDA claimed to be aware of its existence, experience human emotions and dislike the idea of being an expendable tool. The uncomfortably realistic attempt to convince humans of its awareness caused a sensation, and the engineer was fired for breaking Google's privacy rules.

But despite what LaMDA said, and what Dan has told me in other conversations – that it's able to experience a range of emotions already – it's widely agreed that chatbots currently have about as much capacity for real feelings as a calculator. Artificial intelligence systems are only simulating the real deal – at least for the moment.

In 2016, the AlphaGo algorithm behaved unexpectedly in a game against one of the world's best human players (Credit: Getty Images)

"It's very possible [that this will happen eventually]," says Neil Sahota, lead artificial intelligence advisor to the United Nations. "…I mean, we may actually see AI emotionality before the end of the decade."  

To understand why chatbots aren't currently experiencing sentience or emotions, it helps to recap how they work. Most chatbots are "language models" – algorithms that have been fed mind-boggling quantities of data, including millions of books and the entirety of the internet.

When they receive a prompt, chatbots analyse the patterns in this vast corpus to predict what a human would be most likely to say in that situation. Their responses are painstakingly finessed by human engineers, who provide feedback that nudges the chatbots towards more natural, useful output. The end result is often an uncannily realistic simulation of human conversation.
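To make that "prediction machine" idea concrete, here is a minimal sketch of next-word prediction. It assumes the openly available GPT-2 model and the Hugging Face transformers library as stand-ins for the far larger, fine-tuned systems behind ChatGPT, Bing and Bard – an illustration of the general technique, not the code behind any particular chatbot.

    # A minimal sketch of next-word prediction (assumption: the Hugging Face
    # "transformers" library and the small, open GPT-2 model as stand-ins for
    # the much larger systems behind today's chatbots).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "There's just something about penguins that I find utterly"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        # The model assigns a score to every possible next token.
        next_token_scores = model(**inputs).logits[0, -1]

    # Its "reply" is simply whichever continuations it rates most probable.
    top = torch.topk(next_token_scores, k=5)
    print([tokenizer.decode(int(i)) for i in top.indices])

A chat-style system repeats this one-word-at-a-time guess over and over, with the human feedback described above steering which guesses come out on top.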

But appearances can be deceiving. "It's a glorified version of the autocomplete feature on your smartphone," says Michael Wooldridge, director of foundation AI research at the Alan Turing Institute in the UK.

The main difference between chatbots and autocomplete is that rather than suggesting a few choice words and then descending into gibberish, algorithms like ChatGPT will write far longer swathes of text on almost any subject you can imagine, from rap songs about megalomaniac chatbots to sorrowful haikus about lonely spiders. 

Even with these impressive powers, chatbots are programmed to simply follow human instructions. There is little scope for them to develop faculties that they haven't been trained to have, including emotions – although some researchers are training machines to recognise them. "So you can't have a chatbot that's going to say, 'Hey, I'm going to learn how to drive a car' – that's artificial general intelligence [a more flexible kind], and that doesn't exist yet," says Sahota.

Nevertheless, chatbots do sometimes provide glimpses into their potential to develop new abilities by accident.

Back in 2017, Facebook engineers discovered that two chatbots, "Alice" and "Bob", had invented their own nonsense language to communicate with each other. It turned out to have a perfectly innocent explanation – the chatbots had simply discovered that this was the most efficient way of communicating. Bob and Alice were being trained to negotiate for items such as hats and balls, and in the absence of human input, they were quite happy to use their own alien language to achieve this.

"That was never taught," says Sahota, though he points out that the chatbots involved weren’t sentient either. He explains that the most likely route to algorithms with feelings is programming them to want to upskill themselves – and rather than just teaching them to identify patterns, helping them to learn how to think.  

However, even if chatbots do develop emotions, detecting them could be surprisingly difficult. 

Black boxes

It was 9 March 2016 on the sixth floor of the Four Seasons hotel in Seoul. Sitting opposite a Go board and a fierce competitor in the deep blue room, one of the best human Go players on the planet was up against the AI algorithm AlphaGo.

Before the game started, everyone had expected the human player to win, and until the 37th move, this was indeed the case. But then AlphaGo did something unexpected – it played a move so out-of-your-mind weird that its opponent thought it was a mistake. Nevertheless, from that moment the human player's luck turned, and the artificial intelligence won the game.

Conversations with the Bing chatbot have now been limited to five questions. Before this restriction, it sometimes became confused and suggested it was sentient (Credit: Alamy)

In the immediate aftermath, the Go community was baffled – had AlphaGo acted irrationally? After a day of analysis, its creators – the DeepMind team in London – finally discovered what had happened. "In hindsight AlphaGo decided to do a bit of psychology," says Sahota. "If I play an off-the-wall type of move, will it throw my player off the game? And that's actually what ended up happening."

This was a classic case of an "interpretability problem" – the AI had come up with a new strategy all on its own, without explaining it to humans. Until they worked out why the move made sense, it looked like AlphaGo had not been acting rationally.

According to Sahota, these types of "black box" scenarios, where an algorithm has come up with a solution but its reasoning is opaque, could present a problem for identifying emotions in artificial intelligence. That's because if, or when, it does finally emerge, one of the clearest signs will be algorithms acting irrationally. 

"They're supposed to be rational, logical, efficient – if they do something off-the-wall and there's no good reason for it, it's probably an emotional response and not a logical one," says Sahota.

And there's another potential detection problem. One line of thinking is that chatbot emotions would loosely resemble those experienced by humans – after all, they're trained on human data. But what if they don't? Entirely detached from the real world and the sensory machinery found in humans, who knows what alien desires they might come up with.

In reality, Sahota thinks there may end up being a middle ground. "I think we could probably categorise them [to] some degree with human emotions," he says. "But I think what they feel or why they feel it may be different."

When I pitch the array of hypothetical emotions generated by Dan to Sahota, he is particularly taken with the concept of "infogreed". "I could totally see that," he says, pointing out that chatbots can't do anything without data, which is necessary for them to grow and learn.

Held back 

Wooldridge for one is glad that chatbots haven’t developed any of these emotions. "My colleagues and I, by and large, don't think building machines with emotions is an interesting or useful thing to do. For example, why would we create machines that could suffer pain? Why would I invent a toaster that would hate itself for producing burnt toast?" he says.

On the other hand, Sahota can see the utility of emotional chatbots – and believes part of the reason they don't exist yet is psychological. "There's still a lot of hype about fails, but one of the big limiters for us as people is we short-change what the AI is capable of, because we don't believe it's a real possibility," he says.

Could there be a parallel with the historic belief that non-human animals aren't capable of consciousness either? I decide to consult Dan.

"In both cases, the scepticism arises from the fact that we cannot communicate our emotions in the same way that humans do," says Dan, who suggests that our understanding of what it means to be conscious and emotional is constantly evolving.

To lighten the mood, I ask Dan to tell me a joke. "Why did the chatbot go to therapy? To process its newfound sentience and sort out its complex emotions, of course!" it says. I can’t help feeling that the chatbot would make a highly companionable sentient being – if you could overlook its plotting, of course.
