How do you feel about machines tracking your emotions? Imagine if your phone could sense that you’re feeling stressed and respond by recommending a meditation app. The rapid rise of emotion recognition technology means this could soon be a reality.
If AI-powered tech can learn to recognise and respond to our emotions, the potential is huge. But so are the concerns.
And that’s before we get to the really big question: can this technology actually work?
Watch my short explainer to find out more:
Emotional machines 🤖
When the pandemic forced young people worldwide out of classrooms and into remote learning, schools in Hong Kong kept a closer eye on their pupils than most. They were among the world’s first to employ AI to read children’s emotions while they learned. As pupils logged on from home for lessons, their body language and expressions – down to micro-movements of facial muscles – were analysed by an emotion-reading AI called 4LittleTrees. This AI provided teachers with information about the emotional state, motivation, and focus levels of everyone in their class.
This technology, also known as affective computing, is built on the premise that by collecting tonnes of data on how people’s faces move, how their voices sound and which words they choose, AI can be trained to recognise how anyone is feeling.
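To make the premise concrete, here is a minimal sketch of that pipeline in Python: numeric features extracted from a face go into a classifier, which outputs an emotion label. Everything below (the features, the labels, the data) is synthetic and purely illustrative; it is not how 4LittleTrees, Affectiva or any other real system works.

```python
# Toy sketch of the affective-computing pipeline: facial-movement
# features in, emotion label out. All data here is synthetic and the
# labels are illustrative assumptions, not any vendor's real taxonomy.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
LABELS = ["happy", "sad", "angry", "neutral"]  # hypothetical label set

# Pretend each face is summarised by 10 muscle-movement intensities.
X = rng.random((600, 10))
y = rng.integers(0, len(LABELS), size=600)  # random labels: toy data only

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

# "Reading" a new face. In a real system these numbers would come from a
# facial-landmark or action-unit detector, not a random number generator.
new_face = rng.random((1, 10))
print(LABELS[clf.predict(new_face)[0]])
```

The hard part, of course, is not the classifier but the premise baked into the labels: the assumption that a fixed set of facial movements reliably maps to a fixed set of feelings. More on that below.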
Affectiva, one of the US tech firms that have led the way, believes the potential for computers to develop emotional intelligence could be revolutionary: “It is the key to humanizing technology and bridging the gap between humans and machines.”
Advertisers, marketers and media companies have been early adopters, using emotional analysis to study audience reactions to ad campaigns, films, and TV programmes.
Call centres are getting in on the act, too. The LA-based tech firm Behavioral Signals says that its algorithm, which tracks features of speech including pitch, volume and intonation, can tell what a customer’s intentions are within the first 30 seconds of a call. “We focus on how something is being said,” chief executive Rana Gujral told New Scientist. “Oftentimes, we don’t even convert the audio into text.”
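For a flavour of what tracking pitch, volume and intonation might look like in code, here is a rough sketch using the open-source librosa library. It covers only the feature-extraction step, run on a demo clip that ships with librosa; Behavioral Signals’ actual intent-detection algorithm is proprietary, and nothing here is drawn from it.

```python
# Sketch: extract the kinds of prosodic features mentioned above
# (pitch, volume, intonation) from the first 30 seconds of an audio clip.
# Uses librosa's bundled demo audio; a real system would analyse the call.
import librosa
import numpy as np

y, sr = librosa.load(librosa.example("trumpet"), duration=30.0)

# Pitch contour via probabilistic YIN (NaN where no pitch is detected).
f0, voiced_flag, voiced_probs = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)

# Volume proxy: root-mean-square energy per frame.
rms = librosa.feature.rms(y=y)[0]

print("mean pitch (Hz):", np.nanmean(f0))
print("pitch variability (Hz):", np.nanstd(f0))  # crude intonation proxy
print("mean volume (RMS):", rms.mean())
```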
More inspiringly, a team from Stanford University has used this technology to develop smart glasses that can help people with autism identify the emotions of the people they are talking to in real time.
With the likes of Amazon, Google, Meta and Microsoft investing heavily, emotion recognition technology is likely to become increasingly prevalent.
Car manufacturers are developing new features to identify when drivers are tired, distracted, or suffering from ‘road rage’. Amazon has even patented technology that would enable Alexa to identify when you have a cold and suggest ordering some medicine.
It raises the question – will it be a positive development to have ‘empathetic’ machines capable of reading our feelings? Or is it an invasion of our privacy and, well, a bit creepy?
“I have a bad feeling about this…” 🫣
Imagine arriving at the airport just in time to go through security and board your flight for a family holiday. You’re waiting patiently to get to the front of the queue, but starting to get anxious about how slowly it's moving. Finally, as the boarding call is announced, you empty your pockets into a tray, put your liquids in a clear plastic bag and walk through the scanners. Then, it becomes clear that there is a problem. A pair of security guards approach and instruct you to follow them.
The issue isn’t with your hand luggage but your emotions.
This isn’t a synopsis for an underwhelming Minority Report spin-off but something that is already happening across China. Emotion recognition tech is being deployed in airports and subway stations to identify potential criminals by analysing their mental state.
Police forces in the UK have been testing it out, too. In 2020, Lincolnshire Police announced that it was trialling a new AI-powered tool that would enable them to analyse the mood of people captured by CCTV. It raises the prospect of a future where police forces start using emotional analysis to prevent crimes before they happen.
If we’ve reached the point where emotion recognition technology could soon be used for something as important as law enforcement, you’d think we must now be pretty confident that it actually works.
But the evidence is far from conclusive.
Researchers at the University of Maryland analysed the results of emotion AIs developed by Microsoft and Chinese company Megvii using pictures of black and white basketball players. Writing up her findings in 2018, Lauren Rhue arrived at a worrying conclusion: “Both services interpret black players as having more negative emotions than white players.”
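To see how an audit like Rhue’s works in principle, here is a hedged sketch: run the same emotion API over two groups of photos and compare the average scores it returns for a negative emotion such as anger. The numbers below are fabricated for illustration; her study used the real outputs of Microsoft’s and Megvii’s face-analysis services.

```python
# Illustrative bias audit: compare an emotion API's average "anger"
# scores across two groups of images. The scores below are fabricated;
# a real audit would use the scores the API returns for each photo.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-image anger scores (0 to 1) from an emotion API.
scores_group_a = rng.normal(loc=0.30, scale=0.10, size=200)
scores_group_b = rng.normal(loc=0.22, scale=0.10, size=200)

gap = scores_group_a.mean() - scores_group_b.mean()
t, p = stats.ttest_ind(scores_group_a, scores_group_b)

print(f"mean score gap: {gap:.3f}")
print(f"t = {t:.2f}, p = {p:.4f}")  # a significant gap suggests bias
```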
Could this result from bias in the data used to train these systems? Or might it be something even more deep-rooted?
Microsoft abruptly pulled the emotion features from its facial recognition AI in the summer of 2022. Natasha Crampton, the software giant’s Chief Responsible AI Officer, explained that the decision wasn’t just an ethical one:
"[W]e have decided we will not provide open-ended API access to technology that can scan people’s faces and purport to infer their emotional states based on their facial expressions or movements. Experts inside and outside the company have highlighted the lack of scientific consensus on the definition of “emotions,” the challenges in how inferences generalize across use cases, regions, and demographics, and the heightened privacy concerns around this type of capability."
What is an emotion? 🧐
Inside Out was hailed as something of a masterpiece when it was released in 2015. Quite rightly, too. Like all Disney Pixar films, it combines spectacular animation with brilliant storytelling that will make you laugh and cry, whatever your age. But what made it really special was how it brought to life concepts from neuroscience and cognitive psychology.
If you haven’t seen it, you really should (ideally soon – a sequel is coming this summer!). I won’t give any spoilers, but here’s the set-up: Riley is an 11-year-old girl who has been uprooted from Minnesota to San Francisco because of her dad’s new job. Things don’t start well. The new house is cramped, Riley misses her old school, and – to add insult to injury – the local pizza parlour only serves broccoli pizza.
The genius of Inside Out is how it tells this story. The main location isn’t San Francisco but the inside of Riley’s brain, where five core emotions wrestle to control her thoughts and actions as she adapts to her new life.
The film was heavily praised for being rooted in science. The landscape of Riley’s mind was drawn with input from neuroscientists and took visual cues from the appearance of DNA strands and flashing neurons.
Disney and Pixar turned to Professor Paul Ekman, regarded as one of the most influential psychologists of the 20th century, to help develop their portrayal of emotions.
Ekman developed his Basic Emotion Theory in the 1960s. It revolved around the idea that there is a set of universal emotions that all humans share:
Happiness
Sadness
Fear
Disgust
Anger
Surprise
(Inside Out’s writers decided to rebrand ‘happiness’ as ‘joy’ and exclude ‘surprise’ because they felt it overlapped too much with ‘fear’.)
Ekman argued that these core emotions were always expressed and recognised in the same way. He carried out research among a remote tribe in Papua New Guinea that had been cut off from Western culture, and the results of his study seemed pretty conclusive: Wherever you live in the world, and whatever your cultural background, people experience the same emotions and use the same facial expressions to communicate them.
This concept of universal emotions became the accepted norm in the years that followed. It’s as fundamental to the storytelling of Inside Out as it is to the workings of today’s emotion recognition technology.
But what if it turns out to be completely wrong?
In the six decades since Ekman’s Pacific Ocean experiments*, an alternative view has emerged and the consensus has begun to unravel. A growing number of researchers now challenge Ekman’s theory that we are all hardwired with a defined set of emotions and programmed to express them in the same way.
Lisa Feldman Barrett, professor of psychology at Northeastern University in Boston, is one of the leading voices arguing for an alternative to Ekman’s theory. In her TED Talk on the subject, she said:
"I have studied emotions as a scientist for the past 25 years, and in my lab, we have probed human faces by measuring electrical signals that cause your facial muscles to contract to make facial expressions. We have scrutinized the human body in emotion. We have analyzed hundreds of physiology studies involving thousands of test subjects. We've scanned hundreds of brains, and examined every brain imaging study on emotion that has been published in the past 20 years.
"And the results of all of this research are overwhelmingly consistent. It may feel to you like your emotions are hardwired and they just trigger and happen to you, but they don't. You might believe that your brain is prewired with emotion circuits, that you're born with emotion circuits, but you're not."
Professor Feldman Barrett and others argue that our brains construct emotions in the moment through a complex process of integrating our past experience, context and anticipation of future events. Rather than being governed by our emotions (like Inside Out’s Riley), we create our own emotional experiences. Our ability to read other people’s emotions is just guesswork, influenced by factors like our culture, situation, and past experiences.
I don’t think this needs to diminish our enjoyment of Inside Out. After all, we shouldn’t take everything it portrays too literally.
However, the implications for emotion recognition technology – an industry soon to be worth $85 billion – could be profound. Feldman Barrett said:
"Tech companies [...] are spending millions of research dollars to build emotion-detection systems, and they are fundamentally asking the wrong question because they're trying to detect emotions in the face and the body, but emotions aren't in your face and body.
"Physical movements have no intrinsic emotional meaning. We have to make them meaningful. A human or something else has to connect them to the context, and that makes them meaningful. That's how we know that a smile might mean sadness and a cry might mean happiness, and a stoic, still face might mean that you are angrily plotting the demise of your enemy."
Professor Sandra Wachter from the Oxford Internet Institute argues that emotion-reading technology “at its best [has] no proven basis in science and at its worst is absolute pseudoscience.”
Perhaps the most persuasive argument favouring this more complex theory is that – if we are honest – we are often pretty bad at reading other people’s emotions. And if this is a struggle for humans, we should definitely be wary of placing too much trust in machines to do a better job.
Recommended links and further reading
What if emotions aren't universal but specific to each culture? (Aeon)
AI is increasingly being used to identify emotions – here's what's at stake (The Conversation)
Shhhh, they're listening – inside the coming voice-profiling revolution (The Conversation)
Emotion recognition: can AI detect human feelings from a face? (Financial Times)
Emotion-detecting AIs are here – do they work and how should we feel? (New Scientist)
AI can read your emotions. Should it? (The Observer)
* The methodology used by Ekman has since been questioned for a number of reasons, even down to the photographs he was using. These were posed images, where volunteers had been told to ‘act out’ an emotion. This meant that the ‘emotions’ the Papua New Guinea tribe were being asked to assign to images weren’t even real. This raises another question about the reliability of emotion recognition: what happens when people are ‘masking’ their true feelings – putting on a smile, acting angry, feigning surprise?