This is my second post on the emerging tech of emotion-recognition AI. In my last post, I considered some of the consequences of algorithmic blind spots on likely applications of emotion-recognition tech. In this post, I’ll get into algorithmic bias.
We know that algorithms produced by machine learning reflect the biases in their training data and in the people who train them.
For example, object-recognition algorithms don’t do so well at recognizing objects in lower-income or non-Western households: they’re trained on objects, and contexts, found in affluent Western households, in part because that’s where the majority of images on the English-language Web come from.
Or, more pertinently, we know that face recognition has a problem identifying darker-skinned faces and non-male faces. It works best on white, adult, male faces, with an error rate of 1% or less, because most of the faces it was trained on were white adult males.
But it doesn’t do so well on faces with darker skin, trans faces, female faces, and so on. In 2018, a researcher named Joy Buolamwini found that popular face-recognition software got the gender wrong less than 1% of the time on white male faces, but around 30% of the time on dark-skinned female faces.
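To make that kind of gap concrete, here’s a minimal Python sketch of how you’d measure per-group error rates for a classifier. The function, the group labels, and the numbers in the toy data are all invented for illustration; this is not the Gender Shades data or code.

```python
# Hypothetical sketch: per-group error rates for a gender classifier.
# Group names and counts are made up to mirror the kind of gap described above.
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (group, true_label, predicted_label)."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, pred in records:
        totals[group] += 1
        if pred != truth:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Toy data: 1 error in 100 for one group, 30 in 100 for another (values invented).
sample = (
    [("lighter-skinned male", "male", "male")] * 99
    + [("lighter-skinned male", "male", "female")] * 1
    + [("darker-skinned female", "female", "female")] * 70
    + [("darker-skinned female", "female", "male")] * 30
)
print(error_rates_by_group(sample))
# {'lighter-skinned male': 0.01, 'darker-skinned female': 0.3}
```

An overall accuracy number averaged across everyone would hide exactly this kind of disparity, which is why the per-group breakdown matters.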
This is technology that’s in use today, being sold today.
So, it doesn’t take much imagination to extend that to emotion-monitoring software. It’s probably not going to be trained on perfectly balanced datasets. What could be the impacts of technology that identifies emotions accurately for a majority of people, but gives false readings for a minority?
The CanCon panel that sparked these posts was loosely focused on one application that’s currently being marketed, called RealEyes, which claims to use webcams to detect how a virtual focus group feels about ads.
If we’re just using a biased emotion-detection algorithm for advertising, then imagine a world where focus groups only work really well on white men. Suppose the emotion-monitoring software is a little fritzy for women’s faces, and misreads an appalled expression as an excited one. Now just imagine the hilariously bad ads you could see for tampons, pads, toilet paper.
The software doesn’t even have to be all that fritzy. It’s not hard to imagine emotion-detecting software that has trouble distinguishing between a smirk and a smile for some subset of people. But if you have a product that targets that subset, and you’re putting out the ads that get sarcastic smirks, you’re probably not selling a lot.
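Here’s a toy simulation, with entirely invented numbers, of the reasoning above: if the software misreads one group’s expressions far more often than another’s, the aggregate “positive reaction” score a virtual focus group reports can drift a long way from how people actually felt. Nothing here reflects any real product.

```python
# Toy sketch (invented numbers): a group-specific misreading rate skews the
# aggregate approval score an emotion-monitoring platform would report.
import random

random.seed(0)

def observed_score(true_reactions, flip_rate_by_group):
    """true_reactions: list of (group, liked_ad: bool).
    flip_rate_by_group: chance the software misreads that group's expression."""
    positives = 0
    for group, liked in true_reactions:
        reading = liked
        if random.random() < flip_rate_by_group.get(group, 0.0):
            reading = not liked  # e.g. an appalled face read as excited
        positives += reading
    return positives / len(true_reactions)

# A panel that mostly dislikes the ad, but whose faces the model often misreads.
panel = [("group_b", False)] * 80 + [("group_a", True)] * 20
print(observed_score(panel, {"group_a": 0.02, "group_b": 0.40}))
# True approval is 20%, but the reported score comes out much higher.
```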
More seriously, imagine a world where you start seeing fewer and fewer new products that white men wouldn’t find interesting, because they’re not selling. They’re not selling because the advertising is terrible, but the ad campaigns won’t get the blame, because they’re “AI verified”. People put a strange amount of faith in “AI” technology. I realize that a world where more products are targeted to straight white men IS the world we live in to a large degree, especially in tech, but now imagine it dialed up to 11.
Let’s imagine the technology is used in a multicultural setting, say Canada. The training dataset is going to carry cultural biases. To take the simplest example: in most Western cultures, nodding means yes and shaking the head means no, so you’d train the AI to read videos of people nodding vehemently as vehement agreement. But in parts of the Balkans, particularly Bulgaria, the convention is reversed: a nod means no and a shake means yes. So imagine the AI has been trained on a Western European dataset, and now it’s watching a Bulgarian-Canadian watching a political ad. The Bulgarian-Canadian hates the ad and keeps nodding and nodding, and the AI registers how much they love it.
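As a caricature of that failure mode, here’s a tiny sketch (the labels and mapping are made up) of a gesture-to-sentiment lookup learned from one cultural convention being applied to a viewer who follows the reversed one.

```python
# Minimal sketch (hypothetical labels): a gesture-to-sentiment mapping learned
# from Western European training data, applied to a viewer whose culture
# reverses the convention.
WESTERN_TRAINED = {"nod": "agrees", "head_shake": "disagrees"}

def read_reaction(gesture, model=WESTERN_TRAINED):
    return model.get(gesture, "unknown")

# A Bulgarian viewer nodding to mean "no":
print(read_reaction("nod"))  # -> "agrees", the opposite of what they meant
```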
There are also cultural factors around the expression of emotions. How much do we smile, how much do we laugh? Do we like to complain or to minimise our problems?
In my next few posts, I’ll look at some more possible consequences of using imperfect emotion-detection algorithms in the real world. Next up: assistive technologies.
UPDATE: I wrote this post in late May 2020, a couple of days before George Floyd was killed by police. I’d planned a subsequent post discussing the possible horrible consequences of racially-biased emotion-recognition algorithms in policing and in the new field of “predictive policing”. I no longer believe that post would add anything to the discourse – no one’s unaware of the problem of police racial bias post-George-Floyd, whether or not they choose to acknowledge it, or of its disastrous consequences. Adding tech based on racially-biased algorithms could only compound the problem, in predictably awful ways.