Don’t look now: why you should be worried about machines reading your emotions | Technology
2019-05-03 19:53

Could a program detect potential terrorists by reading their facial expressions and behavior? This was the hypothesis put to the test by the US Transportation Security Administration (TSA) in 2003, as it began testing a new surveillance program called the Screening of Passengers by Observation Techniques program, or Spot for short.

While developing the program, they consulted Paul Ekman, emeritus professor of psychology at the University of California, San Francisco. Decades earlier, Ekman had developed a method to identify minute facial expressions and map them on to corresponding emotions. This method was used to train “behavior detection officers” to scan faces for signs of deception.

But when the program was rolled out in 2007, it was beset with problems. Officers were referring passengers for interrogation more or less at random, and the small number of arrests that came about were on charges unrelated to terrorism. Even more concerning was the fact that the program was allegedly used to justify racial profiling.

Ekman tried to distance himself from Spot, claiming his method was being misapplied. But others suggested that the program’s failure was due to an outdated scientific theory that underpinned Ekman’s method; namely, that emotions can be deduced objectively through analysis of the face.

In recent years, technology companies have started using Ekman’s method to train algorithms to detect emotion from facial expressions. Some developers claim that automatic emotion detection systems will not only be better than humans at discovering true emotions by analyzing the face, but that these algorithms will become attuned to our innermost feelings, vastly improving interaction with our devices.

But many experts studying the science of emotion are concerned that these algorithms will fail once again, making high-stakes decisions about our lives based on faulty science.

Your face: a $20bn industry

Emotion detection technology requires two techniques: computer vision, to precisely identify facial expressions, and machine learning algorithms to analyze and interpret the emotional content of those facial features.

Typically, the second step employs a technique called supervised learning, a process by which an algorithm is trained to recognize things it has seen before. The basic idea is that if you show the algorithm thousands and thousands of images of happy faces with the label “happy”, then when it sees a new picture of a happy face, it will, again, identify it as “happy”.

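The supervised learning loop described above can be sketched in a few lines. This is only a toy illustration, not any vendor's actual pipeline: real systems learn thousands of visual features from raw pixels, whereas here each face is reduced to two invented features (mouth curvature and brow height) and classified by nearest centroid.

```python
# Toy supervised learning: each "face" is a pair of hypothetical
# features (mouth_curvature, brow_height). Training computes one mean
# feature vector (centroid) per emotion label; prediction assigns the
# label whose centroid is closest to a new face.

def train(examples):
    """Compute the mean feature vector per label from labeled examples."""
    sums, counts = {}, {}
    for features, label in examples:
        s = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            s[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in s]
            for label, s in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is nearest (squared distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], features))

# In practice: thousands and thousands of labeled images. Here: three each.
training_data = [
    ([0.9, 0.6], "happy"), ([0.8, 0.5], "happy"), ([0.85, 0.55], "happy"),
    ([0.1, 0.2], "angry"), ([0.2, 0.1], "angry"), ([0.15, 0.15], "angry"),
]
model = train(training_data)
print(predict(model, [0.82, 0.58]))  # a new "happy-looking" face -> happy
```

Having seen many labeled examples, the model extends the pattern to unseen faces, which is exactly the generalization step the paragraph describes.
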
A graduate student, Rana el Kaliouby, was one of the first people to start experimenting with this approach. In 2001, after moving from Egypt to Cambridge University to undertake a PhD in computer science, she found that she was spending more time with her computer than with other people. She figured that if she could teach the computer to recognize and react to her emotional state, her time spent far away from family and friends would be less lonely.

Kaliouby dedicated the rest of her doctoral studies to work on this problem, eventually developing a device that helped children with Asperger syndrome read and respond to facial expressions. She called it the “emotional hearing aid”.

In 2006, Kaliouby joined the Affective Computing lab at the Massachusetts Institute of Technology, where together with the lab’s director, Rosalind Picard, she continued to improve and refine the technology. Then, in 2009, they co-founded a startup called Affectiva, the first business to market “artificial emotional intelligence”.

At first, Affectiva sold their emotion detection technology as a market research product, offering real-time emotional reactions to ads and products. They landed clients such as Mars, Kellogg’s and CBS. Picard left Affectiva in 2013 and became involved in a different biometrics startup, but the business continued to grow, as did the industry around it.

Amazon, Microsoft and IBM now advertise “emotion analysis” as one of their facial recognition products, and a number of smaller firms, such as Kairos and Eyeris, have cropped up, offering similar services to Affectiva.

Beyond market research, emotion detection technology is now being used to monitor and detect driver impairment, test user experience for video games and to help medical professionals assess the wellbeing of patients.

Kaliouby, who has watched emotion detection grow from a research project into a $20bn industry, feels confident that this growth will continue. She predicts a time in the not too distant future when this technology will be ubiquitous and integrated in all of our devices, able to “tap into our visceral, subconscious, moment by moment responses”.

A database of 7.5m faces from 87 countries

As with most machine learning applications, progress in emotion detection depends on accessing more high-quality data.

According to Affectiva’s website, they have the largest emotion data repository in the world, with over 7.5m faces from 87 countries, most of it collected from opt-in recordings of people watching TV or driving their daily commute.

These videos are sorted through by 35 labelers based in Affectiva’s office in Cairo, who watch the footage and translate facial expressions to corresponding emotions – if they see lowered brows, tight-pressed lips and bulging eyes, for instance, they attach the label “anger”. This labeled data set of human emotions is then used to train Affectiva’s algorithm, which learns how to associate scowling faces with anger, smiling faces with happiness, and so on.

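The labelling step can be thought of as matching observed facial cues against a fixed cue set per emotion, in the spirit of Emfacs-style coding. The cue names and mappings below are illustrative assumptions for demonstration, not Affectiva's actual coding scheme.

```python
# Illustrative rule-based labelling: each emotion "stereotype" is a set
# of facial cues; a face gets the label whose cue set it overlaps most.
EMOTION_RULES = {
    "anger": {"lowered_brows", "tight_pressed_lips", "bulging_eyes"},
    "happiness": {"raised_cheeks", "lip_corners_pulled_up"},
}

def label_face(observed_cues):
    """Return the emotion whose cue set best matches the observed cues."""
    best_label, best_overlap = None, 0
    for emotion, cues in EMOTION_RULES.items():
        overlap = len(cues & observed_cues)  # set intersection size
        if overlap > best_overlap:
            best_label, best_overlap = emotion, overlap
    return best_label

print(label_face({"lowered_brows", "tight_pressed_lips", "bulging_eyes"}))
# -> anger
```

Labels produced this way then become the training targets for the supervised learning step, which is why the algorithm can only ever learn the stereotypes encoded in the rules.
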
A face with lowered brows and tight-pressed lips meant 'anger' to a banker in the US and to a hunter in Papua New Guinea

This labelling method, which is considered by many in the emotion detection industry to be the gold standard for measuring emotion, is derived from a system called the Emotion Facial Action Coding System (Emfacs) that Paul Ekman and Wallace V Friesen developed during the 1980s.

The scientific roots of this system can be traced back to the 1960s, when Ekman and two colleagues hypothesized that there are six universal emotions – anger, disgust, fear, happiness, sadness and surprise – that are hardwired into us and can be detected across all cultures by analyzing muscle movements in the face.

To test the hypothesis, they showed diverse population groups around the world photographs of faces, asking them to identify what emotion they saw. They found that despite enormous cultural differences, humans would match the same facial expressions with the same emotions. A face with lowered brows, tight-pressed lips and bulging eyes meant “anger” to a banker in the United States and a semi-nomadic hunter in Papua New Guinea.

Over the next two decades, Ekman drew on his findings to develop his method for identifying facial features and mapping them to emotions. The underlying premise was that if a universal emotion was triggered in a person, then an associated facial movement would automatically show up on the face. Even if that person tried to mask their emotion, the true, instinctive feeling would “leak through”, and could therefore be perceived by someone who knew what to look for.

Throughout the second half of the 20th century, this theory – referred to as the classical theory of emotions – came to dominate the science of emotions. Ekman made his emotion detection method proprietary and began selling it as a training program to the CIA, FBI, Customs and Border Protection and the TSA. The idea of true emotions being readable on the face even seeped into popular culture, forming the basis of the show Lie to Me.

And yet, many scientists and psychologists researching the nature of emotion have questioned the classical theory and Ekman’s associated emotion detection methods.

In recent years, a particularly powerful and persistent critique has been put forward by Lisa Feldman Barrett, professor of psychology at Northeastern University.

Barrett first came across the classical theory as a graduate student. She needed a way to measure emotion objectively and turned to Ekman’s methods. On reviewing the literature, she began to worry that the underlying research methodology was flawed – specifically, she thought that by providing people with preselected emotion labels to match to photographs, Ekman had unintentionally “primed” them to give certain answers.

She and a group of colleagues tested the hypothesis by re-running Ekman’s tests without providing labels, allowing subjects to freely describe the emotion in the image as they saw it. The correlation between specific facial expressions and specific emotions plummeted.

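The effect Barrett measured can be illustrated with a toy agreement calculation. The response data below are invented solely to show how agreement with the "expected" emotion can collapse when subjects answer freely instead of picking from preselected labels.

```python
# Toy comparison of the two experimental conditions described above.
def agreement(responses, expected):
    """Fraction of subjects whose answer matched the expected emotion."""
    return sum(r == expected for r in responses) / len(responses)

# Forced choice: subjects pick from a preselected list of labels.
forced = ["anger", "anger", "anger", "disgust", "anger"]
# Free labelling: subjects describe the same face in their own words.
free = ["anger", "frustration", "concentration", "disgust", "determination"]

print(agreement(forced, "anger"))  # 0.8
print(agreement(free, "anger"))    # 0.2
```
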
Since then, Barrett has developed her own theory of emotions, which is laid out in her book How Emotions Are Made: the Secret Life of the Brain. She argues there are no universal emotions located in the brain that are triggered by external stimuli. Rather, each experience of emotion is constructed out of more basic parts.

“They emerge as a combination of the physical properties of your body, a flexible brain that wires itself to whatever environment it develops in, and your culture and upbringing, which provide that environment,” she writes. “Emotions are real, but not in the objective sense that molecules or neurons are real. They are real in the same sense that money is real – that is, hardly an illusion, but a product of human agreement.”

Barrett explains that it doesn’t make sense to talk of mapping facial expressions directly on to emotions across all cultures and contexts. While one person might scowl when they’re angry, another might smile politely while plotting their enemy’s downfall. For this reason, assessing emotion is best understood as a dynamic practice that involves automatic cognitive processes, person-to-person interactions, embodied experiences, and cultural competency. “That sounds like a lot of work, and it is,” she says. “Emotions are complicated.”

Kaliouby agrees – emotions are complex, which is why she and her team at Affectiva are constantly trying to improve the richness and complexity of their data. As well as using video instead of still images to train their algorithms, they are experimenting with capturing more contextual data, such as voice, gait and tiny changes in the face that take place beyond human perception. She is confident that better data will mean more accurate results. Some studies even claim that machines are already outperforming humans in emotion detection.

But according to Barrett, it’s not only about data, but how data is labeled. The labelling process that Affectiva and other emotion detection companies use to train algorithms can only identify what Barrett calls “emotional stereotypes”, which are like emojis, symbols that fit a well-known theme of emotion within our culture.

According to Meredith Whittaker, co-director of the New York University-based research institute AI Now, building machine learning applications based on Ekman’s outdated science is not just bad practice, it translates to real social harms.

“You’re already seeing recruitment companies using these techniques to gauge whether a candidate is a good hire or not. You’re also seeing experimental techniques being proposed in school environments to see whether a student is engaged or bored or angry in class,” she says. “This information could be used in ways that stop people from getting jobs or shape how they are treated and assessed at school, and if the analysis isn’t extremely accurate, that’s a concrete material harm.”

Kaliouby says that she is aware of the ways that emotion detection can be misused and takes the ethics of her work seriously. “Having a dialogue with the public around how this all works and where to apply and where not to apply it is critical,” she told me.

Having worn a headscarf in the past, Kaliouby is also keenly aware of the importance of building diverse data sets. “We make sure that when we train any of these algorithms the training data is diverse,” she says. “We need representation of Caucasians, Asians, darker skin tones, even people wearing the hijab.”

This is why Affectiva collects data from 87 countries. Through this process, they have noticed that in different countries, emotional expression seems to take on different intensities and nuances. Brazilians, for example, use broad and long smiles to convey happiness, Kaliouby says, while in Japan there is a smile that does not indicate happiness, but politeness.

Affectiva have accounted for this cultural nuance by adding another layer of analysis to the system, compiling what Kaliouby calls “ethnically based benchmarks”, or codified assumptions about how an emotion is expressed within different ethnic cultures.

But it is precisely this type of algorithmic judgment based on markers like ethnicity that worries Whittaker most about emotion detection technology, suggesting a future of automated physiognomy. In fact, there are already companies offering predictions for how likely someone is to become a terrorist or pedophile, as well as researchers claiming to have algorithms that can detect sexuality from the face alone.

Several studies have also recently shown that facial recognition technologies reproduce biases that are more likely to harm minority communities. One, published in December last year, shows that emotion detection technology assigns more negative emotions to black men’s faces than to their white counterparts.

When I brought up these concerns with Kaliouby she told me that Affectiva’s system does have an “ethnicity classifier”, but that they are not using it right now. Instead, they use geography as a proxy for identifying where someone is from. This means they compare Brazilian smiles against Brazilian smiles, and Japanese smiles against Japanese smiles.

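Comparing smiles only within a region amounts to scoring an expression against a per-region baseline rather than a single global one. Here is a minimal sketch of that idea; the baseline numbers are invented for illustration and are not Affectiva's actual benchmarks.

```python
# Region-based benchmarking sketch: the same raw smile intensity is
# judged relative to its regional norm, so it reads differently in
# different places. Baseline values below are made up.
REGION_BASELINES = {
    "Brazil": {"smile_mean": 0.7, "smile_std": 0.15},
    "Japan":  {"smile_mean": 0.4, "smile_std": 0.10},
}

def smile_zscore(region, smile_intensity):
    """How far is this smile from its regional baseline, in std units?"""
    b = REGION_BASELINES[region]
    return (smile_intensity - b["smile_mean"]) / b["smile_std"]

# One raw intensity, two very different readings:
print(round(smile_zscore("Brazil", 0.55), 2))  # below the Brazilian norm
print(round(smile_zscore("Japan", 0.55), 2))   # above the Japanese norm
```

A geography-as-proxy scheme keys the baseline on where the footage was captured, which is exactly what breaks down when a person's cultural background differs from their location.
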
“What if there was a Japanese person in Brazil,” I asked. “Wouldn’t the system think they were Brazilian and miss the nuance of the politeness smile?”

“At this stage,” she conceded, “the technology is not 100% foolproof.”
