Facebook launches Deepfake Detection Challenge
Facebook is making its own deepfakes and offering prizes for detecting them
2019-09-17 09:50

Image and video manipulation powered by deep learning, or so-called “deepfakes,” represents a strange and horrifying facet of a promising new field. If we’re going to crack down on these creepy creations, we’ll need to fight fire with fire: Facebook, Microsoft, and many others are banding together to help make machine learning capable of detecting deepfakes, and they want you to help.

Though the phenomenon is still new, we are already in an arms race in which the methods of detection vie with the methods of creation. Ever more convincing fakes appear regularly, and while they are frequently benign, the possibility of having your face flawlessly grafted into a compromising position is very much there, and many a celebrity has already had it done to them.

Facebook, as part of a coalition with Microsoft, the Partnership on AI, and several universities including Oxford, Berkeley, and MIT, is working to empower the side of good with better detection techniques.

“The most interesting advances in AI have happened when there’s a clear benchmark on a dataset to write papers against,” said Facebook CTO Mike Schroepfer in a media call on Wednesday. The dataset for object recognition might be millions of images of ordinary objects, while the dataset for voice transcription would be hours of different kinds of speech. But there’s no such set for deepfakes.

We talked about this challenge at our Robotics and AI event earlier this year, in what I thought was a very interesting discussion.

Fortunately, Facebook plans to dedicate around $10 million in resources to make this Deepfake Detection Challenge happen.

“Creation of these datasets can be challenging, because you want to make sure that everyone participating in it is clear and gives consent so they aren’t surprised by the usage of it,” Schroepfer continued. And since most deepfakes are made without any consent whatsoever, they’re not really permissible for usage in an academic context.

So Facebook and its partners are making the deepfake content out of whole cloth, he said. “You want a dataset of source video, and then a dataset of personalities you can map onto that. Then we’re spending engineering time implementing the latest most advanced deepfake techniques to generate altered videos as part of the dataset.”

And while you’re entirely justified in wondering, no, they aren’t using Facebook data to do this. They’ve got paid actors.

This dataset will be provided to interested parties, who will be able to build solutions and test them, putting the results on a leaderboard. At some point there will be cash prizes given out, though the details are a ways off. With luck this will spur serious competition among academics and researchers.

“We need the full involvement of the research community in an open environment to develop methods and systems that can detect and mitigate the ill-effects of manipulated multimedia,” said the University of Maryland’s Rama Chellappa in a news release. “By making available a large corpus of genuine and manipulated media, the proposed challenge will excite and enable the research community to collectively address this looming crisis.”

Initial tests of the dataset are planned for the International Conference on Computer Vision (ICCV) in October, with the full launch happening at NeurIPS in December.
