AI flaws and misperception: the 'weird events' that make machines hallucinate

Updated: 2019/1/22 16:18:15 Source: New York Times Chinese website (纽约时报中文网) Author: anonymous

The 'weird events' that make machines hallucinate
The passenger registers the stop sign and feels a sudden surge of panic as the car he’s sitting in speeds up. He opens his mouth to shout to the driver in the front, remembering – as he spots the train tearing towards them on the tracks ahead – that there is none. The train hits at 125mph, crushing the autonomous vehicle and instantly killing its occupant.

This scenario is fictitious, but it highlights a very real flaw in current artificial intelligence frameworks. Over the past few years, there have been mounting examples of machines that can be made to see or hear things that aren’t there. When ‘noise’ is introduced that scrambles their recognition systems, these machines can be made to hallucinate. In a worst-case scenario, they could ‘hallucinate’ a scenario as dangerous as the one above: although the stop sign is clearly visible to human eyes, the machine fails to recognise it.

Those working in AI describe such glitches as ‘adversarial examples’ or sometimes, more simply, as ‘weird events’.

“We can think of them as inputs that we expect the network to process in one way, but the machine does something unexpected upon seeing that input,” says Anish Athalye, a computer scientist at Massachusetts Institute of Technology in Cambridge.

Seeing things

So far, most of the attention has been on visual recognition systems. Athalye himself has shown it is possible to tamper with an image of a cat so that it looks normal to our eyes but is misinterpreted as guacamole by so-called neural networks – the machine-learning algorithms that are driving much of modern AI technology. These sorts of visual recognition systems are already being used to underpin your smartphone’s ability to tag photos of your friends without being told who they are or to identify other objects in the images on your phone.
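
To make the idea of a change that is invisible to people yet decisive for a machine concrete, here is a minimal sketch. It is not Athalye's method: it assumes a toy linear classifier with random, hypothetical weights and a made-up "image", and it nudges every pixel by the same tiny amount in the direction that lowers the "cat" score.

```python
import numpy as np

def predict(w, b, x):
    """Toy linear classifier: a positive score means 'cat', otherwise 'guacamole'."""
    return "cat" if float(w @ x + b) > 0 else "guacamole"

def adversarial_perturbation(w, b, x, margin=1e-3):
    """Return a perturbed copy of x whose label flips, plus the per-pixel step used."""
    score = float(w @ x + b)
    # Moving every pixel by eps against sign(w) changes the score by eps * sum(|w|),
    # so this eps is just large enough to push the score past zero.
    eps = (abs(score) + margin) / np.abs(w).sum()
    x_adv = x - np.sign(score) * eps * np.sign(w)
    return x_adv, eps

rng = np.random.default_rng(0)
n_pixels = 64 * 64
w = rng.normal(size=n_pixels)            # hypothetical "learned" weights
x = rng.uniform(0.0, 1.0, n_pixels)      # hypothetical image, pixel values in [0, 1]
b = 1.0 - float(w @ x)                   # bias chosen so the clean image scores about +1

x_adv, eps = adversarial_perturbation(w, b, x)
print(predict(w, b, x), "->", predict(w, b, x_adv), "with per-pixel change", eps)
```

Because the change is spread over thousands of pixels, each pixel moves by a small fraction of its range, yet the summed effect on the score is enough to flip the label. The sketch does not clip pixel values back into [0, 1], and real attacks on deep networks use the network's gradient rather than a fixed weight vector.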

More recently, Athalye and his colleagues turned their attention to physical objects. By slightly tweaking the texture and colouring of these, the team could fool the AI into thinking they were something else. In one case a baseball was misclassified as an espresso and in another a 3D-printed turtle was mistaken for a rifle. They were able to produce some 200 other examples of 3D-printed objects that tricked the computer in similar ways. As we begin to put robots in our homes, autonomous drones in our skies and self-driving vehicles on our streets, this starts to throw up some worrying possibilities.

“At first this started off as a curiosity,” says Athalye. “Now, however, people are looking at it as a potential security issue as these systems are increasingly being deployed in the real world.”

Take driverless cars, which are currently undergoing field trials: these often rely on sophisticated deep learning neural networks to navigate and tell them what to do.

But last year, researchers demonstrated that neural networks could be tricked into misreading road ‘Stop’ signs as speed limit signs, simply through the placement of small stickers on the sign.

Hearing voices

Neural networks aren’t the only machine learning frameworks in use, but the others also appear vulnerable to these weird events. And they aren’t limited to visual recognition systems.

“On every domain I've seen, from image classification to automatic speech recognition to translation, neural networks can be attacked to mis-classify inputs,” says Nicholas Carlini, a research scientist at Google Brain, which is developing intelligent machines. Carlini has shown how – with the addition of what sounds like a bit of scratchy background noise – a voice reading “without the dataset the article is useless” can be mistranscribed as “Ok Google browse to evil dot com”. And it is not just limited to speech. In another example, an excerpt from Bach’s Cello Suite No. 1 was transcribed as “speech can be embedded in music”.

To Carlini, such adversarial examples “conclusively prove that machine learning has not yet reached human ability even on very simple tasks”.

Under the skin

Neural networks are loosely based on how the brain processes visual information and learns from it. Imagine a young child learning what a cat is: as they encounter more and more of these creatures, they will start noticing patterns – that this blob called a cat has four legs, soft fur, two pointy ears, almond-shaped eyes and a long fluffy tail. Inside the child’s visual cortex (the section of the brain that processes visual information), there are successive layers of neurons that fire in response to visual details, such as horizontal and vertical lines, enabling the child to construct a neural ‘picture’ of the world and learn from it.

Neural networks work in a similar way. Data flows through successive layers of artificial neurons until, after being trained on hundreds or thousands of examples of the same thing (usually labelled by a human), the network starts to spot patterns which enable it to predict what it is viewing. The most sophisticated of these systems employ ‘deep learning’, which means they possess more of these layers.
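
As a rough illustration of data flowing through successive layers, the sketch below passes one input through a small stack of artificial neurons and returns a probability for each class. The layer sizes and random weights are assumptions made for the example; the network is untrained, and the training step that adjusts the weights from labelled examples is left out.

```python
import numpy as np

def relu(z):
    """Each artificial neuron outputs its weighted input, or zero if that is negative."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn raw output scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(layers, x):
    """Pass x through successive (weights, bias) layers and return class probabilities."""
    activation = x
    for weights, bias in layers[:-1]:
        activation = relu(weights @ activation + bias)
    weights, bias = layers[-1]
    return softmax(weights @ activation + bias)

rng = np.random.default_rng(0)
sizes = [16, 8, 8, 3]  # hypothetical: 16 inputs, two hidden layers, 3 output classes
layers = [
    (rng.normal(scale=0.5, size=(n_out, n_in)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

probs = forward(layers, rng.uniform(size=16))
print(probs)
```

A "deeper" network in this sketch would simply have more entries in `sizes`.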

However, although computer scientists understand the nuts and bolts of how neural networks work, they don’t necessarily know the fine details of what’s happening when they crunch data. “We don't currently understand them well enough to, for example, explain exactly why the phenomenon of adversarial examples exists and know how to fix it,” says Athalye.

Part of the problem may relate to the nature of the tasks that existing technologies have been engineered to solve: distinguishing between images of cats and dogs, say. To do this, the technology will process numerous examples of cats and dogs, until it has enough data points to distinguish between them.

“The dominant goal of our machine learning frameworks was to achieve a good performance ‘on average’,” says Aleksander Madry, another computer scientist at MIT, who studies the reliability and security of machine learning frameworks. “When you just optimise for being good on most dog images, there will always be some dog images that will confuse you.”

One solution might be to train neural networks with more challenging examples of the thing you’re trying to teach them. This can immunise them against outliers.

“Definitely it is a step in the right direction,” says Madry. While this approach does seem to make frameworks more robust, it probably has limits as there are numerous ways you could tweak the appearance of an image or object to generate confusion.
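
One common way to train on "more challenging examples" is adversarial training, in which each training input is replaced by its worst-case perturbed version before the model is updated. The sketch below assumes a simple logistic-regression model and synthetic data rather than a deep network, and every name and parameter in it is illustrative; it is not the procedure used by Madry's group.

```python
import numpy as np

def worst_case(X, y, w, eps):
    """For a linear model, the worst per-feature shift of size eps pushes against the label."""
    return X - eps * y[:, None] * np.sign(w)[None, :]

def train(X, y, eps=0.0, steps=500, lr=0.1):
    """Logistic regression by gradient descent; eps > 0 trains on perturbed inputs."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        Xt = worst_case(X, y, w, eps) if eps > 0 else X
        margins = y * (Xt @ w + b)
        # Gradient of the mean logistic loss log(1 + exp(-margin)).
        g = -y / (1.0 + np.exp(margins))
        w -= lr * (g[:, None] * Xt).mean(axis=0)
        b -= lr * g.mean()
    return w, b

def accuracy(X, y, w, b):
    """Fraction of examples on the correct side of the decision boundary."""
    return float(np.mean(y * (X @ w + b) > 0))

rng = np.random.default_rng(0)
y = rng.choice([-1.0, 1.0], size=200)                   # labels, e.g. dog vs cat
X = y[:, None] * 0.5 + rng.normal(size=(200, 10))       # synthetic 10-feature data
eps = 0.2                                               # assumed perturbation budget

w_std, b_std = train(X, y)
w_adv, b_adv = train(X, y, eps=eps)
for name, (w, b) in {"standard": (w_std, b_std), "adversarial": (w_adv, b_adv)}.items():
    clean = accuracy(X, y, w, b)
    robust = accuracy(worst_case(X, y, w, eps), y, w, b)
    print(name, "clean:", clean, "under attack:", robust)
```

For either model, accuracy under attack can never exceed clean accuracy, which is one way to see the gap between doing well "on average" and doing well on the hardest inputs. The sketch only covers one kind of perturbation, a bounded shift per feature, which mirrors the limitation noted above: there are many other ways to alter an input.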

A truly robust image classifier would replicate what ‘similarity’ means to a human: it would understand that a child’s doodle of a cat represents the same thing as a photo of a cat and a real-life moving cat. Impressive as deep learning neural networks are, they are still no match for the human brain when it comes to classifying objects, making sense of their environment or dealing with the unexpected.

If we want to develop truly intelligent machines that can function in real world scenarios, perhaps we should go back to the human brain to better understand how it solves these issues.

Binding problem

Although neural networks were inspired by the human visual cortex, there’s a growing acknowledgement that the resemblance is merely superficial. A key difference is that as well as recognising visual features such as edges or objects, our brains also encode the relationships between those features – so, this edge forms part of this object. This enables us to assign meaning to the patterns we see.

“When you or I look at a cat, we see all the features that make up cats and how they all relate to one another,” says Simon Stringer of the Oxford Foundation for Theoretical Neuroscience and Artificial Intelligence. “This ‘binding’ information is what underpins our ability to make sense of the world, and our general intelligence.”

This critical information is lost in the current generation of artificial neural networks.

“If you haven’t solved binding, you might be aware that somewhere in the scene there is a cat, but you don’t know where it is, and you don’t know what features in a scene are part of that cat,” Stringer explains.

In their desire to keep things simple, engineers building artificial neural frameworks have ignored several properties of real neurons – the importance of which is only beginning to become clear. Neurons communicate by sending action potentials or ‘spikes’ down the length of their bodies, which creates a time delay in their transmission. There’s also variability between individual neurons in the rate at which they transmit information – some are quick, some slow. Many neurons seem to pay close attention to the timing of the impulses they receive when deciding whether to fire themselves.

“Artificial neural networks have this property that all neurons are exactly the same, but the variety of morphologically different neurons in the brain suggests to me that this is not irrelevant,” says Jeffrey Bowers, a neuroscientist at the University of Bristol who is investigating which aspects of brain function aren’t being captured by current neural networks.

Another difference is that, whereas synthetic neural networks are based on signals moving forward through a series of layers, “in the human cortex there are as many top-down connections as there are bottom-up connections”, says Stringer.

His lab develops computer simulations of the human brain to better understand how it works. When they recently tweaked their simulations to incorporate this information about the timing and organisation of real neurons, and then trained them on a series of visual images, they spotted a fundamental shift in the way their simulations processed information.

Rather than all of the neurons firing at the same time, they began to see the emergence of more complex patterns of activity, including the existence of a subgroup of artificial neurons that appeared to act like gatekeepers: they would only fire if the signals they received from related lower- and higher-level features in a visual scene arrived at the same time.
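
The gatekeeper behaviour described here resembles a coincidence detector: a unit that fires only when two signals arrive close together in time. The toy sketch below is an assumption-laden illustration of that idea, not Stringer's simulation; the spike times, transmission delays and 2 ms window are invented for the example.

```python
def arrival_times(emitted_ms, delay_ms):
    """Spikes arrive later than they were emitted because transmission takes time."""
    return [t + delay_ms for t in emitted_ms]

def binding_neuron_fires(bottom_up_ms, top_down_ms, window_ms=2.0):
    """Return the bottom-up arrival times that coincide with some top-down arrival."""
    fired = []
    for t_up in bottom_up_ms:
        if any(abs(t_up - t_down) <= window_ms for t_down in top_down_ms):
            fired.append(t_up)
    return fired

# Hypothetical spike trains: a low-level feature (an edge) and a high-level one (an object).
low_level = arrival_times([10.0, 30.0, 50.0], delay_ms=5.0)    # arrives at 15, 35, 55 ms
high_level = arrival_times([12.0, 40.0, 51.0], delay_ms=3.0)   # arrives at 15, 43, 54 ms

fires = binding_neuron_fires(low_level, high_level)
print(fires)  # [15.0, 55.0]
```

The middle spike at 35 ms has no top-down partner within the window, so the unit stays silent for it. Changing either delay changes which pairs coincide, which is why variability in transmission speed matters in this picture.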

Stringer thinks that these “binding neurons” may act like the brain’s equivalent of a marriage certificate: they formalise the relationships between neurons and provide a means of fact-checking whether two signals that appear related really are related. In this way, the brain can detect whether two diagonal lines and a curved line appearing in a visual scene, for example, really represent a feature like a cat’s ear, or something entirely unrelated.

“Our hypothesis is that the feature binding representations present in the visual brain, and replicated in our biological spiking neural networks, may play an important role in contributing to the robustness of biological vision, including the recognition of objects, faces and human behaviours,” says Stringer.

Stringer’s team is now seeking evidence for the existence of such neurons in real human brains. They are also developing ‘hybrid’ neural networks that incorporate this new information to see if they produce a more robust form of machine learning.

“Whether this is what happens in the real brain is unclear at this point, but it is certainly intriguing, and highlights some interesting possibilities,” says Bowers.

One thing Stringer’s team will be testing is whether their biologically-inspired neural networks can reliably discriminate between an elderly person falling over in their home, and simply sitting down, or putting the shopping down.

“This is still a very difficult problem for today’s machine-vision algorithms, and yet the human brain can solve this effortlessly,” says Stringer. He is also collaborating with the Defence Science and Technology Laboratory at Porton Down, in Wiltshire, England, to develop a next generation, scaled-up version of his neural framework that could be applied to military problems, such as spotting enemy tanks from smart cameras mounted on autonomous drones.

Stringer’s goal is to have bestowed rat-like intelligence on a machine within 20 years. Still, he acknowledges that creating human-level intelligence may take a lifetime – maybe even longer.

Madry agrees that this neuroscience-inspired approach is an interesting way of solving the problems with current machine learning algorithms.

“It is becoming ever clearer that the way the brain works is quite different to how our existing deep learning models work,” he says. “So, this indeed might end up being a completely different path to achieving success. It is hard to say how viable it is and what the timeframe needed to achieve success here is.”

In the meantime, we may need to avoid placing too much trust in the AI-powered robots, cars and programmes that we will be increasingly exposed to. You just never know if they might be hallucinating.
