Machine minds are often described as black boxes, their decision-making processes all but inscrutable. But in the case of machine intelligence, researchers are cracking that black box open and peering inside. What they find is that humans and machines don’t pay attention to the same things when they look at pictures – not at all.

Researchers at Facebook and Virginia Tech in Blacksburg got humans and machines to look at pictures and answer simple questions – a task that neural-network-based artificial intelligence can handle. But the researchers weren’t interested in the answers. They wanted to map human and AI attention, in order to shed a little light on the differences between us and them.

“These attention maps are something we can measure in both humans and machines, which is pretty rare,” says Lawrence Zitnick at Facebook AI Research. Comparing the two could provide insight “into whether computers are looking in the right place”.


First, Zitnick and his colleagues asked human workers on Amazon Mechanical Turk to answer simple questions about a set of pictures, such as “What is the man doing?” or “What number of cats are lying on the bed?” Each picture was blurred, and the worker would have to click around to sharpen it. A map of those clicks served as a guide to what part of the picture they were paying attention to.

The researchers then asked the same questions of two neural networks trained to interpret images. They mapped what parts of the picture each network chose to sharpen and explore.

The team found that the attention maps from two humans scored 0.63 on a scale where 1 is total overlap and -1 is none. AI and human attention maps had an overlap score of 0.26 (see image below). Despite this, neural networks are pretty good at deciding what an image shows, so there is an element of mystery to their skill.

Chart of experimental results showing correlation between human and machine areas of focus
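For readers who want a feel for how such an overlap score might be computed, here is a minimal sketch. The article does not name the exact metric, so the use of a rank correlation between two attention maps is an assumption, based only on the chart's mention of correlation and the 1 to -1 scale. The function names `rank_data` and `attention_overlap` are hypothetical, and each map is assumed to be a 2D grid of attention weights of the same size.

```python
import numpy as np


def rank_data(values):
    """Return 1-based ranks, giving tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")  # stable sort
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks of equal values so ties are treated the same.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def attention_overlap(map_a, map_b):
    """Rank correlation between two attention maps (assumed metric).

    Returns a value between -1 and 1. A map with no variation at all
    (every pixel equal) yields NaN, because its ranks have zero spread.
    """
    a = np.ravel(np.asarray(map_a, dtype=float))
    b = np.ravel(np.asarray(map_b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("attention maps must have the same number of pixels")
    # Pearson correlation of the ranks is the Spearman rank correlation.
    return float(np.corrcoef(rank_data(a), rank_data(b))[0, 1])


if __name__ == "__main__":
    # Toy 3x4 "attention maps", not data from the study.
    human = np.arange(12).reshape(3, 4)
    print(attention_overlap(human, human))   # same regions ranked the same way
    print(attention_overlap(human, -human))  # regions ranked in opposite order
```

Under this assumed metric, two maps that rank every region in the same order score close to 1, and two maps that rank regions in exactly opposite order score close to -1.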

“Machines do not seem to be looking at the same regions as humans, which suggests that we do not understand what they are basing their decisions on,” says Dhruv Batra at Virginia Tech.

This gap between humans and machines could be a useful source of inspiration for researchers looking to tweak their neural nets. “Can we make them more human-like, and will that translate to higher accuracy?” Batra asks.

The results intrigue Jürgen Schmidhuber at the Dalle Molle Institute for Artificial Intelligence Research in Manno, Switzerland, although he cautions that researchers shouldn’t necessarily rush to build systems that exactly mimic humans.

“Selective attention is all about actively filling gaps in the attentive observer’s knowledge,” says Schmidhuber. Humans have wider experience and knowledge than neural nets, and so are better at focusing on what matters. “What’s interesting to one system may be boring to another that already knows it.”

