“WHAT am I doing now? How about now? And now?” University students have been popping GoPro video cameras on their heads and filming a first-person view of their daily lives, then asking a computer to interpret it.
Vain though it may sound, the exercise has a point. Researchers want artificial intelligences to understand us better – and teaching them to see the world through our eyes is a good place to start.
“It allows us to indirectly tap into human minds,” says Gedas Bertasius at the University of Pennsylvania in Philadelphia. “It allows us to reason much more accurately about human behaviour – the connection between what we see and how we’re going to act.”
Bertasius and his colleagues are building EgoNet, a neural network system that tries to predict which objects someone might be interested in. Volunteers annotated videos of their day-to-day lives frame by frame, marking where their attention was focused in each scene. The team then fed the footage into EgoNet, repeatedly asking it to predict what the person was doing. That annotated data helped train it to make predictions, picking out things a person was about to touch or look at more closely.
EgoNet examines the world through two lenses, backed by separate neural networks. One looks for objects likely to stand out to someone – because of being brightly coloured, say, or being centrally placed in the scene. The other estimates how each object might relate to that person. Is it within …
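The two-lens idea can be sketched in a few lines of code. This is a toy illustration only, not EgoNet's actual implementation: the object features, weights and scoring functions below are invented for the example, standing in for what the two real neural networks would learn.

```python
import math

# Toy two-branch "interest" scorer, mirroring the two lenses described above:
# one branch for visual salience, one for the object's relation to the wearer.
# All features and weights here are made up for illustration.

def salience_score(obj):
    """How much the object stands out: brightness and central placement."""
    # Centrality falls off with distance from the frame centre (0.5, 0.5).
    centrality = 1.0 - math.hypot(obj["x"] - 0.5, obj["y"] - 0.5)
    return 0.5 * obj["brightness"] + 0.5 * centrality

def relation_score(obj):
    """How the object relates to the person: closer objects score higher."""
    return 1.0 / (1.0 + obj["depth_m"])

def interest(obj, w_sal=0.5, w_rel=0.5):
    """Fuse the two lenses into a single predicted-interest score."""
    return w_sal * salience_score(obj) + w_rel * relation_score(obj)

# Hypothetical frame: a mug near the wearer's hand vs a bright distant poster.
frame = [
    {"name": "mug",    "x": 0.55, "y": 0.6, "brightness": 0.4, "depth_m": 0.4},
    {"name": "poster", "x": 0.2,  "y": 0.1, "brightness": 0.9, "depth_m": 3.0},
]
ranked = sorted(frame, key=interest, reverse=True)
print([o["name"] for o in ranked])  # the nearby mug outranks the bright poster
```

The point of the split is that neither branch alone suffices: the poster wins on raw salience, but combining salience with the person-relation branch lets the nearby mug come out on top.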