CORDIS - EU research results

Integrated and Detailed Image Understanding


Teaching AI to teach itself

New algorithms aim to give artificial intelligence the ability to not only identify objects, but also interpret what it sees.

Digital Economy

Although humans use vision for pretty much everything we do, it is often taken for granted. Yet making sense of an image is an extraordinarily complex process; in fact, some researchers estimate that vision engages about half of the brain.

“This complex process lets us see not only a car, but a blue car; not only a person, but a man wearing a red T-shirt,” says Andrea Vedaldi, a professor of Computer Vision and Machine Learning at the University of Oxford. According to Vedaldi, this detailed understanding of what we see is critical to decision-making. “If we see a red light and another vehicle not slowing down, we immediately interpret this as a potentially dangerous situation and act accordingly,” he adds.

Here lies the key problem with artificial intelligence (AI). While AI does a fairly good job of identifying objects, it lacks the ability to interpret what it sees – which can be rather problematic in applications such as autonomous vehicles or pilotless drones. “Whereas babies learn to understand images by themselves, with little to no external input, AI must be taught this skill through extensive and detailed manual supervision,” explains Vedaldi.

With the support of the EU-funded IDIU project, Vedaldi and his team of researchers are working to change that. “Our goal was to develop a new generation of image-understanding algorithms with a power and flexibility closer to human vision,” he notes.

No supervision needed

The IDIU project, which received support from the European Research Council (ERC), addresses one of the major bottlenecks of modern computer vision: the need for supervision. Although algorithms can learn to solve complex image analysis tasks, to do so they first require thousands – if not millions – of labelled examples, essentially images that are manually annotated with their interpretation. Needless to say, this comes at a significant cost.

To streamline this process, the researchers developed several new technologies, including algorithms that can ‘do their own research’. They do this by automatically consulting internet resources such as Google and Wikipedia, as well as via a new mathematical approach to learning the geometry of objects in images and videos without the need for an external source of supervision.

“For the first time, we demonstrated that it is possible to learn the spatial structure of objects just by looking at images, without any external supervision,” says Vedaldi. “In other words, an algorithm can independently learn that a person has two arms, two legs and a certain pose.”
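The core idea behind supervision-free geometry learning can be illustrated with a toy sketch. The snippet below is not the project’s actual method; it is a minimal, hypothetical example of the equivariance principle often used in such work: if an image is shifted, a landmark detected in it should shift by the same amount, and that consistency can serve as a training signal without any manual labels. The `make_image` and `detect_landmark` functions are invented here purely for illustration.

```python
import numpy as np

def make_image(cx, cy, size=32):
    """Synthetic 'object': a bright Gaussian blob centred at (cx, cy)."""
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / 8.0)

def detect_landmark(img):
    """A trivial stand-in detector: the brightest pixel, as (row, col)."""
    return np.unravel_index(np.argmax(img), img.shape)

# Equivariance check: shifting the object should shift the detection
# by exactly the same amount. The mismatch between the two sides can
# be used as a loss, yielding a supervision-free learning signal.
dy, dx = 5, 3
img = make_image(10, 12)
shifted = make_image(10 + dx, 12 + dy)

y0, x0 = detect_landmark(img)
y1, x1 = detect_landmark(shifted)
assert (y1 - y0, x1 - x0) == (dy, dx)  # detection moved with the object
```

In a real system, the hand-written detector would be replaced by a neural network, and the assertion by a differentiable loss minimised over many images and random transformations.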

Human-like flexibility

While AI is still a long way from matching human intelligence, the developments achieved by the IDIU project give it a human-like level of flexibility. “By pioneering a new sub-area of AI, which we call internal learning, this project will have a major impact on future research and industry,” says Vedaldi.

That impact is already being felt, as the project’s results are currently being used in an ERC Consolidator Grant. “Using the IDIU findings as a foundation, we are now building machines that can learn to see completely automatically via the passive ingestion of casually recorded images and videos,” concludes Vedaldi. “We expect that this technology will make computer vision much more easily applicable and thus flexible and useful to many of tomorrow’s critical applications.”

Keywords

IDIU, artificial intelligence, AI, algorithms, computer vision, machine learning
