CORDIS - EU research results

Content archived on 2023-03-01

Seeing the need for improved machine perception

"How do we replicate reality?" asks Professor Solomon. He wonders how machine vision can be matched to human capabilities: "You think I am here. But can we learn from how our brains work to build a device that gives the same impression that someone is really there?"

Solomon, a Senior Scientist at the University of Pennsylvania's Programme on Vision Science and Advanced Networking, believes that there is a mismatch between the way electronic systems attempt to present a picture of what we see and the way the human eye and brain work together to portray reality. Taking our perception of colour as an example, he says we do not really understand how the human brain receives information from the eye and interprets it as colour. Nature has given us an outstanding visual system, he says, and as yet we are not able to replicate this capability with machine systems. "You cannot create the full colour spectrum from an RGB screen."

He illustrates his point by explaining how we see the stars in the night sky. The stars on a dim night are in colour, and yet we see them using only the retina's rods, the photoreceptors most sensitive to changes in light and dark, shape and movement. In bright light the cones take over. These are the photoreceptors that provide colour vision, each type being most sensitive to one of three colours (green, red or blue). He says the change occurs because, as primates, we evolved to get the best colour vision at dusk and dawn, traditionally the time of greatest danger for our species, when we could have been hunted.

In another example of the difference between the human eye and machine vision, he points out that the human eye never stops moving. Even when we look at a still image, there is a movement of 10-20 frames per second as the eye drifts across the image. The human retina is capable of basic visual processing on several levels. The retina has around 110 million sensors, he states. Something like 100,000 cones occupy the centre of the retina; the rest are mostly rods.

Weaknesses of machine vision

Machine vision, by contrast, focuses on edge definition. When an edge appears, even a faint one, the signal strength shoots up, which makes the edge appear sharper.
This is the basic principle of High Definition Television (HDTV), he says: to make a picture appear sharper than it really is. The MPEG standard also focuses on edges and movement; it is not good on colour. In that way it delivers only one part of the capability of the human eye. MPEG removes colour information and movement detail in some parts of the spectrum, whereas the human retina can detect movement of less than 1/10 of a second. In addition, he notes, video cameras do not even have the same ability to pick out detail as film cameras.

The latest research into the human brain, using direct brain scanning devices such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET), indicates that the human sensory system is much more sensitive to stimuli from visual displays than has previously been understood. It may be that designs for visual display, storage and processing devices will have to be rethought to better match the capabilities of the human perceptual system.

"RGB is a form of lossy data compression," says Solomon, one that eliminates information deemed redundant or unnecessary. Whereas human vision systems are capable of seeing luminance, textures, even near infra-red and ultra-violet, he says, RGB fools the human vision system into seeing specific colours but eliminates all intermediate chromatic information. Even advanced multi-spectral cameras see only a few bands, never the full width of the human vision system. Hue is the most important aspect of colour to get right, he believes. Even a seven-colour glossy print uses only a tiny proportion of the total colour spectrum. Compare that to a professional film scanner, he says. He notes that in scanning technology we tend to use only two to three bands of infra-red for professional applications. "What are the possibilities if we could use 200 bands? Could we combine infra-red and ultra-violet to look below the skin, for example?"

Pure or applied research?
He believes strongly that research should not be subject to the short-term commercial needs of industry. Focusing on the HDTV standard, he notes that any research done now is in the electronics area only, with a lot of money spent on engineering. There has been little research on the human biology, which would show up the weaknesses of the standard.

"What we need to do now is the research," he says. "Don't think about the support of industry. They are driven by their own short-term cycle of turning research into profit within a short time-frame. What we need is the research that will help us figure out what to make! Not to be told what to make by industry!" "What we don't have are the researchers who know how to design the stuff," he goes on.

Industry merely wants to sell what it already knows, he says, again using HDTV as an example of a standard which has been around for nearly 24 years and is still not a sales success. He argues that within 20 years all TV transmission will be via cable or satellite, not over the air. Within the last decade, US subscribers have received only 15 per cent of their broadcasts over the air. This is the basic problem with HDTV, he says. It is a system designed for over-air transmission, and already obsolete because of the technical demands of that medium. "With satellite and cable we can have a much better picture, but we are still working with the standard HDTV image."

Right time for new development effort

According to Solomon, we need to develop new techniques in perceptual research. We need to be able to represent images according to human perceptual capabilities, and even to be able to make them responsive to feedback from individuals. Fundamentally, he believes that it is simply not acceptable to discard potential perceptual inputs to the human brain based on faulty assumptions of what we perceive and do not perceive.
This is especially the case for critical scientific, medical and engineering objectives, rather than the design of consumer entertainment appliances.

The time is right to launch a new development effort, he says, as scientists now have more resources and more sophisticated instruments than they did in the past. The present direction of Japanese research is going the same way as in the US: an effort to find more sophisticated methods of image reproduction. If the EU wants to make a jump ahead, he believes, we should be carrying out the basic research to develop more effective visual media systems.

By way of example he refers to the experimental visual reproduction systems he developed at MIT in the US with the support of NASA. The faster the frame rate, the better the resolution, he states, quoting speeds of 60 to 72 frames per second for the experimental system his team developed. Since that time he has put more than 10 years of development into a more advanced variable speed design.

Call to support cross-disciplinary research

Solomon makes a strong appeal for funding mechanisms that will support cross-disciplinary research efforts. "The biggest problem now is the departmental structure of the universities," he says. "A physics professor will get no credit for writing a programme on biology. You're not going to be able to get a Nobel prize. It's good to be affiliated to a university," he continues, "but there are real problems in supporting interdisciplinary areas of research."

He says that there is still much to understand in attempting to reproduce human perceptual systems. "Human perception relies on more senses than we fully understand," he says. "We can tell when we don't feel well, when we don't feel right. We can sense when someone is standing behind us." The functioning of senses like these could become much better understood once we have developed image reproduction systems that come closer to the capability of the human eye and brain, he believes.
And the only way to reach that understanding is by doing the basic research.

Source: Based on interview with Professor Richard Jay Solomon

The IST Results service gives you online news and analysis on the emerging results from Information Society Technologies research. The service reports on prototype products and services ready for commercialisation as well as work in progress and interim results with significant potential for exploitation.

Countries

United States