Perceptual, Contextual, and Cross-modal Learning in Hearing and Vision

Final Report Summary - LEARN 2 HEAR & SEE (Perceptual, Contextual, and Cross-modal Learning in Hearing and Vision)

Summary of project objectives:

The main goal of this research program was to develop a transatlantic collaboration to study the processes of learning that result in improved human ability to perceive characteristics of objects in their sensory environments. The collaborating groups combined behavioural experiments, imaging studies and computational modelling to study 1) the mechanisms of learning in hearing and vision, and 2) the crossmodal factors that influence these learning processes. The collaboration was established between research groups at two European and two US institutions with the following complementary expertise:

- Safarik University in Kosice – behavioural and modelling examination of human and non-human spatial auditory perception,
- University of Edinburgh – neural modelling of visual perceptual learning and attention,
- UC Riverside – psychophysical, imaging and modelling studies of perceptual learning and spatial perception in humans,
- Boston University – psychophysics, modelling, and imaging studies of spatial hearing.

Two main directions of research were pursued, each focusing on the processes of learning in one sensory domain:

1. Learning in spatial hearing and distance perception: Safarik University, UC Riverside, Boston University,
2. Perceptual learning and attention in vision: UC Riverside and University of Edinburgh.

Description of performed work:

In the auditory domain, multiple experiments and modelling studies were performed. Two experiments examined the learning processes underlying the human ability to adapt to the acoustics of different rooms when judging the distance of auditory stimuli. Another set of experiments examined the effect of visual stimuli on auditory spatial perception (i.e. the ventriloquism effect) in the distance dimension. We also used functional magnetic resonance imaging (fMRI) and psychophysics to examine the brain areas involved in auditory distance processing. Finally, two modelling studies were performed, one examining what information humans use when estimating the distance of nearby sound sources in regular reverberant rooms, and one looking at talker localization in complex multi-talker environments.

In the visual domain, three experiments and three modelling studies were performed. The experiments examined how subjects learn and unlearn the statistics of simple visual stimuli and how their expectations influence perception. The modelling work explored whether participants' behaviour can be described in terms of probabilistic inference and aimed at characterising the neural substrate of such plasticity.

Main results:

Auditory studies:

In the first auditory experiment we showed that room-specific learning in auditory distance perception occurs when listeners are spontaneously judging distance in a specific room over the course of several days. We also showed that this learning is faster if the listeners are instructed to focus on what distance information they can extract from the room reflections of the sound.

In the second auditory experiment we showed that the process of room learning for distance perception is strongly influenced by the type of stimuli to which the listener is initially exposed. If the stimuli provide a lot of room-related and consistent un-related information, then quick learning occurs. On the other hand, if it is difficult to create an association between the stimuli and room characteristics, then the learning is much slower.

In the third experiment we were, to our knowledge, the first to identify the human brain areas responsible for processing auditory distance information. We used functional magnetic resonance imaging (fMRI) together with virtual acoustics techniques to simulate sound sources at varying distances.

A fourth series of experiments showed a strong effect of visual stimuli on auditory distance perception. These experiments were the first to systematically examine and compare the ventriloquism effect and aftereffect in the auditory distance domain.

Modelling work presented in Kopco & Shinn-Cunningham (2011) found that, in regular rooms, listeners only use the room reflection cues to judge distance, even though more reliable “binaural” cues are available in this environment. This result is important because a part of the learning/adaptation process the listeners undergo in new rooms is related to how they switch between individual cues as they move from one room to another.

Finally, we analyzed and modelled data from the auditory cortex of pallid bats. We found evidence for systematic representations of sound azimuth within individual binaural clusters in the pallid bat A1.

Visual studies:

In the first experiment, we showed that human participants quickly and unconsciously develop expectations for simple visual stimuli (in our experiment, the direction of motion of a cloud of random dots). Such expectations lead to better and faster detection of the expected stimulus, but also to biases and hallucinations for other stimuli: in general, other stimuli tended to be perceived as being more similar to the expected stimulus than they really were.

We have also investigated whether long-term expectations regarding motion of visual objects could be changed over a few days of learning. It had been previously demonstrated that human subjects had prior expectations that visual objects are static or move slowly. These expectations are thought to result from a lifetime of exposure to natural scene statistics and to be responsible for a number of biases and visual illusions. We found that such expectations are also plastic and could be changed over the course of a few experimental training sessions.
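One common way to formalise such a slow-speed expectation is as a Gaussian prior combined with a noisy sensory measurement. The sketch below is only an illustration under that assumption, not the model used in the project; the function name `posterior_speed` and all numerical values are hypothetical.

```python
def posterior_speed(measured, sigma_like, prior_mean=0.0, sigma_prior=1.0):
    """Combine a Gaussian prior over speed with a Gaussian measurement.

    Returns the posterior mean and standard deviation. All parameters are
    illustrative assumptions, not values taken from the experiments.
    """
    var_prior = sigma_prior ** 2
    var_like = sigma_like ** 2
    # Weight given to the measurement; the remainder goes to the prior mean.
    w = var_prior / (var_prior + var_like)
    mean = w * measured + (1.0 - w) * prior_mean
    sd = (var_prior * var_like / (var_prior + var_like)) ** 0.5
    return mean, sd


# A prior centred on zero speed pulls the estimate below the measured speed.
slow_mean, slow_sd = posterior_speed(measured=4.0, sigma_like=1.0)

# Hypothetical "after training" case: the prior has shifted to faster speeds,
# so the same measurement is no longer underestimated.
trained_mean, trained_sd = posterior_speed(
    measured=4.0, sigma_like=1.0, prior_mean=4.0
)
```

In this picture, a change in expectations over a few training sessions corresponds to a shift in `prior_mean` (or a change in `sigma_prior`), which in turn changes the size and direction of the perceptual bias.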

The third experiment showed that the complexity of the statistics that could be learned was limited: in some situations where two sets of stimuli with different statistics are presented in interleaved trials, subjects fail to learn the two statistics simultaneously.

Final results and their potential impact and use:

Humans and other living organisms are constantly exposed to new stimuli and environments. In order to correctly respond in such situations, they must recalibrate their perceptual processing in new environments and learn to recognize new stimuli and situations. The current results elucidate several aspects of perceptual learning and recalibration in normal healthy humans. These results can have broad socio-economic impact, because they help us answer the questions:

- How adaptive are our healthy perceptual systems?
- What methods can be used to re-learn/re-calibrate hearing or vision in humans in whom these perceptual systems are impaired?
- What methods can be used to improve the sensory systems of the elderly, in whom these systems gradually deteriorate?
- How can new technologies be developed, e.g. for communication and collaboration in complex virtual environments?
