
Visually guided grasping and its effects on visual representations

Periodic Reporting for period 1 - VisualGrasping (Visually guided grasping and its effects on visual representations)

Reporting period: 2018-04-02 to 2020-04-01

In everyday life, we effortlessly grasp and pick up objects without much thought. However, the ease with which we do this belies its computational complexity (see Fig 1, reprinted with permission from [Klein, Maiello et al; 2020]). To pick something up, our brains must work out which locations on the object will lead to stable, comfortable grasps so that we can perform a desired action, such as taking a sip from a cup or writing with a pen. Most potential grasps would actually fail: some would require the thumb and forefinger to cross over, while others would hold the object too far from its center, so that it slips under its own weight once we try to pick it up. Somehow, the brain has to work out which of all possible grasps will actually succeed. Despite this difficulty, we rarely drop objects or find ourselves unable to complete an action because we are holding the object the wrong way.
Vision helps us select, plan, and execute actions, including grasping, that allow us to interact with our environment. To do so, the visual system needs to reconstruct the 3D shape and layout of objects in our surroundings from ambiguous 2D retinal images—a mathematically under-constrained task. Understanding how we use vision to pick up and interact with objects effectively is thus one of the most important challenges in behavioral science. Even state-of-the-art robotic AIs can fail to visually identify effective grasps nearly 20% of the time [Levine et al, 2018].
The main objectives of the project were thus to understand how humans use vision to plan grasping, and to investigate the visual representations along the human dorsal visual stream that are thought to play an important role in grasp planning. We found that humans combine visual information about object 3D shape, orientation and material composition to identify optimal grasping locations across different objects. Furthermore, we were able to characterize the neural computations that the brain employs to reconstruct the 3D shape of objects.

Klein LK, Maiello G, Paulun VC, Fleming RW (2020) Predicting precision grip grasp locations on three-dimensional objects. PLOS Computational Biology 16(8): e1008081.
How do we use our sense of sight to select where and how to grasp objects? To answer this first, challenging question, we employed motion tracking technologies to study how people grasp, pick up, and manipulate objects. We attached small markers to the hands of paid volunteers, which allowed the motion tracking systems to record at high resolution how they moved their hands while they interacted with objects. Then, we combined these behavioral observations with computer simulations and developed a model capable of predicting where humans would grasp novel objects. This body of work was presented at multiple scientific conferences and finally published in the high-ranking journal PLOS Computational Biology. While conducting this work, we further discovered a novel haptic illusion and a novel constraint on human grasp selection, both published in the specialist journal i-Perception.
In parallel to this line of work, we investigated the visual computations that our brain performs to reconstruct the 3D shape and layout of the objects we plan to grasp. To do this, we showed human participants 3D images on stereoscopic monitors, and asked them to report on the perceived depth structure of the stimuli. Then, we constructed an image-computable model of depth processing. This model takes into account how our brain organizes the binocular visual input at our retinae, but operates directly in cortical image space. We tested the model with the same stimuli and procedures used with human participants, and found that the model reported the same depth structure of the environment as humans. This body of work was also published in the high-ranking journal PLOS Computational Biology.
The project has gone beyond its main objectives and has opened several additional avenues of investigation with potentially wide implications for society.
To begin with, the project is driving technological advances in disparate engineering fields. For example, our observations on how human grasp selection is affected by object mass and mass distribution have already been implemented in a recent robotics paper [Veres et al, 2020]. In an ongoing collaboration with computer science engineers, we are also investigating how visually guided grasping differs in real and virtual environments, with the goal of designing more immersive and user-friendly virtual and augmented reality technologies.
The work carried out provides important benefits to society in terms of translation to clinical practice. As the population of Western countries ages, neurological and eye diseases causing low vision and visuomotor deficits are becoming increasingly prevalent. Understanding visual function in healthy populations is the first step in understanding the mechanisms of pathological visual loss. Therefore, in clinical collaborations stemming from the current project, we have already published papers on retinal structure, visual function, and depth processing in glaucoma and age-related macular degeneration, two potentially blinding eye diseases. Additionally, our computational account of human grasp selection has been the basis of an imaging study in which we are employing functional magnetic resonance imaging to identify the areas of the brain responsible for distinct aspects of grasp planning. This work not only expands our knowledge of brain function, but could also help identify and treat neurological disorders such as stroke. Similarly, in other ongoing work we are studying how humans use motor imagery, action observation, and sensorimotor feedback to learn how to grasp objects. Since motor imagery and action observation have shown promise in aiding motor rehabilitation in a variety of neurological conditions, our model-driven approach could be used to guide and strengthen these neurorehabilitation techniques.
Fig 1 from [Klein, Maiello et al; 2020]: The computational complexity of human grasp selection.