EndoMapper: Real-time mapping from endoscopic video

Project Information

EndoMapper

Grant agreement ID: 863146

DOI

10.3030/863146

Project closed

EC signature date 30 August 2019

Start date 1 December 2019

End date 30 November 2024

Funded under

EXCELLENT SCIENCE - Future and Emerging Technologies (FET)

Total cost

€ 3 693 477,50

EU contribution

€ 3 693 477,50

Coordinated by

UNIVERSIDAD DE ZARAGOZA
Spain

Periodic Reporting for period 3 - EndoMapper (EndoMapper: Real-time mapping from endoscopic video)

Reporting period: 2022-06-01 to 2024-11-30

Endoscopes traversing endoluminal cavities, such as the colon, are routine in diagnostic and therapeutic interventions. However, they lack any autonomy. An endoscope operating autonomously in vivo would require real-time cartography of the regions it is navigating and its location within that map. EndoMapper will develop the fundamentals for real-time localization and mapping inside the human body, using only the video stream supplied by a standard monocular endoscope. Mature methods already exist for visual mapping outside the body (known as VSLAM, Visual Simultaneous Localization And Mapping). They can deal with images coming from rather different domains such as cars, drones, or wearable devices. However, they perform poorly on gastrointestinal (GI) tract imagery, where non-rigid deformation and poor visual texture are prevalent.

Such mapping would complement any automated disease detection framework developed to support clinical decision making, accurate treatment delivery and effective screening regimes. In the short term, EndoMapper will bring live augmented reality to endoscopy, for example, to show the surgeon the exact location of a tumour that was detected in diagnostic medical imaging, or to provide navigation instructions to reach the exact location where a biopsy is to be performed. In the longer term, deformable intracorporeal mapping and localization will become the basis for novel medical procedures that could include robotized autonomous interaction with live tissue in minimally invasive surgery, or automated drug delivery with millimetre accuracy. Unlike other sensing technologies, such as electromagnetic (EM) tracking, vision-based mapping will map the entire endoluminal environment in addition to the pose of the scope.

After five years of research and experience with real colonoscopy sequences, we have developed non-rigid and quasi-rigid VSLAM fundamentals and methods to produce short-term maps, along with multi-mapping, visual localization and topological mapping approaches that enable full colon maps. Although not foreseen in the initial proposal, we have also developed methods that exploit the near-light source and the inverse-square law of illumination decay, because they provide rich 3D cues: “darker means farther”, and isophotes provide information about surface normals. Regarding the techniques conceived for VSLAM, we have fulfilled the goal of VSLAM optimization approaches for endoscopy, both for sparse geometric features and for photometric optimization. We have also fulfilled the goal of data-driven approaches, in which automated or self-supervision is mandatory. The results include dense depth estimation, discrete feature matching, segmentation and inpainting.
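The “darker means farther” cue can be illustrated with a minimal sketch. It assumes an idealised model that is not taken from the project's publications: a point light co-located with the camera, a Lambertian surface with constant albedo, and an observed intensity I = gain * cos(incidence angle) / d^2. The function name and parameters are hypothetical.

```python
import math

def relative_depth(intensity, cos_incidence=1.0, gain=1.0):
    """Depth implied by the assumed model I = gain * cos_incidence / d**2.

    `gain` lumps together light power, albedo and camera response; it is
    unknown in practice, so the result is only defined up to scale.
    """
    if intensity <= 0:
        raise ValueError("intensity must be positive")
    return math.sqrt(gain * cos_incidence / intensity)

# Two pixels seen at the same incidence angle: one is four times darker.
near = relative_depth(0.8)
far = relative_depth(0.2)
ratio = far / near  # a pixel 4x darker is 2x farther under this model
```

Under these assumptions only depth ratios are meaningful, because the gain is unknown; recovering real metric scale requires additional information.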

We have researched and produced several independent components of a medical VSLAM system, reaching TRL3.

WP1 has completed the EndoMapper dataset, the first collection of complete endoscopy sequences acquired during regular medical practice, making secondary use of medical data. It facilitates the development and evaluation of VSLAM on real endoscopy data and is publicly available at https://doi.org/10.7303/syn26707219.

WP2 has concentrated on the key components of a SLAM pipeline: 1) multi-maps to deal with the prevalent tracking losses in the short term; 2) complete VSLAM for deforming scenes, including growing the map as new areas are explored; 3) learning-based visual localization and topological mapping of a full colon; 4) dense maps for coverage analysis; and 5) real metric scale estimation.

WP3 has researched the fundamentals of Non-Rigid SfM (NRSfM) under the general model of non-isometric deformation, to complement the previously developed Non-Rigid SfM for tubular topology using classical optimization approaches. It has also considered the estimation of normals from isophotes, exploiting the near-light source, because normals are valuable cues for NRSfM.

WP4 has researched learning for feature matching and federated learning methods able to learn from data obtained at different sites.

WP5 has produced learning methods for endoscopy: monocular depth estimation, semantic segmentation, localization and camera motion in deforming scenes.

WP6 has tested the monocular depth estimation and tool segmentation in software.

WP7 presents four medical use cases to demonstrate the potential of VSLAM in endoscopy: 1) coverage analysis, providing the percentage of visualised mucosa; 2) intraoperative AR to enhance the physician's spatial awareness and improve the accuracy of navigation within the body; 3) second-visit navigation assistance to reach a location spotted on the first visit; and 4) polyp measurement from monocular colonoscopy to provide objective measurements of polyps.
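As an illustration of the coverage analysis use case listed under WP7, the percentage of visualised mucosa can be sketched as the fraction of surface cells that were observed at least once. The grid representation below (rows as positions along the lumen, columns as angular sectors) and the function name are assumptions made for this example, not the project's actual data structures.

```python
def coverage_percent(seen):
    """Percentage of surface cells marked as observed.

    `seen` is a hypothetical 2D grid of booleans over an unrolled
    colon segment: True where the mucosa was visualised.
    """
    cells = [cell for row in seen for cell in row]
    if not cells:
        raise ValueError("empty surface grid")
    return 100.0 * sum(cells) / len(cells)

# Toy segment: 3 positions along the lumen, 4 angular sectors each.
seen = [
    [True, True, True, False],
    [True, True, False, False],
    [True, True, True, True],
]
pct = coverage_percent(seen)  # 9 of 12 cells observed
```

In a real system the observed cells would come from the dense maps mentioned under WP2; here they are filled in by hand.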

Our contributions have already been published in top-tier conferences, and we provide green open access to these publications. We have also filed a patent.

Regarding the exploitation of the results, we plan to raise the technology from the TRL3 reached to TRL6 as the next step on the path to market. For this purpose we have assembled a consortium formed by Unizar, including its medical team, and ODIN Vision, and have submitted a proposal to the EIC Transition Programme. We have selected the EndoMapper results at TRL3 as the basis for VSLAM technology development at TRL6. The ultimate goal is to transform any endoscope into a smart device for intraoperative 3D localization, navigation and mapping. The proposal, EndoCartoScope GA 101211633, has been granted and is scheduled to start on 1 July 2025.

As expected, cross-fertilizing VSLAM methods with weakly supervised and unsupervised machine learning has yielded a new generation of VSLAM methods operating in monocular endoscopy, paving the way to transform any endoscope into a smart 3D device.

Scientific impact is expected because we focus on a specific case with great potential for generalization. The new methods are very likely to be extended to non-medical domains. Within the medical arena, we focus on the GI tract, but the results can be extended to other anatomical regions. We focus on the simplest, standard monocular camera case. If more complex sensors such as stereo, RGB-D or IMU-aided cameras become standard in endoscopy, the developed methods will remain valid. In fact, they can greatly exploit these additional pieces of information if available.

The technical impact can be significant. Monocular endoscopes will be transformed into smart perception devices for autonomous operation. In the short term, VSLAM will provide assistance in endoscopic procedures: coverage, measurements and augmented reality. In the long run, it will unlock autonomy in medical robotics, making possible procedures that are currently unrealisable.

From an economic and social point of view, the results can further advance European excellence in delivering quality health care at a competitive cost. ICT and robotics open an opportunity to produce medical devices that incorporate accumulated medical know-how. This will lower the cost of universal health services and also open business opportunities in medical technology. Europe is a global leader in medical instrumentation and robotics, hence EndoMapper will open opportunities for European-level research and product development. From a social point of view, it will boost personalized and robotized medicine able to deliver better health care at a lower cost, for the benefit of all citizens.

endomapper-figure.jpg
