Perception Ultrasound by Learning Sonographic Experience

Project Information

PULSE

Grant agreement ID: 694581

DOI

10.3030/694581

Project closed

EC signature date 12 October 2016

Start date 1 November 2016

End date 30 April 2023

Funded under

EXCELLENT SCIENCE - European Research Council (ERC)

Total cost

€ 2 462 015,00

EU contribution

€ 2 462 015,00

Coordinated by

THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
United Kingdom

Periodic Reporting for period 5 - PULSE (Perception Ultrasound by Learning Sonographic Experience)

Reporting period: 2022-11-01 to 2023-04-30

Perception Ultrasound by Learning Sonographic Experience (PULSE) explores how the latest ideas from machine learning (deep learning) can be combined with "big data" from clinical sonography to develop new methods and understanding that inform a next generation of ultrasound imaging capabilities, making medical ultrasound more accessible to non-expert clinical professionals. A principal novel element (to our knowledge, unique in the world) is to record the gaze, probe movements and spoken words of expert sonographers while they scan. As a result, "artificial intelligence"-based models are built not only from the recorded ultrasound video but also from human perceptual information (this is the human knowledge that goes into the model-building process). In total we acquired 1400 full-length clinical scans recorded with a purpose-built acquisition system.
To our knowledge this is the first body of work to attempt to bridge the gap between an ultrasound device and the user by employing a machine-learning solution that embeds clinical expert knowledge (through measuring perception and actions) to add interpretation power.
The innovation in PULSE is to apply the latest ideas from machine learning and computer vision to build, from real world training video data, computational models that describe how an expert sonographer performs a diagnostic study of a subject from multiple perceptual cues. Novel machine-learning based computational model designs were developed for different tasks (recognising standard planes, gaze-based and gaze-and-probe-based image and video guidance, describing sonographer actions, describing ultrasound images and video via text, describing sonographer skill, and summarising and characterising clinical workflow) based on probe and eye motion tracking, audio, image processing, and knowledge of how to interpret real-world clinical images and videos acquired to a standardised protocol. The underlying premise of our research is that by building models that more closely mimic how a human makes decisions from ultrasound images, considerably more efficient and powerful assistive interpretation methods can be built than have previously been possible from still US images and videos alone.
The overall objectives of the technical research were:
1. To develop a rich lexicon of sonographer words (vocabularies and languages) to describe US videos, the annotated datasets, and methods and software for accurately and reliably describing real world clinical ultrasound video content.
2. To build methods and software for describing ultrasound video content both for sonographer training and assistive technologies for clinical tasks.
3. To compare automatic description using combined ultrasound video and probe motion information, and using combined video, probe and eye motion information, against description from ultrasound video alone.
The research underpins new multi-modal ultrasound imaging technology that may be developed further to have economic, healthcare and social benefits across Europe and beyond. The focus in the project was on feasibility demonstration. Software methodologies were developed and evaluated on real world obstetric US data in collaboration with clinical experts and trainees to validate the new approaches and to understand what the next translational steps might be towards potential future use in routine US scanning services in hospitals or the community.

We have developed a custom-built, dedicated ultrasound-based system for simultaneously acquiring full-scan ultrasound video, gaze tracking data, probe motion data, and sonographer spoken word. The system was based in a hospital clinic and captured data on pregnant women coming for screening scans (first, second or third trimester) and on the sonographers who performed the scans. The dataset is unique in the world to our knowledge.
The data was used both to study clinical sonography from a data science perspective for the first time and to enable technical research on algorithms underpinning assistive tools for clinical sonography tasks, tools which are informed by sonographer perceptions and actions.
In terms of dissemination, results have been presented as papers and keynotes at top academic international medical image analysis conferences as well as obstetrics and gynecology congresses. The work has received conference paper and workshop awards and appeared on the front pages of journals and conference highlights. The work on gaze prediction is being considered for commercial exploitation. The project has trained medical image analysis doctoral students and postdoctoral researchers and clinical fellows in healthcare AI.

Current state-of-the-art medical image analysis methods do not use eye movements to inform decision making in sonography. The underlying premise of PULSE is that, by using eye tracking and probe motion information to inform image and video recognition algorithm design, we can build more useful machine-learning solutions for automatic US video description that more closely mimic human interpretation and actions than models based on video alone.
The PULSE custom-built system allowed us to capture information about key perceptual cues (eye movement and probe motion) that are lost to conventional image- and video-based interpretation algorithms, which only have the video stream of images to work with.
Using multi-modal analysis we studied, for instance, the visual search strategies employed by fully qualified and trainee sonographers.
We were also interested in questions such as whether trainees and fully qualified sonographers follow different visual search strategies, and whether there are different visual search strategies amongst experts.
Knowledge gleaned from these studies supported developments of assistive technologies to support sonography guidance and image reading/interpretation.
In summary, key outputs from PULSE were in the following areas:
1. Clinical sonography data science - greater understanding of clinical sonography workflow and sonographer skills/skills assessment.
2. Assistive technologies for interpreting ultrasound images - new machine-learning based models to assist in ultrasound standard plane detection and image interpretation.
3. Assistive technologies for ultrasound guidance - new machine-learning based multi-modal models to assist in ultrasound guidance for simple and complex tasks.
4. Video analysis - natural language processing: methodology to allow key information from hard-to-interpret ultrasound video to be communicated to a non-sonographer via a text-based description.
