CORDIS - EU research results

AI-based leukemia detection in routine diagnostic blood smear data

Periodic Reporting for period 1 - LeukoScreen (AI-based leukemia detection in routine diagnostic blood smear data)

Reporting period: 2023-12-01 to 2025-05-31

Blood cancers such as acute myeloid leukemia (AML) are life-threatening conditions that require rapid and accurate diagnosis. A standard diagnostic method is the microscopic analysis of peripheral blood smears. However, this procedure is still performed manually in most laboratories, relying on trained cytologists to classify hundreds of white blood cells by eye. This process is time-consuming, costly, and prone to human error. At the same time, the number of trained experts is decreasing while diagnostic demand is rising.

The LeukoScreen project set out to explore how artificial intelligence (AI) can support clinical diagnosis by automating the evaluation of blood smear images. The goal was to shorten the time from sample collection to diagnosis and treatment, reduce the burden on healthcare professionals, and make expert-level cytological assessment more broadly available – including in low-resource settings.

To achieve this, the project developed cAItomorph, a transformer-based AI model trained on a real-world dataset from the Munich Leukemia Laboratory (MLL), one of Europe’s leading diagnostic centers for blood cancers. The project focused not only on technical performance but also on explainability and clinical utility. It aimed to demonstrate that state-of-the-art AI can identify a wide range of hematological malignancies based on peripheral blood cell morphology, even under the noisy and heterogeneous conditions of routine diagnostics.

The LeukoScreen project collected and processed a real-world dataset of more than 3 million single-cell images from more than 6000 patients and healthy controls. The images were extracted from routinely scanned blood smears at the MLL and labeled according to each patient's final clinical diagnosis, which draws on cytomorphology, immunophenotyping, cytogenetics, and molecular genetics.

Building on previous research from my ERC Consolidator Grant, the project team developed and validated a novel AI model – cAItomorph – using a vision transformer architecture pre-trained on hematology-specific data. The model was trained in a weakly supervised way using only patient-level labels and achieved 68 ± 1% accuracy across seven disease categories, with particularly high sensitivity for acute leukemia and myeloproliferative neoplasms. cAItomorph also demonstrated interpretability through cell- and pixel-level attention maps, revealing diagnostic features in individual cells.
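
To illustrate how a model can learn from patient-level labels alone while still pointing at individual cells, here is a minimal sketch of attention-based multiple-instance pooling. This is a simpler, generic technique and not the cAItomorph architecture, whose internals are not described here. The function name `attention_pool`, the embedding size, the random weights, and the random "cell embeddings" are all hypothetical placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(cell_embeddings, w_att, w_cls):
    """Aggregate per-cell embeddings into one patient-level prediction.

    cell_embeddings: (n_cells, d) array, one row per single-cell image
    w_att:           (d,) attention scoring vector (hypothetical, untrained)
    w_cls:           (d, n_classes) classifier weights (hypothetical, untrained)
    """
    scores = cell_embeddings @ w_att            # one score per cell
    attention = softmax(scores)                 # cell weights, sum to 1
    patient_vec = attention @ cell_embeddings   # weighted mean embedding, shape (d,)
    class_probs = softmax(patient_vec @ w_cls)  # one probability per category
    return class_probs, attention

# Toy data only: 500 random "cells" with 16 features, 7 disease categories.
rng = np.random.default_rng(0)
cells = rng.normal(size=(500, 16))
w_att = rng.normal(size=16)
w_cls = rng.normal(size=(16, 7))

probs, att = attention_pool(cells, w_att, w_cls)
top_cells = np.argsort(att)[::-1][:5]  # indices of the five most attended cells
```

In a trained model of this kind, only the patient-level output is compared with the diagnosis during training, and the per-cell weights in `att` can afterwards be inspected to see which cells drove the prediction.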

The model was calibrated to provide probability estimates for clinical decision support and evaluated for its potential to reduce unnecessary bone marrow aspirations. A key achievement was showing that the model could lower the false discovery rate in this context from 13.8% to 12% without missing any acute leukemia cases.
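
The false discovery rate here is the share of flagged patients who turn out not to have the disease, FP / (TP + FP). The sketch below shows, on toy data, how a probability threshold can lower that rate while keeping every true case flagged. The threshold rule, the function names, and all numbers are illustrative assumptions; they are not the project's procedure or results.

```python
def false_discovery_rate(y_true, y_flagged):
    """Fraction of flagged cases that are false positives: FP / (TP + FP)."""
    tp = sum(1 for t, f in zip(y_true, y_flagged) if f and t)
    fp = sum(1 for t, f in zip(y_true, y_flagged) if f and not t)
    return fp / (tp + fp) if (tp + fp) else 0.0

def highest_threshold_without_misses(y_true, probs):
    """Largest threshold at which every true case still has prob >= threshold."""
    return min(p for t, p in zip(y_true, probs) if t)

# Toy example: 1 = disease confirmed, 0 = not confirmed.
y_true = [1, 1, 0, 0, 0, 1]
probs = [0.9, 0.6, 0.7, 0.2, 0.1, 0.8]  # hypothetical calibrated model outputs

# Baseline: every patient is referred for the invasive procedure.
baseline_fdr = false_discovery_rate(y_true, [True] * len(y_true))  # 0.5

# Model-assisted: refer only patients at or above the zero-miss threshold.
threshold = highest_threshold_without_misses(y_true, probs)        # 0.6
flagged = [p >= threshold for p in probs]
model_fdr = false_discovery_rate(y_true, flagged)                  # 0.25

missed = sum(1 for t, f in zip(y_true, flagged) if t and not f)    # 0
```

In practice such a threshold would be chosen on held-out data and then checked on an independent test set, since a threshold tuned and evaluated on the same patients gives an optimistic estimate.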

A complete manuscript has been finalized and will be submitted to a peer-reviewed open access journal shortly. Code, test data and model weights will be made publicly available to ensure reproducibility and to support follow-up research.

LeukoScreen advanced AI-assisted hematological diagnostics by demonstrating that transformer-based models can extract meaningful diagnostic signals from routine blood smear data, a task traditionally seen as too heterogeneous for reliable automation.

While most prior AI approaches worked only on curated datasets or limited disease classes, cAItomorph generalized to 23 diagnostic entities grouped into 7 categories in a real-world setting. It identified acute leukemia and myeloproliferative neoplasms with high accuracy, explained its predictions in human-understandable ways, and showed clinical utility by reducing unnecessary invasive procedures.

The results are directly relevant to digital pathology and hematology and point toward the automation of routine cytology. They also offer a proof of concept for deploying foundation models in biomedical diagnostics. Next steps toward further uptake include a prospective validation study and an assessment of commercialization routes such as licensing to microscope vendors or integration into clinical software.

In addition, the open availability of code and data will allow researchers worldwide to benchmark their methods and improve upon them. With its focus on real-world impact, reproducibility, and translational potential, LeukoScreen lays the groundwork for AI-supported diagnostics in hematology and beyond.