Periodic Reporting for period 1 - LeukoScreen (AI-based leukemia detection in routine diagnostic blood smear data)
Reporting period: 2023-12-01 to 2025-05-31
The LeukoScreen project set out to explore how artificial intelligence (AI) can support clinical diagnosis by automating the evaluation of blood smear images. The goal was to shorten the time from sample collection to diagnosis and treatment, reduce the burden on healthcare professionals, and make expert-level cytological assessment more broadly available – including in low-resource settings.
To achieve this, the project developed cAItomorph, a transformer-based AI model trained on a real-world dataset from the Munich Leukemia Laboratory (MLL), one of Europe’s leading diagnostic centers for blood cancers. The project focused not only on technical performance but also on explainability and clinical utility. It aimed to demonstrate that state-of-the-art AI can identify a wide range of hematological malignancies based on peripheral blood cell morphology, even under the noisy and heterogeneous conditions of routine diagnostics.
Building on previous research from my ERC Consolidator Grant, the project team developed and validated a novel AI model – cAItomorph – using a vision transformer architecture pre-trained on hematology-specific data. The model was trained with weak supervision, using only patient-level labels rather than annotations of individual cells, and achieved an accuracy of 68 ± 1% across seven disease categories, with particularly high sensitivity for acute leukemia and myeloproliferative neoplasms. cAItomorph also proved interpretable: its cell- and pixel-level attention maps highlight the diagnostic features in individual cells that drive a prediction.
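To illustrate how a model trained only on patient-level labels can still point to individual cells, the following minimal sketch shows attention-based pooling over per-cell embeddings. It is a simplified stand-in and not the cAItomorph architecture: the function names, the single scoring vector `attn_vector`, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    cell_embeddings: Sequence[Sequence[float]],
    attn_vector: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool per-cell embeddings into one patient-level vector.

    Each cell receives a score (dot product with a hypothetical learned
    vector); the softmax of these scores gives one attention weight per
    cell. The weights sum to 1 and can be inspected per cell.
    """
    if not cell_embeddings:
        raise ValueError("at least one cell embedding is required")
    scores = [
        sum(w * x for w, x in zip(attn_vector, emb)) for emb in cell_embeddings
    ]
    weights = softmax(scores)
    dim = len(cell_embeddings[0])
    pooled = [
        sum(a * emb[d] for a, emb in zip(weights, cell_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


# Toy data (assumed): three cells from one smear, 2-dimensional embeddings.
cells = [[0.1, 0.2], [2.0, 1.5], [0.0, 0.1]]
attn = [1.0, 1.0]  # hypothetical scoring vector; learned in a real model

patient_vec, weights = attention_pool(cells, attn)
top_cell = max(range(len(weights)), key=weights.__getitem__)
print("attention weights:", [round(w, 3) for w in weights])
print("most attended cell index:", top_cell)
```

In a setup of this kind, only the pooled patient-level vector is compared against the diagnosis label during training, while the per-cell weights come out as a by-product that can be displayed alongside the cells.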
The model was calibrated to provide probability estimates for clinical decision support and evaluated for its potential to reduce unnecessary bone marrow aspirations. A key achievement was showing that the model could lower the false discovery rate in this context from 13.8% to 12%, a reduction of 1.8 percentage points, without missing any acute leukemia cases.
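The size of this effect can be made explicit with a short calculation. The sketch below assumes the standard definition of the false discovery rate, FP / (FP + TP), which the report does not spell out. The two rates are taken from the text; the patient counts in the last example are hypothetical and serve only to show the definition at work.

```python
def false_discovery_rate(false_positives: int, true_positives: int) -> float:
    """FDR = FP / (FP + TP): share of flagged cases that were not true cases."""
    flagged = false_positives + true_positives
    if flagged == 0:
        raise ValueError("no flagged cases, FDR is undefined")
    return false_positives / flagged


# Rates reported in the text.
fdr_before = 0.138
fdr_after = 0.12

absolute_reduction = fdr_before - fdr_after
relative_reduction = absolute_reduction / fdr_before

print(f"absolute reduction: {absolute_reduction * 100:.1f} percentage points")
print(f"relative reduction: {relative_reduction * 100:.1f}%")

# Hypothetical counts (not from the report): 12 unnecessary procedures
# among 100 performed would correspond to an FDR of 12%.
example_fdr = false_discovery_rate(false_positives=12, true_positives=88)
print(f"example FDR: {example_fdr:.2f}")
```

Under these assumptions, the drop from 13.8% to 12% corresponds to a relative reduction of roughly 13% in the share of procedures that turn out to be unnecessary.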
A complete manuscript has been finalized and will be submitted to a peer-reviewed open access journal shortly. Code, test data and model weights will be made publicly available to ensure reproducibility and to support follow-up research.
While most prior AI approaches worked only on curated datasets or limited disease classes, cAItomorph generalized to 23 diagnostic entities grouped into 7 categories in a real-world setting. It identified acute leukemia and myeloproliferative neoplasms with high sensitivity, explained its predictions in human-understandable ways, and showed clinical utility by reducing unnecessary invasive procedures.
The results are immediately relevant to digital pathology and hematology, providing a path toward automation of routine cytology. They also offer a proof of concept for deploying foundation models in biomedical diagnostics. To support further uptake, next steps include a prospective validation study and an assessment of commercialization routes such as licensing to microscope vendors or integration into clinical software.
In addition, the open availability of code and data will allow researchers worldwide to benchmark their methods and improve upon them. With its focus on real-world impact, reproducibility, and translational potential, LeukoScreen lays the groundwork for AI-supported diagnostics in hematology and beyond.