
EXtreme-scale Analytics via Multimodal Ontology Discovery & Enhancement

Deliverables

Algorithms for whole-slide image compression

The input of these algorithms is a multiresolution gigapixel histopathology whole-slide image (WSI), and the output is a compressed representation of the input in the form of a data volume (rows × columns × features). The number of rows and columns of the compressed representation will be much smaller than in the original whole-slide image, and the number of features will be on the order of 128 or 256 values. Encoders will be implemented as a web-based service on the EXA MODE website, which will allow users to upload WSIs and download a compressed representation of each in the form of a data file. The tool is tested in order to verify Milestone 8.
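As an illustration of the input and output shapes described above, the following minimal sketch tiles an image into non-overlapping patches and maps each patch to a feature vector, producing a rows × columns × features volume. It is not the project's encoder: the function names, the patch size and the random linear projection used in place of a trained neural encoder are all assumptions made for this example.

```python
import numpy as np


def make_random_projection_encoder(patch_size, n_features, seed=0):
    """Hypothetical stand-in for a trained encoder: a fixed random linear map."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal(
        (patch_size * patch_size * 3, n_features)
    ).astype(np.float32)

    def encode(patch):
        # Flatten the RGB patch and project it to n_features values.
        return patch.reshape(-1).astype(np.float32) @ weights

    return encode


def compress_slide(image, patch_size, encoder):
    """Encode non-overlapping patches into a (rows, columns, features) volume."""
    height, width, _ = image.shape
    rows, cols = height // patch_size, width // patch_size
    features = [
        encoder(
            image[
                r * patch_size:(r + 1) * patch_size,
                c * patch_size:(c + 1) * patch_size,
            ]
        )
        for r in range(rows)
        for c in range(cols)
    ]
    return np.array(features).reshape(rows, cols, -1)


# Toy stand-in for a slide; a real WSI is gigapixel-sized and read tile by tile.
slide = np.random.default_rng(1).integers(0, 256, size=(256, 384, 3), dtype=np.uint8)
encoder = make_random_projection_encoder(patch_size=64, n_features=128)
volume = compress_slide(slide, patch_size=64, encoder=encoder)
print(volume.shape)  # (4, 6, 128)
```

With 64-pixel patches, the 256 × 384 toy image shrinks to a 4 × 6 grid, each cell holding 128 features, which mirrors the relation between slide size and compressed volume stated in the deliverable.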

Conceptual descriptive framework for multimodal knowledge

This conceptual descriptive framework allows medical computer science experts to easily represent multimodal medical information in a unified framework, reducing the effort needed to combine textual and imaging data for clinical decision support systems.

2nd dissemination, communication and exploitation report & outline for the following year

This report reviews all the dissemination, exploitation and communication activities performed during the year and it draws the outlines for the following year, divided per activity.

Graph representation of the histopathology knowledge

The histopathology visual knowledge graph represents the relationships between scale- and color-invariant content of the histopathology images. It is described in a scientific publication and released to the other members of the consortium.

Semantic knowledge extractor prototype

This semantic knowledge extractor represents the textual descriptions in medical reports as semantic networks, extracting authoritative concepts and semantic relations from the text. The released prototype benefits from the feedback of clinicians who supervised the quality of the results. The semantic knowledge extractor prototype includes a preliminary version of the visualisation tool prototype, with basic functionalities targeting internal use, in order to verify Milestone 7.

1st dissemination, communication and exploitation report & outline for the following year

This report reviews all the dissemination, exploitation and communication activities performed during the year and it draws the outlines for the following year, divided per activity.

Algorithms for semantic segmentation and detection in histopathology images

Deep neural networks for detection and semantic segmentation of tissue regions in whole-slide images. Targets of trained networks will be released as a milestone during the project. Detection and segmentation algorithms will be implemented in a web-based platform available at Radboudumc, namely CIRRUS Pathology. Segmentation results will be produced in a format compatible with the in-house developed open-source platform ASAP [61], as well as with CIRRUS Pathology, which will allow users to inspect, modify and further process segmentation results as well as the likelihood maps produced by the neural networks. Novel techniques for weakly-supervised semantic segmentation in whole-slide images will be disseminated as scientific publications. The tool is tested in order to verify Milestone 8.
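To make the relation between likelihood maps and segmentation results concrete, the sketch below converts a per-pixel class likelihood map into a label mask by taking the most likely class at each pixel. It is a generic illustration, not the output format of ASAP or CIRRUS Pathology: the function name, the confidence threshold and the use of label 0 as background are assumptions made for this example.

```python
import numpy as np


def likelihoods_to_mask(likelihood_map, min_confidence=0.5, background_label=0):
    """Turn a (rows, cols, n_classes) likelihood map into a (rows, cols) label mask.

    Pixels whose highest class likelihood is below min_confidence are
    assigned background_label (an assumption made for this sketch).
    """
    labels = np.argmax(likelihood_map, axis=-1)
    confidence = np.max(likelihood_map, axis=-1)
    return np.where(confidence >= min_confidence, labels, background_label).astype(np.uint8)


# Toy 2 x 2 likelihood map with three classes per pixel.
toy_map = np.array([
    [[0.10, 0.70, 0.20], [0.40, 0.30, 0.30]],
    [[0.05, 0.05, 0.90], [0.80, 0.10, 0.10]],
])
mask = likelihoods_to_mask(toy_map)
print(mask.tolist())  # [[1, 0], [2, 0]]
```

Keeping the likelihood map alongside the mask, as the deliverable describes, lets a reviewer revisit low-confidence regions instead of relying on the hard labels alone.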

EXA MODE website

The website contains the description of the project, a section with a list of publications (as well as press releases and articles in the popular press) and a private section for the partners to exchange software, data and information. All public deliverables are available as PDF files from the web page. The website includes an RSS feed with news on the project and it is updated regularly.

First set of annotated digital pathology data

The first set of annotated whole-slide images is made available to the consortium. The data are selected from the AOEC data, and the final annotated dataset must include at least 100 annotated whole-slide images. The annotation requirements are defined jointly by AOEC and MICROSCOPEIT before the data annotation begins and are made available in the private section of the EXA MODE web page. The annotations are performed using software based on previous work by the partners HES-SO and MICROSCOPEIT.

First set of curated, publicly available multimodal and multimedia data

The first set of curated, publicly available data includes respectively at least 500 and 5’000 histopathology images and related text extracted from scientific literature and the web. The images and the related text include content that is relevant for histopathology diagnostic purposes and that can be used to train machine-learning-based algorithms.

Set of publicly available algorithms to separate compound images

The algorithms to separate compound images and link them to related text are presented in a scientific publication and are publicly released on the EXA MODE website.

Tools to extract homogeneous representations of heterogeneous colour visual information

The algorithms and tools are presented in a scientific publication and are publicly released on the EXA MODE website. The prototypes are tested in order to verify Milestone 6.

Tools to extract multi-scale representations of visual information

The algorithms and tools to extract multi-scale representations of visual information are presented in a scientific publication and are publicly released on the EXA MODE website. The prototypes are tested in order to verify Milestone 6.

Final set of curated, publicly available multimodal and multimedia data

The final set of curated, publicly available data includes respectively at least 500 and 5’000 histopathology images and related text extracted from scientific literature and the web. The images and the related text include content that is relevant for histopathology diagnostic purposes and that can be used to train machine-learning-based algorithms.

First set of data curated and available

The first set of whole-slide images and medical reports is made available to the consortium. The data are selected from the 600’000 AOEC and Radboudumc data according to the consortium requirements.


Publications

A Post-Analysis of Query Reformulation Methods for Clinical Trials Retrieval

Author(s): Agosti, Maristella, Giorgio Maria Di Nunzio, and Stefano Marchesin
Published in: SEBD, 2020
Publisher: Ceur-Ws

Multi-Scale Task Multiple Instance Learning for the Classification of Digital Pathology Images with Global Annotations

Author(s): Niccolò Marini, Sebastian Otálora, Francesco Ciompi, Gianmaria Silvello, Stefano Marchesin, Simona Vatrano, Genziana Buttafuoco, Manfredo Atzori, Henning Müller
Published in: MICCAI Workshop on Computational Pathology, 2021
Publisher: Proceedings of Machine Learning Research

Application of Deep Learning Methods to SNOMED CT Encoding of Clinical Texts: From Data Collection to Extreme Multi-Label Text-Based Classification

Author(s): Hristov, Anton, Aleksandar Tahchiev, Hristo Papazov, Nikola Tulechki, Todor Primov, and Svetla Boytcheva
Published in: International Conference on Recent Advances in Natural Language Processing (RANLP 2021), 1.10.2021, 2021, Page(s) 557-565, ISBN 978-954-452-072-4
Publisher: INCOMA Ltd.
DOI: 10.26615/978-954-452-072-4_063

Semi-supervised learning with a teacher-student paradigm for histopathology classification: a resource to face data heterogeneity and lack of local annotations

Author(s): Niccolo Marini, Sebastian Otalora, Henning Muller, and Manfredo Atzori
Published in: International Workshop on Artificial Intelligence for Digital Pathology, International Conference on Pattern Recognition (ICPR), 2021
Publisher: Springer

Data Credit Distribution through Lineage (Extended Abstract)

Author(s): Dennis Dosso and Gianmaria Silvello
Published in: Proc. of the 17th Italian Research Conference on Digital Libraries (IRCDL 2021), 2021
Publisher: Ceur-WS Proceedings

Neural image compression for non-small cell lung cancer subtype classification in H&E stained whole-slide images

Author(s): W. Aswolinskiy, D. Tellez, G. Raya, L. van der Woude, M. Looijen-Salamon, J. van der Laak, K. Grunberg and F. Ciompi
Published in: 2021
Publisher: SPIE Medical Imaging

Neural Feature Selection for Learning to Rank

Author(s): Purpura, Alberto, Karolina Buchner, Gianmaria Silvello, and Gian Antonio Susto
Published in: Advances in Information Retrieval, ECIR 2021, 2021
Publisher: Springer
DOI: 10.1007/978-3-030-72240-1_34

Few-shot weakly supervised detection and retrieval in histopathology whole-slide images

Author(s): M. van Rijthoven, M. Balkenhol, M. Atzori, P. Bult, J. van der Laak and F. Ciompi
Published in: 2021
Publisher: SPIE Medical Imaging

Knowledge Enhanced Representations to Reduce the Semantic Gap in Clinical Decision Support

Author(s): MARCHESIN, STEFANO
Published in: 1, 2019
Publisher: CEUR-WS

What Makes a Query Semantically Hard?

Author(s): G. Faggioli, S. Marchesin
Published in: Proc. of the 2nd International conference on DESIRES, 2021
Publisher: Ceur-Ws Proceedings

On the Formal Standardization of Terminology Resources: The Case Study of TriMED

Author(s): Vezzani, Federica; Di Nunzio, Giorgio Maria
Published in: 1, 2020
Publisher: European Language Resources Association

H&E-adversarial network: a convolutional neural network to learn stain-invariant features through Hematoxylin & Eosin regression

Author(s): Niccolo Marini, Manfredo Atzori, Sebastian Otálora, Stephane Marchand-Maillet, Henning Müller
Published in: IEEE/CVF International Conference on Computer Vision, 2021
Publisher: IEEE

Exploring how to Combine Query Reformulations for Precision Medicine

Author(s): DI NUNZIO, GIORGIO MARIA; MARCHESIN, STEFANO; AGOSTI, MARISTELLA
Published in: 1, 2019
Publisher: NIST

SAFIR: a Semantic-Aware Neural Framework for IR

Author(s): M. Agosti, S. Marchesin, and G. Silvello
Published in: Proceedings of the 11th Italian Information Retrieval Workshop 2021, 2021
Publisher: Ceur-Ws

Incentives for Item Duplication under Fair Ranking Policies

Author(s): Giorgio Maria Di Nunzio, Alessandro Fabris, Gianmaria Silvello and Gian Antonio Susto
Published in: Proc. of the 2nd International Workshop on Algorithmic Bias in Search and Recommendation (BIAS@ECIR2021), 2021
Publisher: Springer
DOI: 10.1007/978-3-030-78818-6

A scalable virtual document-based keyword search system for RDF datasets

Author(s): Dennis Dosso and Gianmaria Silvello
Published in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 7, 2019
Publisher: ACM Press
DOI: 10.1145/3331184.3331284

Knowledge Enhanced Representations for Clinical Decision Support

Author(s): MARCHESIN, STEFANO; AGOSTI, MARISTELLA
Published in: 1, 2019
Publisher: CEUR-WS

Multimodal latent semantic alignment for automated prostate tissue classification and retrieval

Author(s): Lara, Juan S., Victor H. Contreras O, Sebastián Otálora, Henning Müller, and Fabio A. González
Published in: International Conference on Medical Image Computing and Computer-Assisted Intervention, 2020
Publisher: Springer LNCS
DOI: 10.1007/978-3-030-59722-1_55

As Simple as Possible: Using the R Tidyverse for Multilingual Information Extraction

Author(s): Di Nunzio, Giorgio Maria
Published in: CLEF eHealth 2020, 23.10.2020, 2020
Publisher: Ceur-Ws

Classification of noisy free-text prostate cancer pathology reports using natural language processing

Author(s): Anjani Dhrangadhariya, Sebastian Otálora, Manfredo Atzori, and Henning Muller
Published in: International Workshop on Artificial Intelligence for Digital Pathology, International Conference on Pattern Recognition (ICPR), 2021
Publisher: Springer

Generalizing convolution neural networks on stain color heterogeneous data for computational pathology

Author(s): Amjad Khan, Manfredo Atzori, Sebastian Otálora, Vincent Andrearczyk, Henning Müller
Published in: Medical Imaging 2020: Digital Pathology, 2020, Page(s) 26, ISBN 9781-510634084
Publisher: SPIE
DOI: 10.1117/12.2549718

A systematic comparison of deep learning strategies for weakly supervised Gleason grading

Author(s): Sebastian Otálora, Manfredo Atzori, Amjad Khan, Oscar Jimenez-del-Toro, Vincent Andrearczyk, Henning Müller
Published in: Medical Imaging 2020: Digital Pathology, 2020, Page(s) 20, ISBN 9781-510634084
Publisher: SPIE
DOI: 10.1117/12.2548571

Exploiting biomedical literature to mine out a large multimodal dataset of rare cancer studies

Author(s): Anjani K. Dhrangadhariya, Oscar Jimenez-del-Toro, Vincent Andrearczyk, Manfredo Atzori, Henning Müller
Published in: Medical Imaging 2020: Imaging Informatics for Healthcare, Research, and Applications, 2020, Page(s) 9, ISBN 9781-510634046
Publisher: SPIE
DOI: 10.1117/12.2549565

An Analysis of Query Reformulation Techniques for Precision Medicine

Author(s): Maristella Agosti, Giorgio Maria Di Nunzio, Stefano Marchesin
Published in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019, Page(s) 973-976, ISBN 9781-450361729
Publisher: ACM
DOI: 10.1145/3331184.3331289

Probabilistic Word Embeddings in Neural IR - A Promising Model That Does Not Work as Expected (For Now)

Author(s): Alberto Purpura, Marco Maggipinto, Gianmaria Silvello, Gian Antonio Susto
Published in: Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019, Page(s) 3-10, ISBN 9781-450368810
Publisher: ACM
DOI: 10.1145/3341981.3344217

Medical Retrieval using Structured Information Extracted from Knowledge Bases

Author(s): Agosti, M.; Di Nunzio, G. M.; Marchesin, S.; Gianmaria Silvello
Published in: Scopus - Elsevier, 1, 2019
Publisher: CEUR-WS

A Study on Reciprocal Ranking Fusion in Consumer Health Search

Author(s): Di Nunzio, Giorgio Maria, Stefano Marchesin, and Federica Vezzani
Published in: CLEF eHealth 2020 Task 2, 23.10.2020, 2020
Publisher: Ceur-Ws

A Bayesian Neural Model for Documents' Relevance Estimation

Author(s): Purpura, Alberto, and Gian Antonio Susto
Published in: 2nd International Conference on Design of Experimental Search Information REtrieval Systems, 15.10.2021, 2021
Publisher: Ceur-Ws

A Keyword Search and Citation System for RDF Graphs

Author(s): Dosso, Dennis
Published in: FDIA@ ESSIR, 17.07.2019, 2019
Publisher: Ceur-Ws

Learning from sparsely annotated data for semantic segmentation in histopathology images

Author(s): J.-M. Bokhorst, H. Pinckaers, P. van Zwam, I. Nagtegaal, J. van der Laak and F. Ciompi
Published in: Proceedings of Machine Learning Research, Volume 102, 2019, Page(s) 81-94
Publisher: PMLR

Nanocitation: Complete and Interoperable Citations of Nanopublications

Author(s): Fabris, Erika, Tobias Kuhn, and Gianmaria Silvello
Published in: Italian Research Conference on Digital Libraries, 01.2020, 2020
Publisher: Springer
DOI: 10.1007/978-3-030-39905-4_18

NanoWeb: Search, Access and Explore Life Science Nanopublications on the Web (Extended Abstract)

Author(s): Fabio Giachelle, Dennis Dosso and Gianmaria Silvello
Published in: Proc. 29th Italian Symposium on Advanced Database Systems (SEBD 2021), 2021
Publisher: Ceur-Ws

Semi-weakly supervised learning for prostate cancer image classification with teacher-student deep convolutional networks

Author(s): Otálora, Sebastian, Niccolo Marini, Henning Müller, and Manfredo Atzori
Published in: 2020
Publisher: Lecture Notes in Computer Science book series
DOI: 10.1007/978-3-030-61166-8_21

Extending Unsupervised Neural Image Compression With Supervised Multitask Learning

Author(s): Tellez, David; Hoppener, Diederik; Verhoef, Cornelis; Grunhagen, Dirk; Nierop, Pieter; Drozdzal, Michal; van der Laak, Jeroen; Ciompi, Francesco
Published in: Extending Unsupervised Neural Image Compression With Supervised Multitask Learning, 2020
Publisher: PMLR

Background linking: Joining entity linking with learning to rank models

Author(s): Irrera, O.; Silvello, G.
Published in: Proc. of the 17th Italian Research Conference on Digital Libraries (IRCDL 2021), 1, 2021
Publisher: Ceur-WS Proceedings

Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology

Author(s): David Tellez, Geert Litjens, Péter Bándi, Wouter Bulten, John-Melle Bokhorst, Francesco Ciompi, Jeroen van der Laak
Published in: Medical Image Analysis, 58, 2019, Page(s) 101544, ISSN 1361-8415
Publisher: Elsevier BV
DOI: 10.1016/j.media.2019.101544

Combining weakly and strongly supervised learning improves strong supervision in Gleason pattern classification

Author(s): Sebastian Otálora, Niccolo Marini, Henning Müller, and Manfredo Atzori
Published in: BMC Medical Imaging, 2021, ISSN 1471-2342
Publisher: BioMed Central
DOI: 10.1186/s12880-021-00609-0

Methodology for the standardization of terminological resources: Design of TriMED database to support multi-register medical communication

Author(s): Vezzani, Federica, and Giorgio Maria Di Nunzio
Published in: Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication, 26, 2020, ISSN 0929-9971
Publisher: John Benjamins Publishing Company
DOI: 10.1075/term.00053.vez

Learning Unsupervised Knowledge-Enhanced Representations to Reduce the Semantic Gap in Information Retrieval

Author(s): Maristella Agosti; Stefano Marchesin; Gianmaria Silvello
Published in: ACM TOIS, 2, 2020, ISSN 1046-8188
Publisher: Association for Computing Machinery, Inc.
DOI: 10.1145/3417996

Search, access, and explore life science nanopublications on the Web

Author(s): Fabio Giachelle; Dennis Dosso; Gianmaria Silvello
Published in: PeerJ Computer Science, 1, 2021, ISSN 2376-5992
Publisher: PeerJ Inc.
DOI: 10.7717/peerj-cs.335

Neural Image Compression for Gigapixel Histopathology Image Analysis

Author(s): David Tellez, Geert Litjens, Jeroen van der Laak, Francesco Ciompi
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, Page(s) 1-1, ISSN 0162-8828
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/tpami.2019.2936841

Semi-supervised training of deep convolutional neural networks with heterogeneous data and few local annotations: An experiment on prostate histopathology image classification

Author(s): Niccolò Marini; Sebastian Otálora; Henning Müller; Manfredo Atzori
Published in: Crossref, 1, 2021, ISSN 1361-8415
Publisher: Elsevier BV
DOI: 10.1016/j.media.2021.102165

Learning to detect lymphocytes in immunohistochemistry with deep learning

Author(s): Zaneta Swiderska-Chadaj, Hans Pinckaers, Mart van Rijthoven, Maschenka Balkenhol, Margarita Melnikova, Oscar Geessink, Quirine Manson, Mark Sherman, Antonio Polonia, Jeremy Parry, Mustapha Abubakar, Geert Litjens, Jeroen van der Laak, Francesco Ciompi
Published in: Medical Image Analysis, 58, 2019, Page(s) 101547, ISSN 1361-8415
Publisher: Elsevier BV
DOI: 10.1016/j.media.2019.101547

Data Citation and the Citation Graph

Author(s): Peter Buneman, Dennis Dosso, Matteo Lissandrini, Gianmaria Silvello
Published in: Quantitative Science Studies (QSS), 2022, ISSN 2641-3337
Publisher: MIT Press
DOI: 10.1162/qss_a_00166

Data credit distribution: A new method to estimate databases impact

Author(s): Dennis Dosso, Gianmaria Silvello
Published in: Journal of Informetrics, 14/4, 2020, Page(s) 101080, ISSN 1751-1577
Publisher: Elsevier BV
DOI: 10.1016/j.joi.2020.101080

State-of-the-Art Deep Learning in Cardiovascular Image Analysis

Author(s): Geert Litjens, Francesco Ciompi, Jelmer M. Wolterink, Bob D. de Vos, Tim Leiner, Jonas Teuwen, Ivana Išgum
Published in: JACC: Cardiovascular Imaging, 12/8, 2019, Page(s) 1549-1565, ISSN 1936-878X
Publisher: Elsevier BV
DOI: 10.1016/j.jcmg.2019.06.009

MedTAG: A Portable and Customizable Annotation Tool for Biomedical Documents

Author(s): Fabio Giachelle, Ornella Irrera and Gianmaria Silvello
Published in: BMC Medical Informatics and Decision Making, 2021, ISSN 1472-6947
Publisher: BioMed Central
DOI: 10.1186/s12911-021-01706-4

Staining Invariant Features for Improving Generalization of Deep Convolutional Neural Networks in Computational Pathology

Author(s): Sebastian Otálora, Manfredo Atzori, Vincent Andrearczyk, Amjad Khan, Henning Müller
Published in: Frontiers in Bioengineering and Biotechnology, 7, 2019, ISSN 2296-4185
Publisher: Frontiers
DOI: 10.3389/fbioe.2019.00198

Deep learning-based retrieval system for gigapixel histopathology cases and the open access literature

Author(s): Roger Schaer, Sebastian Otálora, Oscar Jimenez-del-Toro, Manfredo Atzori, Henning Müller
Published in: Journal of Pathology Informatics, 10/1, 2019, Page(s) 19, ISSN 2153-3539
Publisher: Wolters Kluwer Medknow
DOI: 10.4103/jpi.jpi_88_18

Search Text to Retrieve Graphs: A Scalable RDF Keyword-Based Search System

Author(s): Dennis Dosso, Gianmaria Silvello
Published in: IEEE Access, 8, 2020, Page(s) 14089-14111, ISSN 2169-3536
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/ACCESS.2020.2966823

Focal elements of neural information retrieval models. An outlook through a reproducibility study

Author(s): Stefano Marchesin, Alberto Purpura, Gianmaria Silvello
Published in: Information Processing & Management, 2019, Page(s) 102109, ISSN 0306-4573
Publisher: Pergamon Press Ltd.
DOI: 10.1016/j.ipm.2019.102109

Deep learning in histopathology: the path to the clinic

Author(s): J. van der Laak, G. Litjens and F. Ciompi
Published in: Nature Medicine, 2021, ISSN 1546-170X
Publisher: Nature
DOI: 10.1038/s41591-021-01343-4

HookNet: Multi-resolution convolutional neural networks for semantic segmentation in histopathology whole-slide images

Author(s): Mart van Rijthoven, Maschenka Balkenhol, Karina Siliņa, Jeroen van der Laak, Francesco Ciompi
Published in: Medical Image Analysis, 2021, ISSN 1361-8415
Publisher: Elsevier BV
DOI: 10.1016/j.media.2020.101890

An Information Visualization Tool for the Interactive Component-Based Evaluation of Search Engines

Author(s): Giacomo Rocco, Gianmaria Silvello
Published in: Digital Libraries: The Era of Big Data and Data Science - 16th Italian Research Conference on Digital Libraries, IRCDL 2020, Bari, Italy, January 30–31, 2020, Proceedings, 1177, 2020, Page(s) 15-25, ISBN 978-3-030-39904-7
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-39905-4_3

Studying Public Medical Images from the Open Access Literature and Social Networks for Model Training and Knowledge Extraction

Author(s): Henning Müller, Vincent Andrearczyk, Oscar Jimenez del Toro, Anjani Dhrangadhariya, Roger Schaer, Manfredo Atzori
Published in: MultiMedia Modeling - 26th International Conference, MMM 2020, Daejeon, South Korea, January 5–8, 2020, Proceedings, Part II, 11962, 2020, Page(s) 553-564, ISBN 978-3-030-37733-5
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-37734-2_45

A Framework for Citing Nanopublications

Author(s): Erika Fabris, Tobias Kuhn, Gianmaria Silvello
Published in: Digital Libraries for Open Knowledge - 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, September 9-12, 2019, Proceedings, 11799, 2019, Page(s) 70-83, ISBN 978-3-030-30759-2
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-30760-8_6

Developing Unsupervised Knowledge-Enhanced Models to Reduce the Semantic Gap in Information Retrieval

Author(s): S. Marchesin
Published in: 2020
Publisher: UNIPD
DOI: 10.1145/3476415.3476433