CORDIS - EU research results

Piloting a Cooperative Open Web Search Infrastructure to Support Europe's Digital Sovereignty

CORDIS provides links to public deliverables and publications of HORIZON projects.

Links to deliverables and publications from FP7 projects, as well as links to some specific result types such as datasets and software, are dynamically retrieved from OpenAIRE.

Deliverables

Dissemination, Exploitation and Communication (DEC) Report V1

First version of the Dissemination, Exploitation and Communication (DEC) report.

ELSA-catalogue & code of conduct for open Web search

Initial version of the ELSA-catalogue and code of conduct for open Web search.

Model governance for federating an open search infrastructure V1

Version 1 of the model governance for federating an open search infrastructure.

Report on scientific cooperation, community building and stakeholder involvement V1

Initial version of the report on scientific cooperation, community building and stakeholder involvement.

Report of privacy, transparency, and trust models for search applications V1

First version of the report on privacy, transparency, and trust models for search applications.

Launch of the Pilot infrastructure

Crawler Coordination Software Stack & Demonstrator V1

Open Source Software Stack for coordinating multiple, distributed and usually independent crawlers.

The OpenWebSearch Hub and the Open Web Index V1

First version of the OpenWebSearch Hub and the Open Web Index, indexing common crawls and providing initial specifications.

Publications

Cross-Market Product-Related Question Answering

Author(s): Ghasemi, Negin; Aliannejadi, Mohammad; Bonab, Hamed; Kanoulas, Evangelos; de Vries, Arjen P.; Allan, James; Hiemstra, Djoerd
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Publisher: ACM
DOI: 10.1145/3539618.3591658

Adaptive Orchestration of Modular Generative Information Access Systems

Author(s): Mohanna Hoveyda, Harrie Oosterhuis, Arjen P. de Vries, Maarten de Rijke, Faegheh Hasibi
Published in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
Publisher: ACM
DOI: 10.1145/3726302.3730351

A Longitudinal Study of Content Control Mechanisms

Author(s): Michael Dinzinger, Michael Granitzer
Published in: Companion Proceedings of the ACM Web Conference 2024, 2024
Publisher: ACM
DOI: 10.1145/3589335.3651893

A User Study on the Acceptance of Native Advertising in Generative IR

Author(s): Ines Zelch, Matthias Hagen and Martin Potthast
Published in: Proceedings of the 2024 Conference on Human Information Interaction and Retrieval (CHIIR '24), 2024, ISBN 979-8-4007-0434-5
Publisher: ACM
DOI: 10.1145/3627508.3638316

PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents

Author(s): Saber Zerhoudi, Michael Granitzer
Published in: 2024
Publisher: CEUR-WS
DOI: 10.48550/arXiv.2407.09394

Challenges of Index Exchange for Search Engine Interoperability

Author(s): Hiemstra, D., Hendriksen, G., Kamphuis, C., & de Vries, A. P.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.5281/zenodo.10529619

Overview of Touché 2023: Argument and Causal Retrieval

Author(s): Alexander Bondarenko, Maik Fröbe, Johannes Kiesel, Ferdinand Schlatt, Valentin Barriere, Brian Ravenet, Léo Hemamou, Simon Luck, Jan Heinrich Reimer, Benno Stein, Martin Potthast, and Matthias Hagen
Published in: Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023. Lecture Notes in Computer Science, 2023, ISBN 978-3-031-42447-2
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-42448-9_31

Automating License-Aware Full-Text Retrieval for Systematic Reviews: An End-To-End Scalable System to Reduce Reviewer Workload

Author(s): Zhuk, D., Sandner, E., Jakovljevic, I., Simniceanu, A., Fontana, L., Henriques, A., Wagner, A., and Gütl, C.
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17220241

Manipulating Embeddings of Stable Diffusion Prompts

Author(s): Niklas Deckers, Julia Peters, Martin Potthast
Published in: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence IJCAI '24, 2024
Publisher: International Joint Conferences on Artificial Intelligence Organization
DOI: 10.24963/IJCAI.2024/845

On Stance Detection in Image Retrieval for Argumentation

Author(s): Carnot, Miriam Louise; Schreieder, Tobias; Heinemann, Lorenz; Kiesel, Johannes; Braker, Jan; Fröbe, Maik; Potthast, Martin; Stein, Benno
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Publisher: ACM
DOI: 10.1145/3539618.3591917

Overview of PAN 2023: Authorship Verification, Multi-Author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection

Author(s): Janek Bevendorff, Ian Borrego-Obrador, Mara Chinea-Ríos, Marc Franco-Salvador, Maik Fröbe, Annina Heini, Krzysztof Kredens, Maximilian Mayerl, Piotr Pęzik, Martin Potthast, Francisco Rangel, Paolo Rosso, Efstathios Stamatatos, Benno Stein, Matti Wiegmann,
Published in: Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023. Lecture Notes in Computer Science, 2023, ISBN 978-3-031-42447-2
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-42448-9_29

Evaluating Generative Ad Hoc Information Retrieval

Author(s): Gienapp, Lukas; Scells, Harrisen; Deckers, Niklas; Bevendorff, Janek; Wang, Shuai; Kiesel, Johannes; Syed, Shahbaz; Fröbe, Maik; Zuccon, Guido; Stein, Benno; Hagen, Matthias; Potthast, Martin
Published in: SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
Publisher: ACM
DOI: 10.1145/3626772.3657849

The Viability of Crowdsourcing for RAG Evaluation

Author(s): Lukas Gienapp, Tim Hagen, Maik Fröbe, Matthias Hagen, Benno Stein, Martin Potthast, Harrisen Scells
Published in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
Publisher: ACM
DOI: 10.1145/3726302.3730093

Conceptual Design and Implementation of a Prototype Search Application using the Open Web Search Index

Author(s): Nussbaumer, A., Kaushik, R., Hendriksen, G., Gürtl, S., & Gütl, C.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.5281/zenodo.10636166

Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-ranking

Author(s): Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen
Published in: Lecture Notes in Computer Science, Advances in Information Retrieval, 2025
Publisher: Springer Nature Switzerland
DOI: 10.48550/ARXIV.2405.07920

Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation

Author(s): Shuai Wang; Harrisen Scells; Bevan Koopman; Martin Potthast; Guido Zuccon
Published in: SIGIR-AP '23: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 2023
DOI: 10.48550/arxiv.2309.05238

Score-Fitted Indexes and Constant Length Indexes for Information Retrieval

Author(s): Djoerd Hiemstra
Published in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
Publisher: ACM
DOI: 10.1145/3726302.3730248

Supporting Vertical Web Search and Customized Search Applications with the Modular and Open Framework MOSAIC

Author(s): Sebastian Gürtl, Alexander Nussbaumer, Christian Gütl
Published in: Proceedings of the 2nd International Workshop on Open Web Search Co-located With the 47th European Conference on Information Retrieval (ECIR 2025), 2025
Publisher: CEUR-WS

Generative Agents Navigating Digital Libraries

Author(s): Saber Zerhoudi, Michael Granitzer
Published in: Lecture Notes in Computer Science, Sustainability and Empowerment in the Context of Digital Libraries, 2024
Publisher: Springer Nature Singapore
DOI: 10.1007/978-981-96-0865-2_14

Federated Data Infrastructure for the Open Web Search

Author(s): Fathima, N. A., Golasowski, M., Granitzer, M., Wagner, A., Ariyo, C., Hendriksen, G., Truckenbrodt, J., Mankinen, K., Dinzinger, M., Karlsson, M., Hayek, M., Moiras, S., Vojacek, L., Hachinger, S., & Martinovič, J.
Published in: 2024
Publisher: Zenodo
DOI: 10.5281/ZENODO.13872163

Simulating Follow-up Questions in Conversational Search

Author(s): Kiesel, J., Gohsen, M., Mirzakhmedova, N., Hagen, M., Stein, B.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14609, 2024, ISBN 978-3-031-56059-0
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-56060-6_25

Corpus Subsampling: Estimating the Effectiveness of Neural Retrieval Models on Large Corpora

Author(s): Maik Fröbe, Andrew Parry, Harrisen Scells, Shuai Wang, Shengyao Zhuang, Guido Zuccon, Martin Potthast, Matthias Hagen
Published in: Lecture Notes in Computer Science, Advances in Information Retrieval, 2025
Publisher: Springer Nature Switzerland
DOI: 10.1007/978-3-031-88708-6_29

The Information Retrieval Experiment Platform

Author(s): Fröbe, Maik; Deckers, Niklas; Stein, Benno; Reimer, Jan Heinrich; Reich, Simon; Hagen, Matthias; MacAvaney, Sean; Bevendorff, Janek; Potthast, Martin
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Publisher: ACM
DOI: 10.48550/ARXIV.2305.18932

Designing an Integration Concept of the Provenance Verification Indicator into Open Web Search Engines

Author(s): Nussbaumer, Alexander; Ebner, Sylvia M.; Gütl, Christian; Munnelly, Gary; Spillane, Brendan; Conlan, Owen; Plote, Christine; Frank, Anton
Published in: Proceedings of the 5th International Open Search Symposium #ossym2022, 2022
Publisher: CERN
DOI: 10.5281/ZENODO.8064758

Segmentation of Argumentative Texts by Key Statements for Argument Mining from the Web

Author(s): Ines Zelch, Matthias Hagen, Benno Stein, Johannes Kiesel
Published in: Proceedings of the 12th Argument mining Workshop, 2025
Publisher: Association for Computational Linguistics
DOI: 10.18653/V1/2025.ARGMINING-1.22

Architecting the Opensearch Service at CERN For OpenWebSearch.EU

Author(s): Fathima, N. A., Granitzer, M., Dinzinger, M., & Wagner, A.
Published in: 2024
Publisher: Zenodo
DOI: 10.5281/zenodo.13872517

In a Few Words: Comparing Weak Supervision and LLMs for Short Query Intent Classification

Author(s): Daria Alexander, Arjen P. de Vries
Published in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
Publisher: ACM
DOI: 10.1145/3726302.3730213

Indicative Summarization of Long Discussions

Author(s): Syed, Shahbaz; Schwabe, Dominik; Al-Khatib, Khalid; Potthast, Martin
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Publisher: ACL
DOI: 10.48550/arxiv.2311.01882

Continuous Integration for Reproducible Shared Tasks with TIRA.io

Author(s): Maik Fröbe, Matti Wiegmann, Nikolay Kolyada, Bastian Grahm, Theresa Elstner, Frank Loebe, Matthias Hagen, Benno Stein, and Martin Potthast
Published in: Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), 2023, ISBN 978-3-031-28240-9
Publisher: Springer
DOI: 10.1007/978-3-031-28241-6_20

SemEval-2023 Task 5: Clickbait Spoiling

Author(s): Maik Fröbe, Tim Gollub, Benno Stein, Matthias Hagen, and Martin Potthast
Published in: Proceedings of 17th International Workshop on Semantic Evaluation (SemEval 2023), 2023
Publisher: ACL
DOI: 10.18653/v1/2023.semeval-1.315

An Empirical Comparison of Web Content Extraction Algorithms

Author(s): Bevendorff, Janek; Kiesel, Johannes; Gupta, Sanket; Stein, Benno
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023, ISBN 978-1-4503-9408-6
Publisher: ACM
DOI: 10.1145/3539618.3591920

Efficient Session Search using Topical Index Shards

Author(s): Hendriksen, G., Hiemstra, D., and de Vries, A. P.
Published in: 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17238317

Overview of the “Voight-Kampff” Generative AI Authorship Verification Task at PAN and ELOQUENT 2025 Working notes

Author(s): Janek Bevendorff; Yuxia Wang; Jussi Karlgren; Matti Wiegmann; Maik Fröbe; Akim Tsivgun; Jinyan Su; Zhuohan Xie; Mervat Abassy; Jonibek Mansurov; Rui Xing; Minh Ngoc Ta; Kareem Ashraf Elozeiri; Tianle Gu
Published in: Working Notes of the Conference and Labs of the Evaluation Forum CLEF 2025, 2025
Publisher: CEUR-WS

The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives

Author(s): Reimer, Jan Heinrich; Gienapp, Lukas; Schmidt, Sebastian; Scells, Harrisen; Fröbe, Maik; Stein, Benno; Hagen, Matthias; Potthast, Martin
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Publisher: ACM
DOI: 10.48550/ARXIV.2304.00413

MMEAD: MS MARCO Entity Annotations and Disambiguations

Author(s): Kamphuis, Chris; Lin, Jimmy; Lin, Aileen; de Vries, Arjen P.; Yang, Siwen; Hasibi, Faegheh
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Publisher: ACM
DOI: 10.1145/3539618.3591887

Objective Argument Summarization in Search

Author(s): Timon Ziegenbein; Shahbaz Syed; Martin Potthast; Henning Wachsmuth
Published in: Lecture Notes in Computer Science, Proceedings of Robust Argumentation Machines RATIO 2024, Issue 14638, 2024
Publisher: Springer, Cham
DOI: 10.15488/18759

Team OpenWebSearch at LongEval: Using Historical Data for Scientific Search

Author(s): Daria Alexander, Maik Fröbe, Gijs Hendriksen, Matthias Hagen, Djoerd Hiemstra, Martin Potthast, and Arjen P. de Vries
Published in: Working Notes of the Conference and Labs of the Evaluation Forum CLEF 2025, 2025
Publisher: CEUR-WS

Overview of Touché 2024: Argumentation Systems

Author(s): Kiesel, J. et al.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14612, 2024, ISBN 978-3-031-56068-2
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-56069-9_64

Overview of Touché 2025: Argumentation Systems

Author(s): Johannes Kiesel, Çağrı Çöltekin, Marcel Gohsen, Sebastian Heineking, Maximilian Heinrich, Maik Fröbe, Tim Hagen, Mohammad Aliannejadi, Sharat Anand, Tomaž Erjavec, Matthias Hagen, Matyáš Kopp, Nikola Ljubešić, Katja Meden, Nailia Mirzakhmedova, Vaidas Morkevičius, Harrisen Scells, Moritz Wolter, Ines Zelch, Martin Potthast, Benno Stein
Published in: Lecture Notes in Computer Science, Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2025
Publisher: Springer Nature Switzerland
DOI: 10.1007/978-3-032-04354-2_25

WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval

Author(s): Michael Dinzinger, Laura Caspari, Kanishka Ghosh Dastidar, Jelena Mitrović, Michael Granitzer
Published in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
Publisher: ACM
DOI: 10.48550/ARXIV.2502.20936

Impact of Tokenization Techniques on URL Classification

Author(s): Al-Maamari, Mohammed; Mahmoud, Istaiti; Zerhoudi, Saber; Dinzinger, Michael; Granitzer, Michael; Mitrović, Jelena
Published in: 6th International Open Search Symposium (OSSYM2024), 2024
Publisher: CERN
DOI: 10.5281/ZENODO.13863879

Axioms for Retrieval-Augmented Generation

Author(s): Jan Heinrich Merker, Maik Fröbe, Benno Stein, Martin Potthast, Matthias Hagen
Published in: Proceedings of the 2025 International ACM SIGIR Conference on Innovative Concepts and Theories in Information Retrieval (ICTIR), 2025
Publisher: ACM
DOI: 10.1145/3731120.3744601

The Two Paradigms of LLM Detection: Authorship Attribution vs. Authorship Verification

Author(s): Janek Bevendorff, Matti Wiegmann, Emmelie Richter, Martin Potthast, Benno Stein
Published in: Findings of the Association for Computational Linguistics: ACL 2025, 2025
Publisher: Association for Computational Linguistics
DOI: 10.18653/V1/2025.FINDINGS-ACL.194

Overview of PAN 2025: Generative AI Detection, Multilingual Text Detoxification, Multi-author Writing Style Analysis, and Generative Plagiarism Detection

Author(s): Janek Bevendorff, Daryna Dementieva, Maik Fröbe, Bela Gipp, André Greiner-Petter, Jussi Karlgren, Maximilian Mayerl, Preslav Nakov, Alexander Panchenko, Martin Potthast, Artem Shelmanov, Efstathios Stamatatos, Benno Stein, Yuxia Wang, Matti Wiegmann, Eva Zangerle
Published in: Lecture Notes in Computer Science, Advances in Information Retrieval, 2025
Publisher: Springer Nature Switzerland
DOI: 10.1007/978-3-031-88720-8_64

OWLer: A Distributed and Collaborative Open Web Crawler

Author(s): Dinzinger, M., Granitzer, M., Mitrović, J., Zerhoudi, S.
Published in: 6th International Open Search Symposium (OSSYM2024), 2024
Publisher: Zenodo
DOI: 10.5281/ZENODO.13863478

ImageCLEF 2025: Multimedia Retrieval in Medical, Social Media and Content Recommendation Applications

Author(s): Bogdan Ionescu, Henning Müller, Dan-Cristian Stanciu, Ahmad Idrissi-Yaghir, Ahmedkhan Radzhabov, Alba García Seco de Herrera, Alexandra Andrei, Andrea Storås, Asma Ben Abacha, Benjamin Bracke, Benjamin Lecouteux, Benno Stein, Cécile Macaire, Christoph M. Friedrich, Cynthia Sabrina Schmidt, Diandra Fabre, Didier Schwab, Dimitar Dimitrov, Emmanuelle Esperança-Rodier, Gabriel Constantin, Helmut Becker, Hendrik Damm, Henning Schäfer, Ivan Rodkin, Ivan Koychev, Johannes Kiesel, Johannes Rückert, Josep Malvehy, Liviu-Daniel Ștefan, Louise Bloch, Martin Potthast, Maximilian Heinrich, Michael A. Riegler, Mihai Dogariu, Noel Codella, Pål Halvorsen, Preslav Nakov, Raphael Brüngel, Roberto Andres Novoa, Rocktim Jyoti Das, Steven A. Hicks, Sushant Gautam, Tabea M. G. Pakull, Vajira Thambawita, Vassili Kovalev, Wen-Wai Yim, Zhuohan Xie
Published in: Lecture Notes in Computer Science, Advances in Information Retrieval, 2025
Publisher: Springer Nature Switzerland
DOI: 10.1007/978-3-031-88720-8_60

Weighted AUReC: Handling Skew in Shard Map Quality Estimation for Selective Search

Author(s): Hendriksen, G., Hiemstra, D., de Vries, A.P.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14611, 2024, ISBN 978-3-031-56065-1
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-56066-8_10

Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins

Author(s): Lukas Gienapp, Niklas Deckers, Martin Potthast, Harrisen Scells
Published in: Proceedings of the 2025 International ACM SIGIR Conference on Innovative Concepts and Theories in Information Retrieval (ICTIR), 2025
Publisher: ACM
DOI: 10.1145/3731120.3744594

Citance-Contextualized Summarization of Scientific Papers

Author(s): Syed, Shahbaz; Hakimi, Ahmad Dawar; Al-Khatib, Khalid; Potthast, Martin
Published in: Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Publisher: ACL
DOI: 10.48550/arxiv.2311.02408

Analyzing Adversarial Attacks on Sequence-to-Sequence Relevance Models

Author(s): Parry, A., Fröbe, M., MacAvaney, S., Potthast, M., Hagen, M.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14609, 2024, ISBN 978-3-031-56059-0
Publisher: Springer, Cham
DOI: 10.48550/arXiv.2403.07654

Enriching Science Search with the Open Search Framework MOSAIC

Author(s): Nussbaumer, A., Gürtl, S., Honeder, J., Hecking, T., & Gütl, C.
Published in: 2024, ISBN 978-92-9083-669-8
Publisher: Zenodo
DOI: 10.5281/ZENODO.13871624

ReNeuIR at SIGIR 2024: The Third Workshop on Reaching Efficiency in Neural Information Retrieval

Author(s): Maik Fröbe; Joel Mackenzie; Bhaskar Mitra; Franco Maria Nardini; Martin Potthast
Published in: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
Publisher: ACM
DOI: 10.1145/3626772.3657994

Bootstrapped nDCG Estimation in the Presence of Unjudged Documents

Author(s): Maik Fröbe, Lukas Gienapp, Martin Potthast, and Matthias Hagen
Published in: Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023), 2023, ISBN 978-3-031-28243-0
Publisher: Springer
DOI: 10.1007/978-3-031-28244-7_20

Architecting the Datastore for the URL Frontier of OpenWebSearch.eu

Author(s): Fathima, Noor A.; Dinzinger, Michael; Granitzer, Michael; Wagner, Andreas
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17246104

UXSim: Towards a Hybrid User Search Simulation

Author(s): Saber Zerhoudi, Michael Granitzer
Published in: Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025
Publisher: ACM
DOI: 10.1145/3746252.3761640

Resources for Combining Teaching and Research in Information Retrieval Coursework

Author(s): Maik Fröbe, Harrisen Scells, Theresa Elstner, Christopher Akiki, Lukas Gienapp, Jan Heinrich Reimer, Sean MacAvaney, Benno Stein, Matthias Hagen, Martin Potthast
Published in: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
Publisher: ACM
DOI: 10.1145/3626772.3657886

OWler: Preliminary results for building a Collaborative Open Web Crawler

Author(s): Dinzinger, M., Al-Maamari, M., Zerhoudi, S., Istaiti, M., Mitrović, J., & Granitzer, M.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.5281/zenodo.10581841

Detecting Generated Native Ads in Conversational Search

Author(s): Sebastian Schmidt, Ines Zelch, Janek Bevendorff, Benno Stein, Matthias Hagen, Martin Potthast
Published in: Companion Proceedings of the ACM Web Conference 2024, 2024
Publisher: ACM
DOI: 10.1145/3589335.3651489

Understanding and Mitigating Cognitive Bias during Web Search

Author(s): Hitzginger, S., Nussbaumer, A., Gütl, C., & Ruß-Baumann, C.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.5281/zenodo.10607402

The Open Web Index

Author(s): Gijs Hendriksen; Michael Dinzinger; Sheikh Mastura Farzana; Noor Afshan Fathima; Maik Fröbe; Sebastian Schmidt; Saber Zerhoudi; Michael Granitzer; Matthias Hagen; Djoerd Hiemstra; Martin Potthast; Benno Stein
Published in: Advances in Information Retrieval: 46th European Conference on Information Retrieval, ECIR 2024, Proceedings, Part V, 2024
Publisher: Springer Nature Switzerland
DOI: 10.1007/978-3-031-56069-9_10

Trigger Warnings: Bootstrapping a Violence Detector for Fan Fiction

Author(s): Magdalena Wolska, Matti Wiegmann, Christopher Schröder, Ole Borchardt, Benno Stein, and Martin Potthast
Published in: Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Publisher: ACL
DOI: 10.18653/v1/2023.findings-emnlp.41

Commercialized Generative AI: A Critical Study of the Feasibility and Ethics of Generating Native Advertising Using Large Language Models in Conversational Web Search

Author(s): Zelch, I., Hagen, M., and Potthast, M.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.48550/arXiv.2310.04892

Beyond Benchmarks: Evaluating Embedding Model Similarity for Retrieval Augmented Generation Systems

Author(s): Caspari, Laura; Dastidar, Kanishka Ghosh; Zerhoudi, Saber; Mitrović, Jelena; Granitzer, Michael
Published in: 2024
Publisher: CEUR-WS
DOI: 10.48550/arXiv.2407.08275

Who Will Evaluate the Evaluators? Exploring the Gen-IR User Simulation Space

Author(s): Johannes Kiesel, Marcel Gohsen, Nailia Mirzakhmedova, Matthias Hagen, Benno Stein
Published in: Lecture Notes in Computer Science, Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2024
Publisher: Springer Nature Switzerland
DOI: 10.5281/ZENODO.18668055

Investigating the Effects of Sparse Attention on Cross-Encoders

Author(s): Schlatt, F., Fröbe, M., Hagen, M.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14608, 2024, ISBN 978-3-031-56027-9
Publisher: Springer, Cham
DOI: 10.48550/arXiv.2312.17649

A Comprehensive Dataset for Webpage Classification

Author(s): Al-Maamari, M., Istaiti, M., Zerhoudi, S., Dinzinger, M., Granitzer, M., & Mitrović, J.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.5281/zenodo.10594210

Web-Scale Retrieval Experimentation with chatnoir-pyterrier

Author(s): Jan Heinrich Merker, Janek Bevendorff, Maik Fröbe, Tim Hagen, Harrisen Scells, Matti Wiegmann, Benno Stein, Matthias Hagen, Martin Potthast
Published in: Lecture Notes in Computer Science, Advances in Information Retrieval, 2025
Publisher: Springer Nature Switzerland
DOI: 10.1007/978-3-031-88720-8_17

Assessing the Reliability of Human and LLM-Based Screening in Systematic Reviews: A Study on First-Time Reviewers

Author(s): Sandner, E., Scharf, D., Wautischar, T., Jakovljevic, I., Simniceanu, A., Fontana, L., Henriques, A., Wagner, A., & Gütl, C.
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17234055

A Charter for Public Interest Internet Search

Author(s): Christine Plote, Alexander Nussbaumer
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17251787

Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification

Author(s): Bevendorff et al.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14613, 2024, ISBN 978-3-031-56071-2
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-56072-9_1

The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

Author(s): Nailia Mirzakhmedova, Johannes Kiesel, Milad Alshomary, Maximilian Heinrich, Nicolas Handke, Xiaoni Cai, Valentin Barriere, Doratossadat Dastgheib, Omid Ghahroodi, MohammadAli SadraeiJavaheri, Ehsaneddin Asgari, Lea Kawaletz, Henning Wachsmuth, Benno Stei
Published in: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024
Publisher: ELRA and ICCL
DOI: 10.48550/ARXIV.2301.13771

An Open Source Implementation of Web Clustering Algorithms for Selective Search

Author(s): Hendriksen, G., Hiemstra, D., & de Vries, A.
Published in: 2024
Publisher: Zenodo
DOI: 10.5281/zenodo.13882966

Smooth Operators for Effective Systematic Review Queries

Author(s): Scells, Harrisen; Schlatt, Ferdinand; Potthast, Martin
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Publisher: ACM
DOI: 10.1145/3539618.3591768

Geoparsing at Web-scale - Challenges and Opportunities

Author(s): Farzana, Sheikh Mastura; Hecking, Tobias
Published in: GeoExT 2023: First International Workshop on Geographic Information Extraction from Texts at ECIR 2023 (CEUR Workshop Proceedings), Issue 3385, 2023, ISSN 1613-0073
Publisher: CEUR-WS

A Survey of Web Content Control for Generative AI

Author(s): Michael Dinzinger; Florian Heß; Michael Granitzer
Published in: Proceedings of the First International Workshop on Open Web Search (WOWS 2024), Issue 3689, 2024
Publisher: CEUR-WS
DOI: 10.48550/ARXIV.2404.02309

Product Spam On YouTube: a Case Study

Author(s): Bevendorff, J., Wiegmann, M., Potthast, M., & Stein, B.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.5281/zenodo.10498306

Adding Retrieval Augmented Generation to the MOSAIC Framework

Author(s): Holz, Felix; Scharf, Daniel; Nussbaumer, Alexander; Gürtl, Sebastian
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17209496

UNFair: Search Engine Manipulation, Undetectable by Amortized Inequity

Author(s): De Jonge, Tim; Hiemstra, Djoerd
Published in: FAccT 2023 - Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023
Publisher: ACM
DOI: 10.1145/3593013.3594046

Pybool_ir: A Toolkit for Domain-Specific Search Experiments

Author(s): Scells, Harrisen; Potthast, Martin
Published in: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Publisher: ACM
DOI: 10.1145/3539618.3591819

Set-Encoder: Permutation-Invariant Inter-passage Attention for Listwise Passage Re-ranking with Cross-Encoders

Author(s): Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen
Published in: Lecture Notes in Computer Science, Advances in Information Retrieval, 2025
Publisher: Springer Nature Switzerland
DOI: 10.48550/ARXIV.2404.06912

Overview of the “Voight-Kampff” Generative AI Authorship Verification Task at PAN and ELOQUENT 2024

Author(s): Janek Bevendorff, Matti Wiegmann, Jussi Karlgren, Luise Dürlich, Evangelia Gogoulou, Aarne Talman, Efstathios Stamatatos, Martin Potthast, Benno Stein
Published in: Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), 2024
Publisher: CEUR-WS

Ranking Generated Answers

Author(s): Sebastian Heineking, Jonas Probst, Daniel Steinbach, Martin Potthast, Harrisen Scells
Published in: Lecture Notes in Computer Science, Advances in Information Retrieval, 2025
Publisher: Springer Nature Switzerland
DOI: 10.48550/ARXIV.2408.09831

Systematic Evaluation of Neural Retrieval Models on the Touché 2020 Argument Retrieval Subset of BEIR

Author(s): Nandan Thakur; Luiz Bonifacio; Maik Fröbe; Alexander Bondarenko; Ehsan Kamalloo; Martin Potthast; Matthias Hagen; Jimmy Lin
Published in: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
Publisher: ACM
DOI: 10.48550/ARXIV.2407.07790

Creating Explainable Summaries for Long Scientific Documents using Large Language Models

Author(s): Frank, Sarah; Schäffer, Sebastian; Wagner, Andreas; Steinmaurer, Alexander
Published in: 6th International Open Search Symposium (OSSYM2024), 2024
Publisher: CERN
DOI: 10.5281/ZENODO.13871185

Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines

Author(s): Bevendorff, J., Wiegmann, M., Potthast, M., Stein, B.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14610, 2024, ISBN 978-3-031-56062-0
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-56063-7_4

In-Browser Agentic Web: a Decentralized Approach to Information Access

Author(s): Zerhoudi, Saber; Granitzer, Michael
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17229737

Using The Open Web Index To Create New Search Applications For Research.fi

Author(s): Theodoropoulos, Jason; Kesäniemi, Joonas
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17238350

Open Web Search at LongEval 2023: Reciprocal Rank Fusion on Automatically Generated Query Variants

Author(s): Maik Fröbe, Gijs Hendriksen, Arjen Paul de Vries, and Martin Potthast
Published in: 2023
Publisher: CEUR-WS.org

Advancing Multimedia Retrieval in Medical, Social Media and Content Recommendation Applications with ImageCLEF 2024

Author(s): Ionescu, B. et al.
Published in: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, Issue 14613, 2024, ISBN 978-3-031-56071-2
Publisher: Springer, Cham
DOI: 10.1007/978-3-031-56072-9_6

The German Commons – 154 Billion Tokens of Openly Licensed Text for German Language Models

Author(s): Lukas Gienapp, Christopher Schröder, Stefan Schweter, Christopher Akiki, Ferdinand Schlatt, Arden Zimmermann, Phillipe Genêt, and Martin Potthast
Published in: 2025
Publisher: CoRR
DOI: 10.48550/ARXIV.2510.13996

Large-Scale Graph Visualisation of Open Web Index and its Evolution in Time

Author(s): Kateřina Slaninová; Pavlína Smolková
Published in: 7th International Open Search Symposium (OSSYM2025), 2025
Publisher: CERN
DOI: 10.5281/ZENODO.17238127

Prototyping Open Web Search Applications with TIRA: A Case Study in Research-oriented Teaching

Author(s): Fröbe, M., Elstner, T., Scells, H., Stein, B., & Potthast, M.
Published in: Proceedings of 5th International Open Search Symposium (OSSYM2023), 2023, ISBN 978-92-9083-653-7
Publisher: CERN
DOI: 10.5281/zenodo.10557539

Impact and development of an Open Web Index for Open Web Search

Author(s): Granitzer Michael; Voigt Stefan; Noor Afshan Fathima; Golasowski Martin; Guetl Christian; Hecking Tobias; Gijs Hendriksen; Djoerd Hiemstra; Jan Martinovič; Jelena Mitrović; Izidor Mlakar; Stavros Moiras; Alexander Nussbaumer; Per Öster; Martin Potthast; Marjana Senčar Srdič; Sharikadze Megi; Kateřina Slaninová; Benno Stein; Arjen P. de Vries; Vít Vondrák; Andreas Wagner; Saber Zerhoudi
Published in: JASIST, 2023, ISSN 2330-1635
Publisher: Wiley
DOI: 10.1002/asi.24818

Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins

Author(s): Lukas Gienapp; Niklas Deckers; Martin Potthast; Harrisen Scells
Published in: Proceedings of the 2025 International ACM SIGIR Conference on Innovative Concepts and Theories in Information Retrieval (ICTIR), 2025, ISSN 2331-8422
Publisher: arXiv
DOI: 10.48550/ARXIV.2407.21515
