NewsEye: A Digital Investigator for Historical Newspapers

Deliverables

Project website (to be continuously updated)

The project will maintain a website that will act as a portal for its communication activities. In M1 a web page will be published to announce and advertise the project. By M8 the full website structure will be in place, integrating social media channels (such as Twitter). The website will be maintained for the duration of the project, with content contributed by all project partners.

Data management plan

The NewsEye project will contribute to the open research data pilot. In accordance with the Horizon 2020 guidelines for Research Data Management (http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf), a Data Management Plan will be written during the first six months, explaining what data will be generated, collected, shared and curated during the project as well as after the project's end. It will consider the different kinds of research outcomes (WP6) and data (WP2-5) resulting from the project. One important goal of NewsEye is to make its data findable, accessible, interoperable and reusable (FAIR).

Advanced tool to query the enriched data sets (final)

Report on the software to query the data sets (M6). The first version is delivered early, at M6, to allow querying the data set as soon as possible, without the semantic enrichment produced in other deliverables of WP3. The second version, at M12, reports on the software to analyze the data and the enriched data sets. It allows querying both the data set and the enriched data set, including the semantic text enrichment to be produced in the rest of WP3 (D3.1-D3.3).

Publications

Adaptive Edit-Distance and Regression Approach for Post-OCR Text Correction

Author(s): Thi-Tuyet-Hai Nguyen, Mickael Coustaty, Antoine Doucet, Adam Jatowt, Nhu-Van Nguyen
Published in: Maturity and Innovation in Digital Libraries - 20th International Conference on Asia-Pacific Digital Libraries, ICADL 2018, Hamilton, New Zealand, November 19-22, 2018, Proceedings, Issue 11279, 2018, Page(s) 278-289
DOI: 10.1007/978-3-030-04257-8_29

Evaluating the Impact of OCR Errors on Topic Modeling

Author(s): Stephen Mutuvi, Antoine Doucet, Moses Odeo, Adam Jatowt
Published in: Maturity and Innovation in Digital Libraries - 20th International Conference on Asia-Pacific Digital Libraries, ICADL 2018, Hamilton, New Zealand, November 19-22, 2018, Proceedings, Issue 11279, 2018, Page(s) 3-14
DOI: 10.1007/978-3-030-04257-8_1

Large Scale Analysis of Semantic and Temporal Aspects in Cultural Heritage Collection's Search

Author(s): Sumikawa, Yasunobu; Jatowt, Adam; Doucet, Antoine; Moreux, Jean-Philippe
Published in: 2019 JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), Urbana-Champaign, Illinois, June 2-6, 2019, Issue yearly, 2019, Page(s) 77-86
DOI: 10.1109/jcdl.2019.00021

Deep Statistical Analysis of OCR Errors for Effective Post-OCR Processing

Author(s): Nguyen, Thi-Tuyet-Hai; Jatowt, Adam; Coustaty, Mickael; Nguyen, Nhu-Van; Doucet, Antoine
Published in: 2019 JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), Urbana-Champaign, Illinois, June 2-6, 2019, Issue yearly, 2019, Page(s) 29-38
DOI: 10.1109/jcdl.2019.00015

Towards Data-Driven Generation of Visualizations for Automatically Generated News Articles

Author(s): Rola Alhalaseh, Myriam Munezero, Miika Leinonen, Leo Leppänen, Jari Avikainen, Hannu Toivonen
Published in: Proceedings of the 22nd International Academic Mindtrek Conference on - Mindtrek '18, Issue yearly, 2018, Page(s) 100-109
DOI: 10.1145/3275116.3275131

An Analysis of the Performance of Named Entity Recognition over OCRed Documents

Author(s): Hamdi, Ahmed; Jean-Caurant, Axel; Sidere, Nicolas; Coustaty, Mickael; Doucet, Antoine
Published in: 2019 JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), Urbana-Champaign, Illinois, June 2-6, 2019, Issue yearly, 2019, Page(s) 333-334
DOI: 10.1109/jcdl.2019.00057

Wortvektoren

Author(s): Laasch, Bastian Marc
Published in: 2018
DOI: 10.18453/rosdok_id00002309