
Methods for Managing Audiovisual Data: Combining Automatic Efficiency with Human Accuracy


Evaluation report, intermediate version

The intermediate version of the report presents findings of the evaluation of the second iteration of the prototype system, against which the established test scenarios are executed again, along with an evaluation of a first set of quantitative and qualitative criteria. This version includes a final set of recommended improvements to be included in the final MeMAD prototype system.

Data interchange format specification, final version

This final version of the interchange format specification incorporates feedback from the first and second prototype evaluation cycles and defines the final set of criteria and interchange specifications that the final prototype needs to conform to and will be tested against.

Evaluation report, initial version

The initial version of the prototype system evaluation report, based on feedback provided by a select group of end users. This report will present the first user feedback after execution of test scenarios and will recommend an initial set of improvements to the system specifications and evaluation criteria.

Specification of the data interchange format, initial version

The initial version of the data interchange format will define functional and non-functional requirements of the MeMAD prototype system, based on input concerning the tools developed in WP2, WP3, WP4 and WP5. The requirements are laid out in reference to user requirements and are documented with test scenarios and evaluation criteria.

TV programme annotation model

Report on an initial annotation model for TV programming and on a method for deriving true subtitles from scripts and automatic transcriptions while respecting the timing and space constraints of captioning.

Report on discourse-aware machine translation for audio-visual data

A report on neural machine translation models with contextual features beyond sentence boundaries.

Report on multimodal machine translation

A report on models with multimodal input and initial evaluations of their quality.

Specification of the data interchange format, intermediate version

This iteration of the data interchange format updates the specification and future evaluation criteria with feedback and improvements from the first prototype system development and evaluation report.

Report on comparative analysis of human and machine video description

This deliverable will report the main findings from the comparative analysis of human descriptions of audiovisual content with corresponding machine-based descriptions generated in WP2.

Data management plan

Report on the initial data management life cycles for the data to be collected, processed and generated during the project.

Data management plan, update 1

Updated version of DMP that covers significant changes in project datasets and data policies that arise during the project.

Setup of website with presentation of project and consortium partners

Website with presentation of project and consortium partners, initial setup.

Multimodally annotated dataset of described video

This deliverable will provide a) transcriptions of a set of audiovisual materials that have audio description and subtitles in at least one project language and b) annotations of relevant visual, auditory and verbal elements, aligned with the corresponding information in the audio description and subtitles. Contains a report that describes the transcriptions and annotations.

Libraries and tools for multimodal content analysis

A joint collection of tools, libraries and their documentation from Aalto, Eurecom, Lingsoft, LLS and INA. These are needed for the continuation of this work package and for task T6.2 Prototype implementation. Contains a report.



Cognate-aware morphological segmentation for multilingual neural translation

Author(s): Grönroos, Stig-Arne; Virpioja, Sami; Kurimo, Mikko
Published in: Third Conference on Machine Translation (WMT18); Brussels, Belgium, 2018, Page(s) 390-397

The WMT'18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English

Author(s): Burlot, Franck; Scherrer, Yves; Ravishankar, Vinit; Bojar, Ondřej; Grönroos, Stig-Arne; Koponen, Maarit; Nieminen, Tommi; Yvon, François
Published in: Third Conference on Machine Translation (WMT18); Brussels, Belgium, 2018, Page(s) 550-564

Two-Stream Part-Based Deep Representation for Human Attribute Recognition

Author(s): Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen
Published in: 2018 International Conference on Biometrics (ICB), 2018, Page(s) 90-97
DOI: 10.1109/ICB2018.2018.00024

The MeMAD Submission to the IWSLT 2018 Speech Translation Task

Author(s): Sulubacak, Umut; Tiedemann, Jörg; Rouhe, Aku; Grönroos, Stig-Arne; Kurimo, Mikko
Published in: Proceedings of the International Workshop on Spoken Language Translation, 2018, Page(s) 89-94

The MeMAD Submission to the WMT18 Multimodal Translation Task

Author(s): Grönroos, Stig-Arne; Huet, Benoit; Kurimo, Mikko; Laaksonen, Jorma; Merialdo, Bernard; Pham, Phu; Sjöberg, Mats; Sulubacak, Umut; Tiedemann, Jörg; Troncy, Raphaël; Vázquez Carrillo, Juan Raúl
Published in: Proceedings of the Third Conference on Machine Translation (WMT), Volume 2: Shared Task Papers, 2018, Page(s) 609-617

The Aalto system based on fine-tuned AudioSet features for DCASE2018 task2 - general purpose audio tagging

Author(s): Zhicun Xu, Peter Smit, and Mikko Kurimo
Published in: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), 2018, Page(s) 24-28

Deep Multimodal Features for Movie Genre and Interestingness Prediction

Author(s): Olfa Ben-Ahmed, Benoit Huet
Published in: 2018 International Conference on Content-Based Multimedia Indexing (CBMI), 2018, Page(s) 1-6
DOI: 10.1109/cbmi.2018.8516504

EURECOM participation in TrecVid VTT 2018

Author(s): Francis, Danny; Huet, Benoit; Merialdo, Bernard
Published in: TRECVID 2018, 22nd International Workshop on Video Retrieval Evaluation, November 13-15, 2018, Gaithersburg, USA, 2018

PicSOM Experiments in TRECVID 2018

Author(s): Mats Sjöberg, Hamed R. Tavakoli, Zhicun Xu, Hector Laria Mantecon and Jorma Laaksonen
Published in: TRECVID 2018, 22nd International Workshop on Video Retrieval Evaluation, November 13-15, 2018, Gaithersburg, USA, 2018

MediaEval 2018: Predicting Media Memorability

Author(s): Cohendet, Romain; Demarty, Claire-Hélène; Duong, Ngoc Q.K.; Sjöberg, Mats; Ionescu, Bogdan; Do, Thanh Toan
Published in: CEUR Workshop Proceedings, Issue 2283, 2018, ISSN 1613-0073

A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network

Author(s): Lorenzo Canale, Pasquale Lisena, Raphaël Troncy
Published in: The Semantic Web – ISWC 2018 - 17th International Semantic Web Conference, Monterey, CA, USA, October 8–12, 2018, Proceedings, Part I, Issue 11136, 2018, Page(s) 91-107
DOI: 10.1007/978-3-030-00671-6_6