Methods for Managing Audiovisual Data: Combining Automatic Efficiency with Human Accuracy

Deliverables

The intermediate version of the report presents the findings from evaluating the second iteration of the prototype system, against which the established test scenarios are executed again, along with an evaluation of a first set of quantitative and qualitative criteria. This version includes a final set of recommended improvements to be included in the final MeMAD prototype system.

Data interchange format specification, final version

This final version of the interchange format specification incorporates feedback from the first and second prototype evaluation cycles and defines the final set of criteria and interchange specifications that the final prototype needs to conform to and will be tested against.

Evaluation report, initial version

The initial version of the prototype system evaluation report, based on feedback provided by a select group of end users. This report will present the first user feedback after execution of test scenarios and will recommend an initial set of improvements to the system specifications and evaluation criteria.

Summary of dissemination and communication activities

Summary of dissemination and communication activities undertaken by partners, highlighting the successful generation of new research and commercial projects and collaborations.

Specification of the data interchange format, initial version

The initial version of the data interchange format will define functional and non-functional requirements of the MeMAD prototype system, based on input concerning the tools developed in WP2, WP3, WP4 and WP5. The requirements are laid out with reference to user requirements and are documented with test scenarios and evaluation criteria.

TV programme annotation model

Report on an initial annotation model for TV programming, and on a method for turning scripts and automatic transcriptions into true subtitles that respect the timing and space constraints of captioning.

Report on discourse-aware machine translation for audio-visual data

A report on neural machine translation models with contextual features beyond sentence boundaries.

Report on cross-lingual content retrieval based on automatic translation

A report on the use of machine translation in cross-lingual retrieval of audio-visual content.

Report on multimodal machine translation

A report on models with multimodal input and initial evaluations of their quality.

Best Practice Guide for Video Description

This practical guide will outline principles and models of video description drawing on insights about multimodal translation of audiovisual content and the empirical analysis conducted in this WP.

Specification of the data interchange format, intermediate version

This iteration of the data interchange format updates the specification and future evaluation criteria with feedback and improvements from the first prototype system development and evaluation report.

Evaluation report, final version

This revision of the report will contain a final evaluation of the MeMAD prototype system, performed using a complete set of quantitative and qualitative test criteria for each of the project’s use cases.

Report on comparative analysis of human and machine video description

This deliverable will report the main findings from the comparative analysis of human descriptions of audiovisual content with corresponding machine-based descriptions generated in WP2.

Data management plan

Report on the initial data management life cycles for the data to be collected, processed and generated during the project.

Data management plan, update 1

Updated version of the DMP that covers significant changes in project datasets and data policies that arise during the project.

Data management plan, update 2

Final version of the DMP that covers significant changes in project datasets and data policies that arise during the project.

TV moments detection and linking, final version

Updated implementations of the back-end microservices of MeMAD. These will cover the full span of functionalities foreseen in WP3: annotations with a broader set of entity types, diverse enrichments triggered by a more accurate moment detection. The deliverable will include a report describing the microservices and their evaluation results.

Implementations of methods adapted to enhanced human inputs

Further improvements and additions to the tools and libraries contained in D2.1. These will include documentation and actual methods for speaker segmentation and diarization as well as for visual content analysis of video footage. Contains a report.

Collection of annotations and / or video data resulting from the project

A collection of datasets and media corpora that will act as the project legacy datasets. These will be stored in relevant data repositories following the project DMP. Contains a report that describes the collection.

Multimodally annotated dataset of described video

This deliverable will provide a) transcriptions of a set of audiovisual materials that have audio description and subtitles in at least one project language and b) annotations of relevant visual, auditory and verbal elements, aligned with the corresponding information in the audio description and subtitles. Contains a report that describes the transcriptions and annotations.

TV moments detection and linking, initial version

The first implementation version of the back-end microservices developed in MeMAD. They will cover initial annotation and enrichment services attached to simple highlight moments extracted from programmes. Contains a report that describes the microservices.

Software and demonstration of human-like content description generation

Final versions of the developed visual and aural tools for multimodal content description, combined with standalone demonstrations and documentation in a report. The methods aim at referring to the recurrent objects and persons in the described media contents in human-like intelligent ways.

Libraries and tools for multimodal content analysis

A joint collection of tools and libraries, with their documentation, from Aalto, Eurecom, Lingsoft, LLS and INA. These are needed in the continuation of this work package and also in the task T6.2 Prototype implementation. Contains a report.

Tools and models for multimodal, multilingual and discourse-aware MT

A release of tools and pre-trained models described in D4.1 and D4.2 with a report containing the documentation and user guidelines.

Setup of website with presentation of project and consortium partners

Website with presentation of project and consortium partners, initial setup.

Final website with presentation of project and consortium partners

Final version of the website with presentation of project and consortium partners.

Publications

MEMAD Project: End User Feedback on AI in the Media Production Workflows

Authors: Lauri Saarikoski, Dieter Van Rijsselbergen, Maija Hirvonen, Maarit Koponen, Umut Sulubacak, Kaisa Vitikainen
Published in: Proceedings of IBC 2020, 2020
Publisher: IBC

OpusTools and Parallel Corpus Diagnostics

Authors: Mikko Aulamo, Umut Sulubacak, Sami Virpioja, Jörg Tiedemann
Published in: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 2020, page(s) 3782-3789, ISBN 979-10-95546-34-4
Publisher: European Language Resources Association (ELRA)

MT for Subtitling: Investigating professional translators’ user experience and feedback

Authors: Maarit Koponen, Umut Sulubacak, Kaisa Vitikainen, Jörg Tiedemann
Published in: Proceedings of the 14th Conference of the Association for Machine Translation in the Americas, October 6-9, 2020: 1st Workshop on Post-Editing in Modern-Day Translation, 2020, page(s) 79-92
Publisher: Association for Machine Translation in the Americas

Deep Contextual Attention for Human-Object Interaction Detection

Authors: Tiancai Wang, Rao Muhammad Anwer, Muhammad Haris Khan, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Jorma Laaksonen
Published in: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, page(s) 5693-5701, ISBN 978-1-7281-4803-8
Publisher: IEEE
DOI: 10.1109/iccv.2019.00579

North Sámi morphological segmentation with low-resource semi-supervised sequence labeling

Authors: Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
Published in: Proceedings of the Fifth International Workshop on Computational Linguistics for Uralic Languages, 2019, page(s) 15-26
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/w19-0302

Named Entity Recognition for Spoken Finnish

Authors: Dejan Porjazovski, Juho Leinonen, Mikko Kurimo
Published in: Proceedings of the 2nd International Workshop on AI for Smart TV Content Production, Access and Delivery, 2020, page(s) 25-29, ISBN 9781450381468
Publisher: ACM
DOI: 10.1145/3422839.3423066

The University of Helsinki Submissions to the WMT19 News Translation Task

Authors: Aarne Talman, Umut Sulubacak, Raúl Vázquez, Yves Scherrer, Sami Virpioja, Alessandro Raganato, Arvi Hurskainen, Jörg Tiedemann
Published in: Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 2019, page(s) 412-423, ISBN 978-1-950737-27-7
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/w19-5347

The University of Helsinki Submission to the WMT19 Parallel Corpus Filtering Task

Authors: Raúl Vázquez, Umut Sulubacak, Jörg Tiedemann
Published in: Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), 2019, page(s) 294-300
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/w19-5441

The University of Helsinki Submissions to the WMT19 Similar Language Translation Task

Authors: Yves Scherrer, Raúl Vázquez, Sami Virpioja
Published in: Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2), 2019, page(s) 236-244
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/w19-5432

VIREO-EURECOM @ TRECVID 2019: Ad-hoc Video Search (AVS)

Authors: Phuong Anh Nguyen, Jiaxin Wu, Chong-Wah Ngo, Danny Francis, Benoit Huet
Published in: TRECVID 2019, 23rd International Workshop on Video Retrieval Evaluation, 12-13 November 2019, Gaithersburg, MD, USA, 2019
Publisher: NIST

Fusion of Multimodal Embeddings for Ad-Hoc Video Search

Authors: Danny Francis, Phuong Anh Nguyen, Benoit Huet, Chong-Wah Ngo
Published in: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019, page(s) 1868-1872, ISBN 978-1-7281-5023-9
Publisher: IEEE
DOI: 10.1109/iccvw.2019.00233

Speaker Verification Experiments for Adults and Children Using Shared Embedding Spaces

Authors: Tuomas Kaseva, Hemant Kathania, Aku Rouhe, Mikko Kurimo
Published in: Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), Issue 178:9, 2021, page(s) 86-93, ISBN 978-91-7929-614-8
Publisher: Linköpings universitet

TOMODAPI: A Topic Modeling API to Train, Use and Compare Topic Models

Authors: Pasquale Lisena, Ismail Harrando, Oussama Kandakji, Raphael Troncy
Published in: Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS), 2020, page(s) 132-140
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/2020.nlposs-1.19

MT for subtitling: User evaluation of post-editing productivity

Authors: Maarit Koponen, Umut Sulubacak, Kaisa Vitikainen, Jörg Tiedemann
Published in: Proceedings of the 22nd Annual Conference of the European Association for Machine Translation (EAMT 2020), 2020, page(s) 115-124, ISBN 978-989-33-0589-8
Publisher: European Association for Machine Translation

End-to-end and HMM/DNN ASR in an equal data setting: A Finnish case study

Authors: Aku Rouhe, Astrid Van Camp, Mittul Singh, Hugo Van Hamme, Mikko Kurimo
Published in: Proceedings of Interspeech, 2021
Publisher: International Speech Communication Association

Attention-Based End-To-End Named Entity Recognition From Speech

Authors: Dejan Porjazovski, Juho Leinonen, Mikko Kurimo
Published in: Text, Speech, and Dialogue - 24th International Conference, TSD 2020, Brno, Czech Republic, 2021
Publisher: Springer

Using Artificial Intelligence to Preserve Audiovisual Archives - New Horizons, More Questions

Authors: Jean Carrive
Published in: Proceedings of the 27th ACM International Conference on Multimedia, 2019, page(s) 1-2, ISBN 9781450368896
Publisher: ACM
DOI: 10.1145/3343031.3349583

L-STAP: Learned Spatio-Temporal Adaptive Pooling for Video Captioning

Authors: Danny Francis, Benoit Huet
Published in: Proceedings of the 1st International Workshop on AI for Smart TV Content Production, Access and Delivery - AI4TV '19, 2019, page(s) 33-41, ISBN 9781450369176
Publisher: ACM Press
DOI: 10.1145/3347449.3357484

Finnish ASR with Deep Transformer Models

Authors: Abhilash Jain, Aku Rouhe, Stig-Arne Grönroos, Mikko Kurimo
Published in: Interspeech 2020, 2020, page(s) 3630-3634
Publisher: ISCA
DOI: 10.21437/interspeech.2020-1784

OPUS-MT – Building open translation services for the World

Authors: Jörg Tiedemann, Santhosh Thottingal
Published in: Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 2020, page(s) 479–480, ISBN 978-989-33-0589-8
Publisher: European Association for Machine Translation

PicSOM and EURECOM Experiments in TRECVID 2019

Authors: Hector Laria Mantecon, Jorma Laaksonen, Danny Francis, Benoit Huet
Published in: Proceedings of TRECVID 2019, 2019
Publisher: NIST

INA’s MIREX 2018 music and speech detection system

Authors: David Doukhan, Eliott Lechapt, Marc Evrard, Jean Carrive
Published in: 14th Music Information Retrieval Evaluation eXchange (MIREX), September 2018, Paris, France, 2018
Publisher: The International Music Information Retrieval Systems Evaluation Laboratory

The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT

Authors: Jörg Tiedemann
Published in: Proceedings of the Fifth Conference on Machine Translation, 2020, page(s) 1174–1182, ISBN 978-1-948087-81-0
Publisher: Association for Computational Linguistics

Spherediar: An Effective Speaker Diarization System for Meeting Data

Authors: Tuomas Kaseva, Aku Rouhe, Mikko Kurimo
Published in: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019, page(s) 373-380, ISBN 978-1-7281-0306-8
Publisher: IEEE
DOI: 10.1109/asru46091.2019.9003967

Using Fan-Made Content, Subtitles and Face Recognition for Character-Centric Video Summarization

Authors: Ismail Harrando, Alison Reboud, Pasquale Lisena, Raphaël Troncy, Jorma Laaksonen, Anja Virkkunen, Mikko Kurimo
Published in: Proceedings of the TRECVID 2020 Workshop, 2020
Publisher: NIST

Cognate-aware morphological segmentation for multilingual neural translation

Authors: Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
Published in: Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 2018, page(s) 386-393, ISBN 978-1-948087-81-0
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/w18-6410

The WMT’18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English

Authors: Franck Burlot, Yves Scherrer, Vinit Ravishankar, Ondřej Bojar, Stig-Arne Grönroos, Maarit Koponen, Tommi Nieminen, François Yvon
Published in: Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 2018, page(s) 546-560, ISBN 978-1-948087-81-0
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/w18-6433

Two-Stream Part-Based Deep Representation for Human Attribute Recognition

Authors: Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen
Published in: 2018 International Conference on Biometrics (ICB), 2018, page(s) 90-97, ISBN 978-1-5386-4285-6
Publisher: IEEE
DOI: 10.1109/ICB2018.2018.00024

The MeMAD Submission to the IWSLT 2018 Speech Translation Task

Authors: Umut Sulubacak, Jörg Tiedemann, Aku Rouhe, Stig-Arne Grönroos, Mikko Kurimo
Published in: Proceedings of the International Workshop on Spoken Language Translation, 2018, page(s) 89-94
Publisher: IWSLT

The Aalto system based on fine-tuned AudioSet features for DCASE2018 task2 - general purpose audio tagging

Authors: Zhicun Xu, Peter Smit, Mikko Kurimo
Published in: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), 2018, page(s) 24-28, ISBN 978-952-15-4262-6
Publisher: Tampere University of Technology

Deep Multimodal Features for Movie Genre and Interestingness Prediction

Authors: Olfa Ben-Ahmed, Benoit Huet
Published in: 2018 International Conference on Content-Based Multimedia Indexing (CBMI), 2018, page(s) 1-6, ISBN 978-1-5386-7021-7
Publisher: IEEE
DOI: 10.1109/cbmi.2018.8516504

EURECOM participation in TrecVid VTT 2018

Authors: Danny Francis, Benoit Huet, Bernard Merialdo
Published in: TRECVID 2018, 22nd International Workshop on Video Retrieval Evaluation, November 13-15, 2018, Gaithersburg, USA, 2018
Publisher: NIST

PicSOM Experiments in TRECVID 2018

Authors: Mats Sjöberg, Hamed R. Tavakoli, Zhicun Xu, Hector Laria Mantecon, Jorma Laaksonen
Published in: TRECVID 2018, 22nd International Workshop on Video Retrieval Evaluation, November 13-15, 2018, Gaithersburg, USA, 2018
Publisher: NIST

Morfessor EM+Prune: Improved Subword Segmentation with Expectation Maximization and Pruning

Authors: Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
Published in: Proceedings of the 12th Language Resources and Evaluation Conference, 2020, page(s) 3944–3953, ISBN 979-10-95546-34-4
Publisher: European Language Resources Association

Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models

Authors: Tzu-Jui Julius Wang, Selen Pehlivan, Jorma Laaksonen
Published in: Proceedings of the British Machine Vision Conference (BMVC), 2020
Publisher: British Machine Vision Association

EURECOM at TRECVid AVS 2019

Authors: Danny Francis, Phuong Anh Nguyen, Benoit Huet, Chong-Wah Ngo
Published in: TRECVID 2019, 23rd International Workshop on Video Retrieval Evaluation, 12-13 November 2019, Gaithersburg, MD, USA, 2019
Publisher: NIST

Releasing a Toolkit and Comparing the Performance of Language Embeddings Across Various Spoken Language Identification Datasets

Authors: Matias Lindgren, Tommi Jauhiainen, Mikko Kurimo
Published in: Interspeech 2020, 2020, page(s) 467-471
Publisher: ISCA
DOI: 10.21437/interspeech.2020-2706

The University of Helsinki Submission to the IWSLT2020 Offline Speech Translation Task

Authors: Raúl Vázquez, Mikko Aulamo, Umut Sulubacak, Jörg Tiedemann
Published in: Proceedings of the 17th International Conference on Spoken Language Translation, 2020, page(s) 95-102, ISBN 978-1-952148-07-1
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/2020.iwslt-1.10

Speaker-Aware Training of Attention-Based End-to-End Speech Recognition Using Neural Speaker Embeddings

Authors: Aku Rouhe, Tuomas Kaseva, Mikko Kurimo
Published in: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, page(s) 7064-7068, ISBN 978-1-5090-6631-5
Publisher: IEEE
DOI: 10.1109/icassp40776.2020.9053998

The MeMAD Submission to the WMT18 Multimodal Translation Task

Authors: Stig-Arne Grönroos, Benoit Huet, Mikko Kurimo, Jorma Laaksonen, Bernard Merialdo, Phu Pham, Mats Sjöberg, Umut Sulubacak, Jörg Tiedemann, Raphael Troncy, Raúl Vázquez
Published in: Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 2018, page(s) 603-611, ISBN 978-1-948087-81-0
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/w18-6439

Explainable Zero-Shot Topic Extraction Using a Common-Sense Knowledge Graph

Authors: Ismail Harrando, Raphaël Troncy
Published in: LDK 2021, 3rd Conference on Language, Data and Knowledge, 1-3 September 2021, 2021
Publisher: Dagstuhl Publishing

Geometry-aware Relational Exemplar Attention for Dense Captioning

Authors: Tzu-Jui Julius Wang, Hamed R. Tavakoli, Mats Sjöberg, Jorma Laaksonen
Published in: 1st International Workshop on Multimodal Understanding and Learning for Embodied Applications - MULEA '19, 2019, page(s) 3-11, ISBN 9781450369183
Publisher: ACM
DOI: 10.1145/3347450.3357656

Predicting Media Memorability with Audio, Video, and Text representation

Authors: Alison Reboud, Ismail Harrando, Jorma Laaksonen, Raphaël Troncy
Published in: Working Notes Proceedings of the MediaEval 2020 Workshop, Issue 3, 2020
Publisher: CEUR

AI4TV 2020 - 2nd International Workshop on AI for Smart TV Content Production, Access and Delivery

Authors: Raphaël Troncy, Jorma Laaksonen, Hamed R. Tavakoli, Lyndon Nixon, Vasileios Mezaris, Mohammad Hosseini
Published in: Proceedings of the 28th ACM International Conference on Multimedia, 2020, page(s) 4756-4757, ISBN 9781450379885
Publisher: ACM
DOI: 10.1145/3394171.3421894

Named Entity Recognition as Graph Classification

Authors: Ismail Harrando, Raphaël Troncy
Published in: The Semantic Web - ESWC 2021, 18th Extended Semantic Web Conference, 6-10 June 2021, 2021, ISBN 978-3-030-77385-4
Publisher: Springer

Combining Textual and Visual Modeling for Predicting Media Memorability

Authors: Alison Reboud, Ismail Harrando, Jorma Laaksonen, Danny Francis, Raphaël Troncy, Hector Laria Mantecon
Published in: CEUR Workshop Proceedings - Working Notes Proceedings of the MediaEval 2019 Workshop, Sophia Antipolis, France, 27-30 October 2019, Issue 2670, 2019, ISSN 1613-0073
Publisher: CEUR Workshop Proceedings

Advances in subword-based HMM-DNN speech recognition across languages

Authors: Peter Smit, Sami Virpioja, Mikko Kurimo
Published in: Computer Speech & Language, Issue 66, 2021, page(s) 101158, ISSN 0885-2308
Publisher: Academic Press
DOI: 10.1016/j.csl.2020.101158

Multimodal machine translation through visuals and speech

Authors: Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann
Published in: Machine Translation, Issue 34/2-3, 2020, page(s) 97-147, ISSN 0922-6567
Publisher: Kluwer Academic Publishers
DOI: 10.1007/s10590-020-09250-0

Taking a Cue From the Human

Authors: Kim Linda Starr, Sabine Braun, Jaleh Delfani
Published in: Journal of Audiovisual Translation, Issue 3/2, 2020, ISSN 2617-9148
Publisher: European Association for Studies in Screen Translation
DOI: 10.47476/jat.v3i2.2020.138

Transfer learning and subword sampling for asymmetric-resource one-to-many neural translation

Authors: Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
Published in: Machine Translation, Issue 34/4, 2020, page(s) 251-286, ISSN 0922-6567
Publisher: Kluwer Academic Publishers
DOI: 10.1007/s10590-020-09253-x

User perspectives on developing technology-assisted access services in public broadcasting

Authors: Maarit Koponen, Tiina Tuominen, Maija Hirvonen, Kaisa Vitikainen, Liisa Tiittula
Published in: Bridge: Trends and Traditions in Translation and Interpreting Studies, Issue 2, 2021, page(s) 47-67, ISSN 2729-8183
Publisher: Nitra: Department of Translation Studies, Faculty of Arts, Constantine the Philosopher University in Nitra

Finding the Right Words

Authors: Sabine Braun, Kim Starr
Published in: Journal of Audiovisual Translation, Issue 2/2, 2019, page(s) 11-35, ISSN 2617-9148
Publisher: European Association for Studies in Screen Translation
DOI: 10.47476/jat.v2i2.103

MediaEval 2018: Predicting Media Memorability

Authors: Romain Cohendet, Claire-Hélène Demarty, Ngoc Q.K. Duong, Mats Sjöberg, Bogdan Ionescu, Thanh Toan Do
Published in: CEUR Workshop Proceedings, Issue 2283, 2018, ISSN 1613-0073
Publisher: CEUR

FaceRec: An Interactive Framework for Face Recognition in Video Archives

Authors: Pasquale Lisena, Jorma Laaksonen, Raphael Troncy
Published in: CEUR Workshop Proceedings, 2021, ISSN 1613-0073
Publisher: CEUR Workshop Proceedings
DOI: 10.5281/zenodo.4764633

Transdisciplinary Analysis of a Corpus of French Newsreels: The ANTRACT Project

Authors: Jean Carrive, Abdelkrim Beloued, Pascale Goetschel, Serge Heiden, Antoine Laurent, Pasquale Lisena, Franck Mazuet, Sylvain Meignier, Bénédicte Pincemin, Géraldine Poels, Raphaël Troncy
Published in: Digital Humanities Quarterly, Issue 15 (1), 2021, ISSN 1938-4122
Publisher: Alliance of Digital Humanities Organizations

Effective video hyperlinking by means of enriched feature sets and monomodal query combinations

Authors: Mohammad Reza Kavoosifar, Daniele Apiletti, Elena Baralis, Paolo Garza, Benoit Huet
Published in: International Journal of Multimedia Information Retrieval, Issue 9/3, 2020, page(s) 215-227, ISSN 2192-6611
Publisher: Springer
DOI: 10.1007/s13735-019-00173-y

Machine translation and fair access to information

Authors: Mary Nurminen, Maarit Koponen
Published in: Translation Spaces, Issue 9/1, 2020, page(s) 150-169, ISSN 2211-3711
Publisher: John Benjamins Publishing
DOI: 10.1075/ts.00025.nur

Describing Gender Equality in French Audiovisual Streams with a Deep Learning Approach

Authors: David Doukhan, Géraldine Poels, Zohra Rezgui, Jean Carrive
Published in: VIEW Journal of European Television History and Culture, Issue 7/14, 2019, page(s) 103, ISSN 2213-0969
Publisher: Netherlands Institute for Sound and Vision
DOI: 10.18146/2213-0969.2018.jethc156

ADEL: ADaptable Entity Linking: A hybrid approach to link entities with linked data for information extraction

Authors: Julien Plu, Giuseppe Rizzo, Raphaël Troncy
Published in: Semantic Web Journal, 2019, ISSN 1570-0844
Publisher: IOS Press

Paragraph-length image captioning using hierarchical recurrent neural networks

Authors: Arturs Polis
Published in: Master's thesis, 2019
Publisher: University of Helsinki

Spherediar – an efficient speaker diarization system for meeting data

Authors: Tuomas Kaseva
Published in: Master's thesis, 2019
Publisher: Aalto University

Visual Storytelling: Captioning of Image Sequences

Authors: Aditya Surikuchi
Published in: Master's thesis, 2019
Publisher: Aalto University

Audio Event Classification Using Deep Learning Methods

Authors: Zhicun Xu
Published in: Master's thesis, 2018
Publisher: Aalto University

Deep Reinforcement Sequence Learning for Visual Captioning

Authors: Héctor Laria Mantecón
Published in: Master's thesis, 2019
Publisher: Aalto University

Semantic representations of images and videos

Authors: Danny Francis
Published in: 2019
Publisher: Eurecom

VIREO @ Video Browser Showdown 2020

Authors: Phuong Anh Nguyen, Jiaxin Wu, Chong-Wah Ngo, Danny Francis, Benoit Huet
Published in: MultiMedia Modeling - 26th International Conference, MMM 2020, Daejeon, South Korea, January 5–8, 2020, Proceedings, Part II, Issue 11962, 2020, page(s) 772-777, ISBN 978-3-030-37733-5
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-37734-2_68

Comparing human and automated approaches to visual storytelling

Authors: Sabine Braun, Kim Starr, Jorma Laaksonen
Published in: Innovation in Audio Description Research, 2020, page(s) 159-196, ISBN 9781003052968
Publisher: Routledge
DOI: 10.4324/9781003052968-9

Introduction: Mapping new horizons in audio description research

Authors: Kim Starr, Sabine Braun
Published in: Innovation in Audio Description Research, 2020, page(s) 1-13, ISBN 9781003052968
Publisher: Taylor and Francis Ltd.

Easy Web API Development with SPARQL Transformer

Authors: Pasquale Lisena, Albert Meroño-Peñuela, Tobias Kuhn, Raphaël Troncy
Published in: The Semantic Web – ISWC 2019 - 18th International Semantic Web Conference, Auckland, New Zealand, October 26–30, 2019, Proceedings, Part II, Issue 11779, 2019, page(s) 454-470, ISBN 978-3-030-30795-0
Publisher: Springer
DOI: 10.1007/978-3-030-30796-7_28

Audio description 2.0: Re-versioning audiovisual accessibility to assist emotion recognition

Authors: Sabine Braun, Kim Starr
Published in: Innovation in Audio Description Research, 2020, page(s) 97-120, ISBN 9781003052968
Publisher: Taylor and Francis Ltd.

A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network

Authors: Lorenzo Canale, Pasquale Lisena, Raphaël Troncy
Published in: The Semantic Web – ISWC 2018 - 17th International Semantic Web Conference, Monterey, CA, USA, October 8–12, 2018, Proceedings, Part I, Issue 11136, 2018, page(s) 91-107, ISBN 978-3-030-00670-9
Publisher: Springer International Publishing
DOI: 10.1007/978-3-030-00671-6_6

Multi-stream Convolutional Networks for Indoor Scene Recognition

Authors: Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Nazar Zaki
Published in: Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Salerno, Italy, September 3–5, 2019, Proceedings, Part I, Issue 11678, 2019, page(s) 196-208, ISBN 978-3-030-29887-6
Publisher: Springer
DOI: 10.1007/978-3-030-29888-3_16

Big Data Analytics for Large‐Scale Multimedia Search

Authors: Stefanos Vrochidis, Benoit Huet, Edward Chang, Ioannis Kompatsiaris
Published in: 2019, ISBN 9781119376996
Publisher: Wiley
DOI: 10.1002/9781119376996

Innovation in Audio Description Research

Authors: Sabine Braun, Kim Starr
Published in: 2020, ISBN 9781003052968
Publisher: Taylor & Francis Ltd
DOI: 10.4324/9781003052968

Détection et classification de visages pour la description de l’égalité femme-homme dans les archives télévisuelles

Authors: Zohra Rezgui
Published in: 2019
Publisher: University of Carthage
DOI: 10.13140/rg.2.2.25957.76005
