Real time network, text, and speaker analytics for combating organized crime

Deliverables

Preliminary report on network analysis

Initial report and system on network analysis (NA).

Initial speech/text/video technologies

A set of software and associated report for rapid deployment of speech, NLP and video technologies for early integration and system testing.

Description of the integration toolkit, guidelines, plan

Preliminary ROXANNE integration platform developed for the first field test. Technical description of the tools to be developed to ease integration of the different components, together with a set of guidelines.

Technical specifications and detailed architecture

Technical specifications and detailed architecture report: Technical specifications and the design architecture of the platform and integration framework given the requirements.

The project's communications plan update M18

Updated version at M18 of the project's communications plan.

The project's dissemination and exploitation plan

Interim version at M5, with updated versions at M18 and M36. The M36 version will be prepared for those partners exploiting the project results and for stakeholders using the results after EU funding ends.

Overview and analysis of lawfully intercepted data

Overview and analysis of lawfully intercepted and publicly available data: Report providing an initial overview of the investigation and public data available for ROXANNE, plus the legal framework.

Risk Assessment

Risk assessment of the whole project.

Initial report on compliance with ethical principles

Initial report (checklist brochure) on compliance with ethical/societal/fundamental/privacy principles: Summary of initial results of T3.1-T3.4. Security advisory board introduced, update M36 in D3.4.

The project's communications plan

Interim version of this plan at M4, with updated versions at M18 and M36.

Development of a decision-making mechanism

Development of a decision-making mechanism for ensuring compliance: Report on how to set up and operate the mechanism, with a list of questions that can be used.

Training manual Volume I

Development of the online dynamic manual, regarding the integrated solution accessed by all end-users (updated at M19, M29).

Creation of the project's identity and website

Creation of the project’s identity, website and online accounts to communicate, inform, create dialogue and promote use of the project results.

Publications

German News Article Classification: A Multichannel CNN Approach

Author(s): Parida, Shantipriya; Motlicek, Petr; Dash, Satya Ranjan
Published in: Proceedings of the 2nd International Conference on Emerging Trends and Advances in Electrical Engineering and Renewable Energy (ETAEERE-2020), 2020

ODIANLP’s Participation in WAT2020

Author(s): Shantipriya Parida, Petr Motlicek, Amulya Ratna Dash, Satya Ranjan Dash, Debasish Kumar Mallick, Satya Prakash Biswal, Priyanka Pattnaik, Biranchi Narayan Nayak, Ondřej Bojar
Published in: Proceedings of the 7th Workshop on Asian Translation, 2020, Page(s) 103–108

Idiap NMT System for WAT 2019 Multi-Modal Translation Task

Author(s): Shantipriya Parida, Petr Motlíček, Ondřej Bojar
Published in: Proceedings of the 6th Workshop on Asian Translation, 2019, Page(s) 175-180

Detection of Similar Languages and Dialects Using Deep Supervised Autoencoders

Author(s): Shantipriya Parida; Esau Villatoro-Tello; Sajit Kumar; Mael Fabien; Petr Motlicek
Published in: Proceedings of the 17th International Conference on Natural Language Processing, 2020

OdiEnCorp 2.0: Odia-English Parallel Corpus for Machine Translation

Author(s): Shantipriya Parida, Satya Ranjan Dash, Ondřej Bojar, Petr Motlicek, Priyanka Pattnaik, Debasish Kumar Mallick
Published in: Proceedings of WILDRE5 – 5th Workshop on Indian Language Data: Resources and Evaluation, 2020

On Node Embedding of Uncertain Networks

Author(s): Hoang H. Nguyen, Sergej Zerr, Tuan-Anh Hoang
Published in: 2020 IEEE International Conference on Big Data (Big Data), 2020, Page(s) 5792-5794
DOI: 10.1109/bigdata50022.2020.9378022

Idiap Submission to Swiss-German Language Detection Shared Task

Author(s): Shantipriya Parida, Esaú Villatoro-Tello, Sajit Kumar, Petr Motlicek, Qingran Zhan
Published in: Proceedings of the 5th Swiss Text Analytics Conference (SwissText) & 16th Conference on Natural Language Processing (KONVENS), 2020

Idiap & UAM participation at GermEval 2020: Classification and Regression of Cognitive and Motivational Style from Text

Author(s): Esaú Villatoro-Tello, Shantipriya Parida, Sajit Kumar, Petr Motlicek, Qingran Zhan
Published in: Proceedings of GermEval Task 1 (“Classification and Regression of Cognitive and Motivational Style from Text”), 2020

Idiap and UAM Participation at MEX-A3T Evaluation Campaign

Author(s): Esaú Villatoro-Tello, Gabriela Ramírez-de-la-Rosa, Sajit Kumar, Shantipriya Parida, Petr Motlicek
Published in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020), 2020, Page(s) 252-257

Distant Supervision and Noisy Label Learning for Low Resource Named Entity Recognition: A Study on Hausa and Yorùbá

Author(s): Adelani, David Ifeoluwa; Hedderich, Michael A.; Zhu, Dawei; Berg, Esther van den; Klakow, Dietrich
Published in: ICLR 2020 Workshop, Issue 3, 2020

Incremental Semi-Supervised Learning for Multi-Genre Speech Recognition

Author(s): Banriskhem Khonglah, Srikanth Madikeri, Subhadeep Dey, Herve Bourlard, Petr Motlicek, Jayadev Billa
Published in: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, Page(s) 7419-7423
DOI: 10.1109/icassp40776.2020.9054309

Analysis of the BUT Diarization System for VoxConverse Challenge

Author(s): Landini, Federico; Glembek, Ondřej; Matějka, Pavel; Rohdin, Johan; Burget, Lukáš; Diez, Mireia; Silnova, Anna
Published in: ICASSP 2021, Issue 3, 2021

BertAA: BERT fine-tuning for Authorship Attribution

Author(s): Fabien, Mael; Villatoro-Tello, Esaú; Motlicek, Petr; Parida, Shantipriya
Published in: Proceedings of the 17th International Conference on Natural Language Processing, 2020

Graph2Speak: Improving Speaker Identification using Network Knowledge in Criminal Conversational Data

Author(s): Fabien, Mael; Sarfjoo, Seyyed Saeed; Motlicek, Petr; Madikeri, Srikanth
Published in: ICASSP 2021, Issue 1, 2021

Inferring Highly-dense Representations for Clustering Broadcast Media Content

Author(s): Esaú Villatoro-Tello, Shantipriya Parida, Petr Motlicek, Ondřej Bojar
Published in: Prague Bulletin of Mathematical Linguistics, Issue 115/1, 2020, Page(s) 31-50, ISSN 1804-0462
DOI: 10.14712/00326585.004

Robust link prediction in criminal networks: A case study of the Sicilian Mafia

Author(s): Francesco Calderoni, Salvatore Catanese, Pasquale De Meo, Annamaria Ficara, Giacomo Fiumara
Published in: Expert Systems with Applications, Issue 161, 2020, Page(s) 113666, ISSN 0957-4174
DOI: 10.1016/j.eswa.2020.113666

Establishing phone-pair co-usage by comparing mobility patterns

Author(s): Wauter Bosma, Sander Dalm, Erwin van Eijk, Rachid el Harchaoui, Edwin Rijgersberg, Hannah Tereza Tops, Alle Veenstra, Rolf Ypma
Published in: Science & Justice, Issue 60/2, 2020, Page(s) 180-190, ISSN 1355-0306
DOI: 10.1016/j.scijus.2019.10.005

Experimental Evaluation of Scale, and Patterns of Systematic Inconsistencies in Google Trends Data

Author(s): Philipp Behnen, Rene Kessler, Felix Kruse, Jorge Marx Gómez, Jan Schoenmakers, Sergej Zerr
Published in: ECML PKDD 2020 Workshops - Workshops of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020): SoGood 2020, PDFL 2020, MLCS 2020, NFMCP 2020, DINA 2020, EDML 2020, XKDD 2020 and INRA 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Issue 1323, 2020, Page(s) 374-384
DOI: 10.1007/978-3-030-65965-3_25

Analysing the Noise Model Error for Realistic Noisy Label Data

Author(s): Hedderich, Michael A.; Zhu, Dawei; Klakow, Dietrich
Published in: Issue 1, 2021

BUT System Description for The Third DIHARD Speech Diarization Challenge

Author(s): Federico Landini, Alicia Lozano-Diez, Lukas Burget, Mireia Diez, Anna Silnova, Katerina Zmolikova, Ondrej Glembek, Pavel Matejka, Themos Stafylakis, Niko Brümmer
Published in: Proceedings available at Dihard Challenge Github, 2021

Speech Activity Detection Based on Multilingual Speech Recognition System

Author(s): Sarfjoo, Seyyed Saeed; Madikeri, Srikanth; Motlicek, Petr
Published in: Issue 1, 2021

Datasets

Hindi Visual Genome 1.1

Author(s): Parida, Shantipriya; Bojar, Ondřej
Published in: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)