CORDIS - EU research results

Exchanges for SPEech ReseArch aNd TechnOlogies

CORDIS provides links to the public deliverables and publications of HORIZON framework programme projects.

Links to deliverables and publications from Seventh Framework Programme projects, as well as links to some specific result types such as datasets and software, are retrieved dynamically from OpenAIRE.

Deliverables

Communication and Dissemination Plan

Organise communication activities, workshops, courses and other knowledge-sharing actions. Ensure that the dissemination is compliant with open access requirements.

Guidance for evaluation of explainability in speech

Description of protocols, metrics and scenarios to evaluate explainability of speech algorithms for different speech processing tasks.

Description of explainability for speech

This deliverable will provide a list and description of criteria for explainability in the context of speech processing.

Scientific dissemination guidelines

Identify partners' specificities and needs. Write guidelines for partners to ensure active and coherent communication across countries.

Solutions for corpus augmentation

ESPERANTO partners will study various methods to make use of limited corpora and to extend those corpora artificially or automatically. This deliverable will report the methods and performance of various approaches that can generalize well from only a few examples and can leverage diverse types of knowledge, such as typological information, annotated data from other languages or domains, unlabelled data, or multimodal data.

Guidance for evaluation of human assisted learning

Description of protocols, metrics and scenarios designed to evaluate different modes of human assisted learning (active learning, interactive learning) for different speech tasks.

Project website and visual identity

Setting up a visual identity and a website presenting to the general public the objectives of the project, its partners and the main tools employed. The website will also be a resource for the partners for all that concerns good practices.

Corpora for under-resource task

ESPERANTO will support the collection or extension of several corpora for under-resourced tasks, such as a corpus for pronunciation evaluation and a corpus of pathological speech.

Corpora for under-resourced languages

ESPERANTO will create and extend corpora for under-resourced languages, such as African languages including Ewondo, Féfé and Fufuldé, as well as the Tunisian Arabic dialect.

Methods for explainability by design (SDK)

Implementation of various architectures designed for explainability. This SDK will include modules designed to be shared among speech processing tasks, to highlight the part of the incoming data that leads to the resulting decision and to identify and characterize the bias induced in the system.

SDK for human assisted learning speech processing

Software library released under an open-source licence to tackle the issue of human assisted learning for various speech processing tasks such as speaker diarization, speaker verification, speech translation or speech recognition.

Models that learn from small data

To deal with under-resourced tasks, this deliverable will implement different approaches to low- or zero-resource learning, such as transfer learning and modeling based on information shared between languages or tasks, approaches that do not need annotated data, the use of expert knowledge in empirical systems, and systems using multimodal data for semantic supervision.

Publications

Cross-Corpus Speech Emotion Recognition with HuBERT Self-Supervised Representation

Authors: Miguel Pastor; Dayana Ribas; Alfonso Ortega; Antonio Miguel; Eduardo Lleida
Published in: Proc. IberSPEECH, Issue 12, 2022
Publisher: ISCA
DOI: 10.21437/iberspeech.2022-16

Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer

Authors: Mošner Ladislav, Plchot Oldřich, Burget Lukáš, Černocký Jan
Published in: Proceedings of ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, Page(s) 7982-7986, ISBN 978-1-6654-0540-9
Publisher: IEEE Signal Processing Society
DOI: 10.1109/icassp43922.2022.9747771

A transfer learning based approach for pronunciation scoring

Authors: Sancinetti, Marcelo; Vidal, Jazmin; Bonomi, Cyntia; Ferrer, Luciana
Published in: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Issue 4, 2022, Page(s) 6812-6816, ISBN 978-1-6654-0540-9
Publisher: IEEE
DOI: 10.48550/arxiv.2111.00976

Microphone Array Channel Combination Algorithms for Overlapped Speech Detection

Authors: Theo Mariotte; Anthony Larcher; Silvio Montrésor; Jean-Hugh Thomas
Published in: Interspeech 2022 Human and Humanizing Speech Technology, Issue 12, 2022
Publisher: ISCA
DOI: 10.21437/interspeech.2022-10758

Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems

Authors: Victoria Mingote, Antonio Miguel, Alfonso Ortega, Eduardo Lleida
Published in: Interspeech 2021, 2021, Page(s) 2361-2365
Publisher: ISCA
DOI: 10.21437/interspeech.2021-1085

Extracting Speaker and Emotion Information from Self-Supervised Speech Models via Channel-Wise Correlations

Authors: Themos Stafylakis; Ladislav Mosner; Sofoklis Kakouros; Oldrich Plchot; Lukas Burget; Jan Cernocky
Published in: 2022 IEEE Spoken Language Technology Workshop (SLT), Issue 22, 2023, ISBN 979-8-3503-9690-4
Publisher: IEEE
DOI: 10.1109/slt54892.2023.10023345

Learnable Sparse Filterbank for Speaker Verification

Authors: PENG Junyi, GU Rongzhi, MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan.
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2022, Page(s) 5110-5114, ISSN 1990-9772
Publisher: International Speech Communication Association (ISCA)
DOI: 10.21437/interspeech.2022-11309

On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding

Authors: Gaelle Laperriere; Valentin Pelloin; Mickael Rouvier; Themos Stafylakis; Yannick Esteve
Published in: 2022 IEEE Spoken Language Technology Workshop (SLT), Issue 24, 2023
Publisher: IEEE
DOI: 10.1109/slt54892.2023.10023013

Multisv: Dataset for Far-Field Multi-Channel Speaker Verification

Authors: Mošner Ladislav, Plchot Oldřich, Burget Lukáš, Černocký Jan
Published in: Proceedings of ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, Page(s) 7977-7981, ISBN 978-1-6654-0540-9
Publisher: IEEE Signal Processing Society
DOI: 10.1109/icassp43922.2022.9746833

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings.

Authors: BRUMMER Johan Nikolaas Langenhoven, SWART Albert du Preez, MOŠNER Ladislav, SILNOVA Anna, PLCHOT Oldřich, STAFYLAKIS Themos and BURGET Lukáš.
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2022, Page(s) 1446-1450, ISSN 1990-9772
Publisher: International Speech Communication Association (ISCA)
DOI: 10.21437/interspeech.2022-731

Speaker Embeddings for Diarization of Broadcast Data In The Allies Challenge

Authors: Anthony Larcher; Ambuj Mehrish; Marie Tahon; Sylvain Meignier; Jean Carrive; David Doukhan; Olivier Galibert; Nicholas Evans
Published in: ICASSP, Issue 1, 2021
Publisher: IEEE
DOI: 10.1109/icassp39728.2021.9414215

Semantic Enrichment Towards Efficient Speech Representations

Authors: Gaëlle Laperrière; Ha Nguyen; Sahar Ghannay; Bassam Jabaian; Yannick Estève
Published in: Proc. INTERSPEECH 2023, 2023, Page(s) 705-709, ISSN 1990-9772
Publisher: ISCA
DOI: 10.21437/interspeech.2023-2234

Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing

Authors: Sofoklis Kakouros; Themos Stafylakis; Ladislav Mošner; Lukáš Burget
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Issue 19, 2023, Page(s) 1-5, ISBN 978-1-7281-6327-7
Publisher: IEEE
DOI: 10.1109/icassp49357.2023.10094673

Multi-Channel Speech Separation with Cross-Attention and Beamforming

Authors: Ladislav Mosner, Oldřich Plchot, Junyi Peng, Lukáš Burget, Jan "Honza" Černocký
Published in: Proc. INTERSPEECH 2023, 2023, Page(s) 1693-1697, ISSN 1990-9772
Publisher: International Speech Communication Association
DOI: 10.21437/interspeech.2023-2537

Speaker Embeddings by Modeling Channel-Wise Correlations

Authors: Themos Stafylakis, Johan Rohdin, Lukáš Burget
Published in: Interspeech 2021, 2021, Page(s) 501-505
Publisher: ISCA
DOI: 10.21437/interspeech.2021-1442

A Study on the Use of wav2vec Representations for Multiclass Audio Segmentation

Authors: Pablo Gimeno; Alfonso Ortega; Antonio Miguel; Eduardo Lleida
Published in: Proc. IberSPEECH2022, Issue 8, 2022, Page(s) 56-60
Publisher: ISCA
DOI: 10.21437/iberspeech.2022-12

End-to-End Speech Translation of Arabic to English Broadcast News

Authors: Fethi Bougares; Salim Jouili
Published in: WANLP@ACL 2022: Abu Dhabi, United Arab Emirates, Issue 5, 2022, Page(s) 312–319, ISBN 978-1-959429-27-2
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/2022.wanlp-1.29

A dual task learning approach to fine-tune a multilingual semantic speech encoder for Spoken Language Understanding

Authors: Gaëlle Laperrière, Sahar Ghannay, Bassam Jabaian, Yannick Estève
Published in: Interspeech 2024, 2024, Page(s) 812-816
Publisher: ISCA
DOI: 10.21437/interspeech.2024-1133

Improving Speaker Verification with Self-Pretrained Transformer Models

Authors: Junyi Peng, Oldřich Plchot, Themos Stafylakis, Ladislav Mosner, Lukáš Burget, Jan "Honza" Černocký
Published in: Proc. INTERSPEECH 2023, 2023, Page(s) 5361-5365, ISSN 1990-9772
Publisher: International Speech Communication Association
DOI: 10.21437/interspeech.2023-453

Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries.

Authors: STAFYLAKIS Themos, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna, BURGET Lukáš and ČERNOCKÝ Jan.
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2022, Page(s) 605-609, ISSN 1990-9772
Publisher: International Speech Communication Association (ISCA)
DOI: 10.21437/interspeech.2022-10165

Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus

Authors: Mohd Zulhafiz Rahim; Sarah Samson Juan; Fitri Suraya Mohamad
Published in: 2023 International Conference on Asian Language Processing (IALP), 2023, Page(s) 228-233
Publisher: IEEE
DOI: 10.1109/ialp61005.2023.10337314

Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

Authors: Federico Landini; Mireia Diez; Alicia Lozano-Diez; Lukáš Burget
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Issue 6, 2023, Page(s) 1-5, ISBN 978-1-7281-6327-7
Publisher: IEEE
DOI: 10.1109/icassp49357.2023.10097049

Description and Analysis of ABC Submission to NIST LRE 2022

Authors: Pavel Matejka, Anna Silnova, Josef Slavíček, Ladislav Mosner, Oldřich Plchot, Michal Klčo, Junyi Peng, Themos Stafylakis, Lukáš Burget
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2023, Page(s) 511-515, ISSN 1990-9772
Publisher: International Speech Communication Association
DOI: 10.21437/interspeech.2023-1529

An Attention-Based Backend Allowing Efficient Fine-Tuning of Transformer Models for Speaker Verification

Authors: Junyi Peng; Oldrich Plchot; Themos Stafylakis; Ladislav Mosner; Lukas Burget; Jan Cernocky
Published in: 2022 IEEE Spoken Language Technology Workshop (SLT), Issue 5, 2023, Page(s) 555-562, ISBN 979-8-3503-9690-4
Publisher: IEEE
DOI: 10.1109/slt54892.2023.10022775

Active Correction for Incremental Speaker Diarization of a Collection with Human in the Loop

Authors: Yevhenii Prokopalo; Meysam Shamsi; Loïc Barrault; Sylvain Meignier; Anthony Larcher
Published in: Applied Sciences, Volume 12, Issue 4, 2022, Page(s) 1782, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app12041782

Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021

Authors: Pablo Gimeno, Alfonso Ortega, Antonio Miguel, Eduardo Lleida
Published in: Interspeech 2021, 2021, Page(s) 4359-4363
Publisher: ISCA
DOI: 10.21437/interspeech.2021-309

An Explainable Proxy Model for Multilabel Audio Segmentation

Authors: Mariotte, Théo; Almudévar, Antonio; Tahon, Marie; Ortega, Alfonso
Published in: International Conference on Acoustics Speech and Signal Processing, IEEE, Apr 2024, Seoul (Korea), Issue 5, 2024
Publisher: IEEE Signal Processing Society
DOI: 10.1109/icassp48485.2024.10446648

Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch

Authors: SILNOVA Anna, STAFYLAKIS Themos, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., MATĚJKA Pavel, BURGET Lukáš, GLEMBEK Ondřej and BRUMMER Johan Nikolaas Langenhoven.
Published in: Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022), 2022, Page(s) 9-16
Publisher: International Speech Communication Association (ISCA)
DOI: 10.21437/odyssey.2022-2

Toroidal Probabilistic Spherical Discriminant Analysis

Authors: Anna Silnova; Niko Brümmer; Albert Swart; Lukáš Burget
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Issue 2, 2023, Page(s) 1-5, ISBN 978-1-7281-6327-7
Publisher: IEEE
DOI: 10.1109/icassp49357.2023.10095580

Cross-Lingual Transfer Learning for Low-Resource Speech Translation

Authors: Sameer Khurana, Nauman Dawalatabad, Antoine Laurent, Luis Vicente, Pablo Gimeno, Victoria Mingote, James Glass
Published in: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), 2024, Page(s) 670-674
Publisher: IEEE
DOI: 10.1109/icasspw62465.2024.10626683

Sur la vérification du locuteur à partir de traces d’exécution de modèles acoustiques personnalisés

Authors: Tomashenko, Natalia; Mdhaffar, Salima; Tommasi, Marc; Estève, Yannick; Bonastre, Jean-François
Published in: Journées d'Études sur la Parole - JEP2022, Jun 2022, Île de Noirmoutier, France, Issue 13, 2022
Publisher: ISCA
DOI: 10.21437/jep.2022-91

ON-TRAC Consortium Systems for the IWSLT 2023 Dialectal and Low-resource Speech Translation Tasks

Authors: Antoine Laurent, Souhir Gahbiche, Ha Nguyen, Haroun Elleuch, Fethi Bougares, Antoine Thiol, Hugo Riguidel, Salima Mdhaffar, Gaëlle Laperrière, Lucas Maison, Sameer Khurana, Yannick Estève
Published in: Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), 2023, Page(s) 219–226, ISBN 2-9517408-4-0
Publisher: Association for Computational Linguistics
DOI: 10.18653/v1/2023.iwslt-1.18

Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models

Authors: Santosh Kesiraju; Marek Sarvaš; Tomáš Pavlíček; Cécile Macaire; Alejandro Ciuba
Published in: Proc. INTERSPEECH 2023, Issue 14, 2023, Page(s) 2148-2152, ISSN 1990-9772
Publisher: ISCA
DOI: 10.21437/interspeech.2023-2506

Discriminative Training of VBx Diarization

Authors: Klement, Dominik; Diez, Mireia; Landini, Federico; Burget, Lukáš; Silnova, Anna; Delcroix, Marc; Tawara, Naohiro
Published in: Crossref, Issue 5, 2024
Publisher: IEEE
DOI: 10.1109/icassp48485.2024.10446119

BUT CHiME-7 system description

Authors: Karafiát, Martin; Veselý, Karel; Szöke, Igor; Mošner, Ladislav; Beneš, Karel; Witkowski, Marcin; Barchi, Germán; Pepino, Leonardo
Published in: CHiME-7 proceedings, Issue 8, 2023
Publisher: arXiv
DOI: 10.48550/arxiv.2310.11921

Development of ABC systems for the 2021 edition of NIST Speaker Recognition Evaluation

Authors: ALAM Jahangir, BURGET Lukáš, GLEMBEK Ondřej, MATĚJKA Pavel, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna and STAFYLAKIS Themos et al.
Published in: Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022), 2022, Page(s) 346-353
Publisher: International Speech Communication Association (ISCA)
DOI: 10.21437/odyssey.2022-48

Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters

Authors: Junyi Peng; Themos Stafylakis; Rongzhi Gu; Oldřich Plchot; Ladislav Mošner; Lukáš Burget; Jan Černocký
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Issue 6, 2023, Page(s) 1-5, ISBN 978-1-7281-6327-7
Publisher: IEEE
DOI: 10.1109/icassp49357.2023.10094795

Automatic Speech Interruption Detection: Analysis, Corpus, and System

Authors: Lebourdais, Martin; Tahon, Marie; Laurent, Antoine; Meignier, Sylvain
Published in: Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-Coling 2024), Issue 21, 2024
Publisher: ELRA and ICCL

Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data

Authors: Pablo Gimeno; Victoria Mingote; Alfonso Ortega; Antonio Miguel; Eduardo Lleida
Published in: IEEE Signal Processing Letters, Issue 4, 2021, Page(s) 1135-1139, ISSN 1070-9908
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/lsp.2021.3084501

YembaTones: a syllable-tone annotated dataset for speech recognition and prosodic analysis of the Yemba language.

Authors: Kenfack Jeuguim Marc Sturm; Paulin Melatagia Yonta; Sandembouo Etienne
Published in: Data in Brief, Issue 11, 2023, ISSN 2352-3409
Publisher: Elsevier BV
DOI: 10.1016/j.dib.2023.109860

Towards Lifelong Human Assisted Speaker Diarization

Authors: Meysam Shamsi; Anthony Larcher; Loic Barrault; Sylvain Meignier; Yevheni Prokopalo; Marie Tahon; Ambuj Mehrish; Simon Petitrenaud; Olivier Galibert; Samuel Gaist; André Anjos; Sebastien Marcel; Marta R. Costa-jussà
Published in: Computer Speech & Language, Issue 4, 2023, ISSN 0885-2308
Publisher: Academic Press
DOI: 10.1016/j.csl.2022.101437

aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems

Authors: Victoria Mingote; Antonio Miguel; Dayana Ribas; Alfonso Ortega; Eduardo Lleida
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Issue 25 January 2022, 2022, Page(s) 772-784, ISSN 2329-9290
Publisher: IEEE Advancing Technology for Humanity
DOI: 10.1109/taslp.2022.3145307

Direct Text to Speech Translation System Using Acoustic Units

Authors: Victoria Mingote, Pablo Gimeno, Luis Vicente, Sameer Khurana, Antoine Laurent and Jarod Duret
Published in: IEEE Signal Processing Letters, Volume 30, 2023, Page(s) 1262-1266, ISSN 1070-9908
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/lsp.2023.3313513

An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies

Authors: Eduardo Lleida; Luis Javier Rodriguez-Fuentes; Javier Tejedor; Alfonso Ortega; Antonio Miguel; Virginia Bazán; Carmen Pérez; Alberto de Prada; Mikel Penagarikano; Amparo Varona; Germán Bordel; Doroteo Torre-Toledano; Aitor Álvarez; Haritz Arzelus
Published in: Applied Sciences, Volume 13, Issue 15, 2023, Page(s) 8577, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app13158577

Class token and knowledge distillation for multi-head self-attention speaker verification systems

Authors: Victoria Mingote; Antonio Miguel; Alfonso Ortega; Eduardo Lleida
Published in: Digital Signal Processing, Issue 6, 2023, ISSN 1051-2004
Publisher: Academic Press
DOI: 10.1016/j.dsp.2022.103859

Multimodal Diarization Systems by Training Enrollment Models as Identity Representations

Authors: Victoria Mingote; Ignacio Viñals; Pablo Gimeno; Antonio Miguel; Alfonso Ortega; Eduardo Lleida
Published in: Applied Sciences, Volume 12, Issue 3, 2022, Page(s) 1141, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app12031141

Wiener Filter and Deep Neural Networks: A Well-Balanced Pair for Speech Enhancement

Authors: Dayana Ribas; Antonio Miguel; Alfonso Ortega; Eduardo Lleida
Published in: Applied Sciences, Volume 12, Issue 18, 2022, Page(s) 9000, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app12189000

Automatic Voice Disorder Detection Using Self-Supervised Representations

Authors: Dayana Ribas; Miguel A. Pastor; Antonio Miguel; David Martinez; Alfonso Ortega; Eduardo Lleida
Published in: IEEE Access, Issue 6, 2023, Page(s) 14915-14927, ISSN 2169-3536
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/access.2023.3243986

Cross-Corpus Training Strategy for Speech Emotion Recognition Using Self-Supervised Representations

Authors: Miguel A. Pastor; Dayana Ribas; Alfonso Ortega; Antonio Miguel; Eduardo Lleida
Published in: Applied Sciences, Issue 13, 2023, Page(s) 9062, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app13169062

Unsupervised Adaptation of Deep Speech Activity Detection Models to Unseen Domains

Authors: Pablo Gimeno; Dayana Ribas; Alfonso Ortega; Antonio Miguel; Eduardo Lleida
Published in: Applied Sciences, Issue 3, 2022, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app12041832

The Domain Mismatch Problem in the Broadcast Speaker Attribution Task

Authors: Ignacio Viñals, Alfonso Ortega, Antonio Miguel, Eduardo Lleida
Published in: Applied Sciences, Volume 11, Issue 18, 2021, Page(s) 8521, ISSN 2076-3417
Publisher: MDPI
DOI: 10.3390/app11188521
