Hands-free Voice-enabled Interface to Web Applications for Smart Home Environments

Deliverables

Report on recognition evaluation, technologies, tools

Report on the speech recognition engine evaluation results

Report on the final WASN platform, evaluated using the ASR engine

Final report on the wireless acoustic sensor network designed and implemented in LISTEN, including evaluation results based on the speech recognition engine

Publications

Recent Improvements to Neural Network based Acoustic Modeling in the EML Transcription Platform

Author(s): Volker Fischer
Published in: Proc. of DAGA 2016, 42. Jahrestagung für Akustik, 2016

A Robust Voice Activity Detection for Real-Time Automatic Speech Recognition

Author(s): O. Ghahabi, W. Zhou, V. Fischer
Published in: 2018

LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition

Author(s): K. Irie, Z. Tüske, T. Alkhouli, R. Schlüter, and H. Ney
Published in: INTERSPEECH, Issue 2016, 2016, Page(s) 3519-3523
DOI: 10.18154/rwth-conv-209197

Towards online-recognition with deep bidirectional LSTM acoustic models

Author(s): A. Zeyer, R. Schlüter, and H. Ney
Published in: INTERSPEECH, Issue 2016, 2016, Page(s) 3424-3428
DOI: 10.18154/rwth-conv-211067

Comparison of BLSTM-Layer-Specific Affine Transformations for Speaker Adaptation

Author(s): M. Kitza, R. Schlüter, and H. Ney
Published in: Interspeech, Issue 2018, 2018, Page(s) 877-881
DOI: 10.18154/rwth-conv-236793

The RWTH/UPB System Combination for the CHiME 2018 Workshop

Author(s): M. Kitza, W. Michel, C. Boeddeker, J. Heitkaemper, T. Menne, R. Schlüter, H. Ney, J. Schmalenstroeer, L. Drude, J. Heymann, R. Haeb-Umbach
Published in: The 5th International Workshop on Speech Processing in Everyday Environments (CHiME-5), Issue CHiME-5 (2018), 2018, Page(s) 53-57
DOI: 10.18154/rwth-conv-236789

Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition

Author(s): Tobias Menne, Ralf Schlüter, Hermann Ney
Published in: 2018 IEEE Spoken Language Technology Workshop (SLT), Issue 2018, 2018, Page(s) 535-541
DOI: 10.1109/slt.2018.8639547

Acoustic Modeling of Speech Waveform Based on Multi-Resolution, Neural Network Signal Processing

Author(s): Zoltán Tüske, Ralf Schlüter, Hermann Ney
Published in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Issue 2018, 2018, Page(s) 4859-4863
DOI: 10.1109/icassp.2018.8461871

Segmental Encoder-Decoder Models for Large Vocabulary Automatic Speech Recognition

Author(s): Eugen Beck, Mirko Hannemann, Patrick Dötsch, Ralf Schlüter, Hermann Ney
Published in: Interspeech 2018, Issue 2018, 2018, Page(s) 766-770
DOI: 10.21437/interspeech.2018-1212

Sequence Modeling and Alignment for LVCSR-Systems

Author(s): E. Beck, A. Zeyer, P. Doetsch, A. Merboldt, R. Schlüter, and H. Ney
Published in: ITG Conference on Speech Communication (ITG), Issue 2018, 2018

Learning Acoustic Features from the Raw Waveform for Automatic Speech Recognition

Author(s): T. Menne, Z. Tüske, R. Schlüter, and H. Ney
Published in: 44. Jahrestagung für Akustik der Deutschen Gesellschaft für Akustik, Issue 2018, 2018, Page(s) 1533-1536
DOI: 10.18154/rwth-conv-236778

Spatially localized direction of arrival estimation

Author(s): Delikaris-Manias, Symeon; McCormack, Leo; Pavlidi, Despoina; Mouchtaris, Athanasios
Published in: Issue 1, 2018
DOI: 10.5281/zenodo.3006164

Investigation into Joint Optimization of Single Channel Speech Enhancement and Acoustic Modeling for Robust ASR

Author(s): Menne, Tobias; Schlüter, Ralf; Ney, Hermann
Published in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019-05-12 - 2019-05-17, Issue 1, 2019
DOI: 10.18154/RWTH-2019-05286

ADAPTIVE MODELING OF SYNTHETIC NONSTATIONARY SINUSOIDS

Author(s): Caetano, Marcelo; Kafentzis, George; Mouchtaris, Athanasios
Published in: Issue 1, 2015
DOI: 10.5281/zenodo.3006542

Normalization of Partly Overlapping Audio Recordings from the Same Event Based on Relative Signal Powers

Author(s): Nikolaos Stefanakis, Athanasios Mouchtaris
Published in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, Page(s) 3141-3145
DOI: 10.1109/ICASSP.2018.8461919

Prediction of LSTM-RNN Full Context States as a Subtask for N-gram Feedforward Language Models

Author(s): Irie, Kazuki; Lei, Zhihong; Schlüter, Ralf; Ney, Hermann
Published in: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): proceedings, April 15-20, 2018, Calgary Telus Convention Center, Calgary, Alberta, Canada, sponsored by the Institute of Electrical and Electronics Engineers, Signal Processing Society, Issue 7, 2018
DOI: 10.18154/RWTH-CONV-236772

Acoustic Beamforming in Front of a Reflective Plane

Author(s): Nikolaos Stefanakis, Symeon Delikaris-Manias, Athanasios Mouchtaris
Published in: 2018 26th European Signal Processing Conference (EUSIPCO), 2018, Page(s) 26-30
DOI: 10.23919/EUSIPCO.2018.8553103

3D DOA estimation of multiple sound sources based on spatially constrained beamforming driven by intensity vectors

Author(s): Despoina Pavlidi, Symeon Delikaris-Manias, Ville Pulkki, Athanasios Mouchtaris
Published in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, Page(s) 96-100
DOI: 10.1109/ICASSP.2016.7471644

Improving narrowband DOA estimation of sound sources using the complex Watson distribution

Author(s): Alexandridis, Anastasios; Mouchtaris, Athanasios
Published in: EUSIPCO 2016, 2016
DOI: 10.5281/zenodo.161845

Direction of Arrival Estimation in front of a Reflective Plane Using a Circular Microphone Array

Author(s): Stefanakis, N.; Mouchtaris, A.
Published in: EUSIPCO 2016, 2016
DOI: 10.5281/zenodo.161668

3D localization of multiple audio sources utilizing 2D DOA histograms

Author(s): Delikaris-Manias, Symeon; Pavlidi, Despoina; Pulkki, Ville; Mouchtaris, Athanasios
Published in: EUSIPCO 2016, 2016
DOI: 10.5281/zenodo.162131

Development and Evaluation of a Digital MEMS Microphone Array for Spatial Audio

Author(s): Alexandridis, Anastasios; Papadakis, Stefanos; Pavlidi, Despoina; Mouchtaris, Athanasios
Published in: EUSIPCO 2016, 2016
DOI: 10.5281/zenodo.161849

Multiple sound source location estimation and counting in a wireless acoustic sensor network

Author(s): Alexandridis, Anastasios; Mouchtaris, Athanasios
Published in: WASPAA 2015, 2015
DOI: 10.5281/zenodo.161840

DOA estimation with histogram analysis of spatially constrained active intensity vectors

Author(s): Symeon Delikaris-Manias, Despoina Pavlidi, Athanasios Mouchtaris, Ville Pulkki
Published in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, Page(s) 526-530
DOI: 10.1109/ICASSP.2017.7952211

Towards wireless acoustic sensor networks for location estimation and counting of multiple speakers in real-life conditions

Author(s): Anastasios Alexandridis, Nikolaos Stefanakis, Athanasios Mouchtaris
Published in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, Page(s) 6140-6144
DOI: 10.1109/ICASSP.2017.7953336

The RWTH/UPB/FORTH System Combination for the 4th CHiME Challenge Evaluation

Author(s): T. Menne, J. Heymann, A. Alexandridis, K. Irie, A. Zeyer, M. Kitza, P. Golik, L. Drude, R. Schlüter, H. Ney, R. Haeb-Umbach, A. Mouchtaris
Published in: CHiME Workshop, Issue CHiME-4 (2016), 2016
DOI: 10.18154/rwth-conv-211069

Multiple Sound Source Location Estimation in Wireless Acoustic Sensor Networks using DOA estimates: The Data-Association Problem

Author(s): Alexandridis, Anastasios; Mouchtaris, Athanasios
Published in: IEEE Transactions on Audio, Speech, and Language Processing, Issue 1, 2018, ISSN 1558-7916
DOI: 10.5281/zenodo.1117766

Speech Analysis and Synthesis with a Computationally Efficient Adaptive Harmonic Model

Author(s): Morfi, Veronica; Degottex, Gilles; Mouchtaris, Athanasios
Published in: IEEE Transactions on Audio, Speech, and Language Processing, Issue 1, 2015, ISSN 2329-9290
DOI: 10.5281/zenodo.2593232

Perpendicular Cross-Spectra Fusion for Sound Source Localization With a Planar Microphone Array

Author(s): Nikolaos Stefanakis, Despoina Pavlidi, Athanasios Mouchtaris
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Issue 25/9, 2017, Page(s) 1517-1531, ISSN 2329-9290
DOI: 10.1109/TASLP.2017.2718733

Full-Band Quasi-Harmonic Analysis and Synthesis of Musical Instrument Sounds with Adaptive Sinusoids

Author(s): Marcelo Caetano, George Kafentzis, Athanasios Mouchtaris, Yannis Stylianou
Published in: Applied Sciences, Issue 6/5, 2016, Page(s) 127, ISSN 2076-3417
DOI: 10.3390/app6050127