11. ITG Fachtagung Sprachkommunikation
Erlangen, 24-26 September 2014

Program

Talks are presented in lecture hall H12 and posters in R1. A site map and floor plan of the venue can be found here.

Wednesday, 24.09.2014

14:00-15:00

Opening and Welcome

Chairs: Walter Kellermann, Reinhold Häb-Umbach

Welcome address by the Vice President for Research
Prof. Dr.-Ing. Joachim Hornegger

Welcome address by the Vice Dean of the Faculty of Engineering
Prof. Dr.-Ing. Wolfgang Schröder-Preikschat

Back to the Future of Digital Speech Communication
Peter Vary

15:00-15:15

Coffee break

15:15-16:30

Robust Speech Recognition

Chair: Hans-Günter Hirsch

Overview talk
Robust Speech Recognition
Hans-Günter Hirsch, Hochschule Niederrhein

Talks

Spectral Noise Tracking for Improved Nonstationary Noise Robust ASR
Aleksej Chinaev, Marc Püls, Reinhold Häb-Umbach, Universität Paderborn

Semi-Automatic Calibration for Dereverberation by Spectral Subtraction for Continuous Speech Recognition
Korbinian Riedhammer1, Tobias Bocklet1, Juan Rafael Orozco-Arroyave1,2, Elmar Nöth1
1 FAU Erlangen-Nürnberg, 2 Universidad de Antioquia, Medellín, Colombia

Multimodal ASR by Turbo Decoding vs. Feature Concatenation: Where to Perform Information Integration?
Simon Receveur, Robin Weiß, Tim Fingscheidt, Technische Universität Braunschweig

16:30-18:45

Hands-Free Speech Communication

Chairs: Gerald Enzner, Heinrich Löllmann

Overview talk
Trends in Hands-Free Communication
Gerald Enzner, RUB, Heinrich Löllmann, FAU Erlangen-Nürnberg

Posters (17:15 - 18:45)

P1: Effects of Resampling in Acoustic Echo Cancellation With Static Nonlinear Loudspeaker Distortion
Ingo Schalk-Schupp, Friedrich Faubel, Markus Buck, Nuance Communications, Ulm

P2: Combined Nonlinear Echo Cancellation and Residual Echo Suppression
Andreas Schwarz, Christian Hofmann and Walter Kellermann, FAU Erlangen-Nürnberg

P3: Efficient Multi-Channel Acoustic Echo Cancellation Using Constrained Sparse Filter Updates in the Subband Domain
Naveen Kumar Desiraju1, Simon Doclo2, Timo Gerkmann2, Tobias Wolff1
1 Nuance Communications, Ulm
2 Universität Oldenburg

P4: Selflearning Codebook Speech Enhancement
Florian Heese, Christoph Matthias Nelke, Markus Niermann, Peter Vary, RWTH Aachen

P5: An Open Source Corpus and Recording Software for Distant Speech Recognition With the Microsoft Kinect
Dirk Schnelle-Walka, Stephan Radeck-Arneth, Chris Biemann, Stefan Radomski, TU Darmstadt

P6: Dual Microphone Wind Noise Reduction by Exploiting the Complex Coherence
Christoph Matthias Nelke, Peter Vary, RWTH Aachen

P7: A Differential Microphone Array With Input Level Alignment, Directional Equalization and Fast Notch Adaptation for Handsfree Communication
Bernd Geiser1,2, Hauke Krüger1,2, Peter Vary1, Detlef Wiese3
1 RWTH Aachen
2 Javox Solutions GmbH, Aachen
3 Binauric SE, Hallbergmoos

17:15-18:45

Robust Speech Recognition

Chair: Hans-Günter Hirsch

Posters

P8: Recognition of Noisy Speech by Starting the Likelihood Calculation at Voiced Segments
Hans-Günter Hirsch, Frank Kremer, Hochschule Niederrhein

P9: Robust Multimodal Human Machine Interaction Using the Kinect Sensor
Steffen Zeiler, Jan Cwiklak, Dorothea Kolossa, Ruhr-Universität Bochum

P10: Towards a Localised German Automatic Speech Recognition
Michael Stadtschnitzer, Christoph Schmidt, Daniel Stein, Fraunhofer IAIS, Sankt Augustin

P11: Scoring and Re-Ranking of ASR Hypotheses Using Phoneme Error Models
Martin Hacker, Elmar Nöth, FAU Erlangen-Nürnberg

17:15-18:45

Demonstrations: Speech Processing in the Real World

Chairs: Heinrich Löllmann, Andreas Schwarz

Outdoor area

HD Voice Meets Car: A Hands-Free System With Bandwidth Extension and Wideband Echo Cancellation
Marc-André Jung, Patrick Bauer, Johannes Abel, Tim Fingscheidt, TU Braunschweig

Audio lab

Wave-domain Acoustic Echo Cancellation
Christian Hofmann, Michael Bürger, Walter Kellermann, FAU Erlangen-Nürnberg

Demo room 2

Real-Time Listening Enhancement for Mobile Phones
Florian Heese, Peter Vary, RWTH Aachen

19:00-20:00

Meeting of ITG Technical Committees 4.3 and 4.4

Thursday, 25.09.2014

8:30–9:30

IEEE SPS Distinguished Lecturer Talk - German Chapter Signal Processing

Phase: Unexplored Wilderness in Signal Enhancement

Akihiko K. Sugiyama

Moderator: Walter Kellermann

9:30–9:45

Coffee break

9:45–11:00

Spoken Language Understanding and Dialog Systems

Chair: Dietrich Klakow

Talks

A Set of Quantitative User Experience Metrics for Multi-Modal Dialog Systems
Silke Witt, Fluential LLC, Sunnyvale, USA

Modeling Graphical and Speech User Interfaces With Widgets and Spidgets
Dominique Massonie, Christian Hacker, Timo Sowa, Elektrobit Automotive GmbH, Erlangen

The Impact of Word Alignment Accuracy on Audio-Visual Word Prominence Detection
Martin Heckmann, Honda Research, Offenbach (Main), Paschalis Mikias, Dorothea Kolossa, Ruhr-Universität Bochum

A New Evaluation Methodology for Speech Emotion Recognition With Confidence Output
Patrick Meyer, Tim Fingscheidt, TU Braunschweig

11:00–11:15

Coffee break

11:15–12:30

Speech Coding and Enhancement

Chair: Henning Puder

Talks

Impact of Coding Noise on the Convergence of Blind Source Separation
Stefan Meier, Walter Kellermann, FAU Erlangen-Nürnberg

Audio Coding for Beamforming With Distributed Microphones
Matthias Pawig, Peter Vary, RWTH Aachen

Declipping of Speech Signals Using Frequency Selective Extrapolation
Markus Jonscher, Jürgen Seiler, André Kaup, FAU Erlangen-Nürnberg

Scalar Quantization With Optimized Receiver-Sided Adaptive Codebook Reconstruction Levels Controlled by a Predictor
Sai Han, Tim Fingscheidt, TU Braunschweig

On Reverse Waterfilling in Closed-Loop LPC With Noise Shaping
Hauke Krüger, Bernd Geiser, Peter Vary, RWTH Aachen

12:30–13:30

Lunch break

13:30–16:30

Automotive Speech and Audio Processing

Chairs: Tim Fingscheidt, Gerhard Schmidt

Overview talk
An Overview to the Automotive Speech Presentations
Tim Fingscheidt, TU Braunschweig, Gerhard Schmidt, CAU Kiel

Posters (13:45-16:30)

P1: Towards Acoustic Event Detection for Surveillance in Cars
Peter Transfeld, Simon Receveur, Tim Fingscheidt, TU Braunschweig

P2: Improved Performance Measures for Voice Activity Detection
Simon Graf1,2, Tobias Herbig1, Markus Buck1, Gerhard Schmidt2
1 Nuance Communications, Ulm
2 Christian-Albrechts-Universität zu Kiel

P3: Detection of Local Disturbances and Simultaneously Active Speakers for Distributed Speaker Dedicated Microphones in Cars
Timo Matheja, Markus Buck, Nuance Communications, Ulm

P4: Application of Frequency Shifting in In-Car Communication Systems
Jochen Withopf, Sebastian Rohde, Gerhard Schmidt, Christian-Albrechts-Universität zu Kiel

P5: SNR Estimation and Enhancement of Voiced Speech Based on Periodicity Analysis
Zhangli Chen, Volker Hohmann, Universität Oldenburg

P6: Improvement in Listener Comfort Through Noise Shaping Using a Modified Wiener Filter Approach
Vasudev Kandade Rajan1, Christin Baasch1, Mohamed Krini2, Gerhard Schmidt1
1 Christian-Albrechts-Universität zu Kiel
2 paragon AG, Delbrueck

P7: Reduction of Comb-Filter Effects by Alternating Measurement Orientations in Automotive Environments
Wolfgang Hess, Thure Beyer, Fraunhofer IIS, Erlangen, Michael Schöffler, International Audio Labs, Erlangen

13:45–16:30

Speech Coding and Enhancement

Chair: Henning Puder

Posters

P8: Linear Predictive Coding With Backward Adaptation and Noise Shaping
Srikanth Korse, Hauke Krüger, Matthias Pawig, Peter Vary, RWTH Aachen

P9: A Multi-Stage, Multi-Channel Processing System for Overlapping Speech Separation in a Real Scenario
Rahil Mahdian Toroghi, Youssef Oualil, Dietrich Klakow, Universität des Saarlandes

13:45–16:30

Demo Session: Speech Processing in the Real World

Chairs: Heinrich Löllmann, Andreas Schwarz

Audio lab

Advanced Binaural Beamforming System for Hearing Aids
Homayoun Kamkar Parsi, Jens Hain, Henning Puder, Siemens Audiologische Technik, Erlangen

Demo room 1

Instrumental Quality Prediction for Text-To-Speech Systems
Florian Hinterleitner1, Sebastian Möller1, Ulrich Heute2
1 TU Berlin, 2 CAU Kiel

Demo room 2

Novel Features of the Spoken Dialog System Halef
Martin Mory, Markus Gutbrod, Stephan Schünemann, Ahmed Malatawy, Tarek Mehrez, Felix Neutatz, Dennis Schmidt, Moritz Teckenbrock, David Suendermann-Oeft, Duale Hochschule Baden-Württemberg

A Multi-Channel Soundcard as an Acoustic Sensor Node
Jörg Schmalenströer, Jörg Ullmann, Reinhold Häb-Umbach, Universität Paderborn

Online Word Prominence Detection
Martin Heckmann, Honda Research, Offenbach (Main)

Demo room 3

Active Listening Assistant (AcListant)
Youssef Oualil1, Marc Schulder1, Anna Schmidt1, Rahil Mahdian1, Dietrich Klakow1, Oliver Oneser2, Matthias Kleinert2, Heiko Ehr2, Hartmut Helmke2
1 Universität des Saarlandes, 2 Deutsches Luft- und Raumfahrtzentrum Braunschweig (DLR)

Speech Recognition Client for House Automation
Hans-Günter Hirsch, Hochschule Niederrhein

17:30–23:30

Evening event at the DB Museum Nürnberg

Friday, 26.09.2014

8:30–9:30

Automatic Speech Recognition Using Neural Networks

Ralf Schlüter

Moderator: Reinhold Häb-Umbach

9:30–9:45

Coffee break

9:45–11:00

Selected Topics in Speech Processing

Chair: Peter Vary

Talks

Challenges in Acoustic Signal Enhancement for Human-Robot Communication
Heinrich Löllmann, Hendrik Barfuß, Antoine Deleforge, Walter Kellermann, FAU Erlangen-Nürnberg

System Identification With Perfect Sequence Excitation – Efficient NLMS vs. Inverse Cyclic Convolution
Christiane Antweiler, Stefan Kühl, Bastian Sauert, Peter Vary, RWTH Aachen

I-Vector Speaker Verification for Speech Degraded by Narrowband and Wideband Channels
Laura Fernandez Gallardo1,2, Michael Wagner2,3, Sebastian Möller1,2
1 Telekom Innovation Laboratories, TU Berlin
2 University of Canberra, Australia
3 Australian National University, Australia

On Bayesian Networks in Speech Signal Processing
Roland Maas, Christian Hümmer, Christian Hofmann, Walter Kellermann, FAU Erlangen-Nürnberg

11:00–11:15

Coffee break

11:15–12:30

Speech and Audio Perception-Based Models for Quality Evaluation

Chairs: Sebastian Möller, Hans-Wilhelm Gierlich, Ulrich Heute

Talks

Advances in Perceptual Modeling of Speech Quality in Telecommunications
Hans Wilhelm Gierlich, HEAD acoustics, Herzogenrath, Ulrich Heute, Christian-Albrechts-Universität zu Kiel, Sebastian Möller, TU Berlin

Instrumental Evaluation of In-Car Communication Systems
Anne Theiß, Gerhard Schmidt, Jochen Withopf, Christian Lüke, Christian-Albrechts-Universität zu Kiel

Speech Quality of VoIP: Bursty Packet Loss Revisited
Michał Sołoducha, Alexander Raake, TU Berlin

Orthogonal Audio Analyses for Disturbed Radio Broadcast
Jan Reimes, Frank Kettler, Udo Müsch, Marc Lepage, HEAD acoustics, Herzogenrath

New ITG Guideline for the Usability Evaluation of Smart Home Environments
Sebastian Möller, Klaus-Peter Engelbrecht, Stefan Hillmann, Patrick Ehrenbrink, TU Berlin

12:30–13:30

Lunch break

13:30–15:45

Acoustic Sensor Networks

Chairs: Reinhold Häb-Umbach, Simon Doclo

Overview talk

What are Acoustic Sensor Networks?
Reinhold Häb-Umbach, Universität Paderborn, Simon Doclo, Universität Oldenburg

Talks

A Subspace-Based Perspective on Spatial Filtering Performance With Distributed and Co-Located Microphone Arrays
Maja Taseska, Emanuël Habets, International Audio Labs, Erlangen

Generalized Multichannel Wiener Filter for Spatially Distributed Microphones
Toby Christian Lawin-Ore1, Sebastian Stenzel2, Jürgen Freudenberger2, Simon Doclo1
1 Universität Oldenburg
2 HTWG Konstanz

Linear Combining of Audio Features for Signal Classification in Ad-Hoc Microphone Arrays
Sebastian Gergen, Rainer Martin, Ruhr-Universität Bochum

Coordinate Mapping Between an Acoustic and Visual Sensor Network in the Shape Domain for a Joint Self-Calibrating Speaker Tracking
Florian Jacob, Reinhold Häb-Umbach, Universität Paderborn

Posters (14:45 – 15:45)

P1: Detection of Audio Events With Repetitive Structure Using Generalized Autocorrelations
Frank Kurth, Alessia Cornaggia-Urrigshardt, Fraunhofer FKIE, Wachtberg

P2: Time-Frequency Dependent Multichannel Voice Activity Detection
Sebastian Stenzel, Jürgen Freudenberger, HTWG Konstanz

P3: Online Observation Error Model Estimation for Acoustic Sensor Network Synchronization
Jörg Schmalenströer, Weile Zhao, Reinhold Häb-Umbach, Universität Paderborn

14:45–15:45

Demonstrations: Speech Processing in the Real World

Chairs: Heinrich Löllmann, Andreas Schwarz

Audio lab

Personalized Sound Rendering
Michael Bürger, Heinrich Löllmann, Walter Kellermann, FAU Erlangen-Nürnberg

Demo room 1

Upcoming “Enhanced Voice Service” Speech Coding Standard From 3GPP
Tom Bäckström1, Guillaume Fuchs2
1 International Audio Labs Erlangen, FAU Erlangen-Nürnberg, 2 Fraunhofer IIS, Erlangen

Demo room 2

POLQA as an App: Embedded Perceptual Voice Quality Testing on Smartphones
Christian Schmidmer, Roland Bitto, Michael Keyhl, OPTICOM GmbH, Erlangen

15:45–16:00

Closing session

Chairs: Walter Kellermann, Reinhold Häb-Umbach

The program overview can be found here and the conference proceedings here.