Department of Electronic Systems.

PhD defence by Andreas Jonas Fuglsig

Title: Joint Far- and Near-end Speech and Listening Enhancement with a Minimum Processing Perspective

Aalborg East Campus

Fredrik Bajers Vej 7B3-104,
9220 Aalborg East

  • 17.01.2025, 11:00 - 14:00

  • English

  • On location

Abstract

Speech communication across different environments presents challenges due to background noise affecting both the talker and listener. These disturbances can be highly annoying, leading to a decline in the perceived speech quality. Furthermore, they can also make it difficult for the listener to understand what was said, causing a drop in speech intelligibility. To address disturbances from the talker’s environment (the so-called far-end), far-end speech enhancement algorithms are commonly employed to improve intelligibility and quality by reducing the noise in the recorded signals. Similarly, to overcome disturbances from the listener’s environment (the so-called near-end), near-end listening enhancement algorithms are used to enhance intelligibility and quality by preprocessing signals before playback in the noisy near-end environment.
 
Traditionally, far- and near-end speech and listening enhancement systems have been developed independently, often overlooking the fact that noise can exist simultaneously at both ends. Treating these systems separately can reduce performance through, e.g., conflicting processing goals, or excessive or insufficient processing because each system is unaware of the remaining noise and processing artifacts from the other end. By contrast, joint far- and near-end speech and listening enhancement, which considers both environments and all processing steps simultaneously, can improve performance compared to the classic blind concatenation of far- and near-end systems.
 
While blind and joint far- and near-end speech and listening enhancement algorithms perform well in noisy conditions, they sometimes excessively prioritize noise suppression or the maximization of intelligibility at the near-end. This can lead to excessive processing of the speech signals, which can cause unwanted speech distortions and reduced quality, especially in quieter conditions where intelligibility is already high. A minimum processing approach, however, can help strike a balance between noise reduction, intelligibility improvement, and high quality.
 
In this thesis, we explore joint far- and near-end speech and listening enhancement, along with minimum processing. Our approach aims to enhance intelligibility and quality beyond blind processing while minimizing speech distortions compared to maximum processing. Particularly, we propose a joint far- and near-end speech intelligibility enhancement algorithm based on maximization of a speech intelligibility predictor. Additionally, we propose and study the use of a minimum processing formulation of near-end listening enhancement. Finally, we propose a combined joint far- and near-end minimum processing framework and comprehensively study and explore the effects of minimum versus maximum processing, joint versus blind processing, and their cross-combination.

After the defence, there will be a small reception at Fredrik Bajers Vej 7, A4-106.

Attendees in the defence
Assessment committee
  • Associate Professor Jan Dimon Bendtsen (Chair), Aalborg University, Denmark
  • Associate Professor Richard Christian Hendriks, Delft University of Technology, The Netherlands
  • Assistant Professor Aki Härmä, Maastricht University, The Netherlands
Moderator
  • Professor Søren Bech, Aalborg University, Denmark
PhD Supervisors
  • Professor Zheng-Hua Tan, Aalborg University, Denmark
  • Professor Jan Østergaard, Aalborg University, Denmark
  • Project Engineer Lars Søndergaard Bertelsen, RTX
  • CTO Jens Christian Lindof, RTX