Supervised Contrastive Embeddings for Binaural Source Localization

Duowei Tang, Maja Taseska, Toon van Waterschoot

WASPAA 2019, New Paltz, USA

Abstract

Recent data-driven approaches to binaural source localization learn the non-linear functions that map measured binaural cues to source locations. This is done either by learning a parametric map directly from training data, or by learning a low-dimensional representation (embedding) of the binaural cues that is consistent with the source locations. In this paper, we adopt the second approach and propose a parametric embedding that maps the binaural cues to a low-dimensional space, where localization can be done with nearest-neighbor regression. We implement the embedding using a neural network, optimized to map points that are close in the latent space (the space of source azimuths or elevations) to nearby points in the embedding. We show that the proposed embedding generalizes well to acoustic conditions different from those encountered during training, and provides better results than unsupervised embeddings previously used for localization.
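
To make the two ingredients named above concrete, the following is a minimal NumPy sketch of a generic pairwise contrastive loss and of nearest-neighbor regression in an embedding space. It is an illustration under stated assumptions, not the formulation used in this paper: the function names, the margin value, the binary "similar" pair labels, and the choice of k are all hypothetical, and the neural network that would produce the embeddings is omitted.

```python
import numpy as np


def contrastive_loss(z_a, z_b, similar, margin=1.0):
    """Generic pairwise contrastive loss (assumed form, not taken from the paper).

    z_a, z_b : (n_pairs, dim) arrays of embedded binaural cues.
    similar  : (n_pairs,) array with 1.0 for pairs whose source angles are
               close in the latent space and 0.0 otherwise.
    """
    d = np.linalg.norm(z_a - z_b, axis=1)
    # Similar pairs are pulled together by penalizing their squared distance.
    pull = similar * d ** 2
    # Dissimilar pairs are pushed apart until they are at least `margin` away.
    push = (1.0 - similar) * np.maximum(0.0, margin - d) ** 2
    return 0.5 * np.mean(pull + push)


def knn_regress(train_emb, train_angles, query_emb, k=3):
    """Estimate source angles by averaging the angles of the k nearest
    training points in the embedding space.

    A plain mean ignores angular wrap-around, so this sketch assumes the
    angle range does not cross the 0/360 degree boundary.
    """
    # Pairwise Euclidean distances, shape (n_query, n_train).
    d = np.linalg.norm(query_emb[:, None, :] - train_emb[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return train_angles[nearest].mean(axis=1)
```

In such a setup, the loss would be minimized over the network parameters during training, and `knn_regress` would be applied to the embedded cues of unseen recordings at test time.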