Similarity of Sound

Computational measures of perceptual similarity of sound.


Concept

How do we quantify whether two sounds are “similar”? This question arises everywhere in audio research — from evaluating whether a synthetic reverb matches a measured room, to interpolating between spatial impulse responses, to assessing whether rendering artifacts are perceptible.

Our research develops metrics and methods for comparing sounds and acoustic environments in perceptually meaningful ways, bridging the gap between signal-level differences and what listeners actually hear.


Similarity Metrics for Late Reverberation

We developed and compared computational metrics that capture perceptual similarity between reverberant sound fields[9], going beyond simple energy-based measures to account for spectral, temporal, and spatial structure. These metrics enable objective evaluation of reverberation rendering quality.

Code: similarity-metrics-for-rirs — Python toolkit for computing and comparing similarity metrics for room impulse responses.

Demo: Reverb similarity listening examples — audio comparisons with different metrics.


Optimal Transport for Audio

Optimal transport theory provides a principled mathematical framework for comparing distributions. We apply it to quantify distances between time-frequency representations of audio signals[10] and to interpolate between spatial room impulse responses[6], yielding smooth, perceptually meaningful transitions between acoustic environments.

Code: SRIR Interpolation via Optimal Transport — perceptually informed interpolation of spatial room impulse responses using partial optimal transport.


Source Signal Similarity & Room Perception

How does the source signal itself affect our ability to hear differences between rooms? We investigated how the similarity of source material — speech, music, noise — influences listeners’ ability to distinguish between different positions in a room[8], revealing that source characteristics interact strongly with spatial perception.


Perceptual Roughness

Sparse noise signals used in efficient spatial audio rendering can introduce roughness artifacts. We quantified the perceptual roughness of spatially assigned sparse noise[2][7], establishing the thresholds where rendering artifacts become audible and guiding the design of velvet noise reverberators.

Demo: Velvet noise roughness analysis — frequency-dependent temporal roughness of velvet noise, with listening examples.


Room Acoustic Memory & Room Transitions

How accurately do listeners remember and compare room acoustics? Our experiments reveal the limits of acoustic memory[3][6], directly informing the design of systems that transition between coupled rooms. This includes work on the perceived aperture position during room transitions[4] and the perceptual analysis of directional late reverberation[1].

Data: Room transition datasets — measured acoustic data of transitions between coupled rooms.

Demo: Where you are in a room — interactive exploration of room position perception.


References

YearAuthorsArticle
[1]2021B. Alary, P. Massé, S. J. Schlecht et al.Perceptual analysis of directional late reverberation
[2]2021N. Meyer-Kahlen, S. J. Schlecht & T. LokkiPerceptual roughness of spatially assigned sparse noise for rendering reverberation
[3]2022N. Meyer-Kahlen, S. J. Schlecht & T. LokkiClearly audible room acoustical differences may not reveal where you are in a room
[4]2022T. McKenzie et al.The auditory perceived aperture position of the transition between rooms
[5]2023T. McKenzie et al.Auralization of measured room transitions in VR
[6]2023N. Meyer-Kahlen et al.Interpolation of spatial room impulse responses using partial optimal transport
[7]2023N. Meyer-Kahlen et al.How smooth do you think I am: frequency-dependent temporal roughness of velvet noise
[8]2023T. McKenzie, S. J. Schlecht et al.The role of source signal similarity in distinguishing between different positions in a room
[9]2024G. Dal Santo, N. Meyer-Kahlen et al.Similarity metrics for late reverberation
[10]2024V. Välimäki, K. Prawda & S. J. SchlechtTime-frequency audio similarity using optimal transport

Sebastian J. Schlecht
Sebastian J. Schlecht
Associate Professor for Signal Processing

I like to research audio and acoustics signal processing with and without ML.