Differentiable Audio Processing & Deep Learning

Bridging classical signal processing with modern machine learning for audio.


Concept

Classical audio signal processing offers transparent, interpretable algorithms — but tuning their parameters to match complex acoustic targets remains an open challenge. Deep learning brings powerful optimization, but often at the cost of interpretability and efficiency.

Our research bridges these worlds through differentiable digital signal processing (DDSP): embedding classical audio structures (filters, delays, feedback networks) into differentiable computation graphs that can be optimized end-to-end with gradient descent. Alongside this, we develop neural network approaches for tasks where traditional methods fall short.
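
As a rough illustration of the idea, the sketch below fits the single coefficient of a one-pole lowpass filter to a target magnitude response with PyTorch autograd. The filter choice, the target coefficient of 0.9, the frequency grid, and the optimizer settings are all assumptions made for this example, not taken from any of the cited papers.

```python
import math
import torch

def one_pole_response(a, omega):
    """Frequency response of y[n] = (1 - a) x[n] + a y[n-1] at frequencies omega."""
    z_inv = torch.exp(-1j * omega)
    return (1 - a) / (1 - a * z_inv)

omega = torch.linspace(0, math.pi, 256)
# Hypothetical target: the magnitude response of the same filter with a = 0.9.
target = one_pole_response(torch.tensor(0.9), omega).abs()

raw = torch.zeros((), requires_grad=True)  # unconstrained parameter
optimizer = torch.optim.Adam([raw], lr=0.05)
for step in range(500):
    a = torch.sigmoid(raw)  # maps to 0 < a < 1, so the filter stays stable
    loss = torch.mean((one_pole_response(a, omega).abs() - target) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

a_fit = torch.sigmoid(raw).item()
print(f"fitted coefficient: {a_fit:.3f}")
```

The sigmoid reparameterization is one simple way to keep a classical structure valid (here, a pole inside the unit circle) while still using an unconstrained optimizer.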


Differentiable Feedback Delay Networks

Making FDN parameters differentiable allows reverberation to be optimized toward target decay, coloration, or perceptual objectives using gradient-based training. We showed that even tiny FDN configurations produce high-quality colorless reverberation when optimized this way[3][10], and developed RIR2FDN[6] for automatically synthesizing FDN configurations that match measured room impulse responses.
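
The following is a minimal sketch of the general principle, not the implementation in diff-fdn-colorless. It samples the transfer function of a small FDN on the unit circle and pushes its magnitude toward 1 by gradient descent. The delay lengths, the attenuation `gamma`, the matrix-exponential parameterization of the orthogonal feedback matrix, and the plain squared-error loss are assumptions chosen for illustration.

```python
import math
import torch

torch.manual_seed(0)
N = 4
delays = torch.tensor([149.0, 211.0, 263.0, 293.0])  # delay lengths in samples (assumed)
gamma = 0.999                                         # per-sample attenuation (assumed)
omega = torch.linspace(0, math.pi, 1024)              # frequency sampling grid

W = torch.randn(N, N, requires_grad=True)  # generator of the feedback matrix
b = torch.randn(N, requires_grad=True)     # input gains
c = torch.randn(N, requires_grad=True)     # output gains

def feedback_matrix(W):
    # The matrix exponential of a skew-symmetric matrix is orthogonal.
    return torch.linalg.matrix_exp(W - W.T)

def fdn_magnitude(W, b, c):
    # H(z) = c^T (D(z)^{-1} - A)^{-1} b with attenuated delays gamma^m z^{-m}.
    A = feedback_matrix(W)
    diag = torch.exp(1j * omega[:, None] * delays[None, :]) / gamma ** delays
    P = torch.diag_embed(diag) - A.to(diag.dtype)             # (K, N, N)
    rhs = b.to(diag.dtype).expand(omega.numel(), N).unsqueeze(-1)
    S = torch.linalg.solve(P, rhs).squeeze(-1)                # (K, N)
    return (c.to(diag.dtype) * S).sum(-1).abs()

optimizer = torch.optim.Adam([W, b, c], lr=0.01)
losses = []
for step in range(200):
    loss = torch.mean((fdn_magnitude(W, b, c) - 1.0) ** 2)  # deviation from a flat response
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

A flatness term alone can favor degenerate solutions, so a practical objective would likely need additional terms, for example one that keeps the feedback matrix dense.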

Code: diff-fdn-colorless — optimize FDN parameters for spectrally flat reverberation via gradient descent.

Demo: Colorless FDN examples — audio comparisons.

Code: rir2fdn — analyze measured RIRs and synthesize matching FDN configurations.

Demo: RIR2FDN project page — listening examples of RIR-to-FDN conversion.


FLAMO: Differentiable Audio Systems Library

FLAMO (Frequency-sampling Library for Audio-Module Optimization)[9] is a PyTorch library for building and optimizing differentiable linear time-invariant audio systems. It provides differentiable gains, filters (biquads, state variable filters, graphic EQs), delays, and transforms that can be chained into complex architectures and trained end-to-end.
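
To illustrate the frequency-sampling idea in plain PyTorch (this sketch deliberately does not use FLAMO's API, and the helper names `delay_response` and `biquad_response` are hypothetical): each module is represented by its frequency response on a uniform grid, a series connection becomes a bin-wise product, and an inverse FFT returns the impulse response.

```python
import math
import torch

def delay_response(m, nfft):
    """Frequency response of an m-sample delay on the rfft grid of size nfft."""
    k = torch.arange(nfft // 2 + 1, dtype=torch.float32)
    return torch.exp(-2j * math.pi * k * m / nfft)

def biquad_response(b, a, nfft):
    """Frequency response of a biquad with numerator b and denominator a."""
    k = torch.arange(nfft // 2 + 1, dtype=torch.float32)
    z_inv = torch.exp(-2j * math.pi * k / nfft)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return num / den

nfft = 1024
gain = torch.tensor(0.5, requires_grad=True)  # a learnable gain
# Series connection: gain -> 10-sample delay -> biquad (coefficients are arbitrary).
H = gain * delay_response(10, nfft) * biquad_response([0.2, 0.4, 0.2], [1.0, -0.5, 0.3], nfft)
h = torch.fft.irfft(H, n=nfft)  # impulse response of the chain
```

Because everything is sampled at `nfft` points, an impulse response longer than `nfft` samples would wrap around in time, so the grid size has to be chosen with the expected decay in mind.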

Documentation · PyPI


Differentiable Active Acoustics

Reverberation enhancement systems form an electro-acoustic feedback loop whose stability is critical. We treat this loop as a differentiable system and optimize stability and performance via gradient descent[5], opening new possibilities for automated active acoustics design.
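
A single-channel toy version of this idea can be sketched as follows. It is not the method of [5]: the "room" is a stand-in decaying-noise impulse response, the loop contains one learnable FIR equalizer, and the loss simply flattens the loop-gain magnitude. The motivation is that keeping the loop-gain magnitude below 1 at every frequency is a sufficient condition for stability, so a flatter loop gain leaves more room to raise the overall level.

```python
import torch

torch.manual_seed(0)
nfft = 512

# Stand-in loudspeaker-to-microphone impulse response (assumed, not measured).
n = torch.arange(nfft // 4, dtype=torch.float32)
room_ir = torch.randn(nfft // 4) * torch.exp(-n / 20.0)
H_room = torch.fft.rfft(room_ir, n=nfft)

# Learnable FIR equalizer in the electronic path, initialized to a unit impulse.
fir = torch.zeros(32)
fir[0] = 1.0
fir.requires_grad_(True)

def loop_magnitude(fir):
    G = torch.fft.rfft(fir, n=nfft)
    return (G * H_room).abs()

def flatness_loss(mag):
    # Scale-invariant deviation of the loop-gain magnitude from its mean.
    return torch.mean((mag / mag.mean() - 1.0) ** 2)

optimizer = torch.optim.Adam([fir], lr=0.01)
losses = []
for step in range(300):
    loss = flatness_loss(loop_magnitude(fir))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

mag = loop_magnitude(fir).detach()
print(f"flatness loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
print(f"peak-to-mean loop magnitude: {(mag.max() / mag.mean()).item():.2f}")
```

A real system is multichannel, so the scalar loop gain here would be replaced by a frequency-dependent matrix.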

Demo: Differentiable active acoustics project page — demonstrations of stability optimization.


Room Impulse Response Completion

Rendering immersive audio in VR and games requires fast RIR generation. DECOR (Deep Exponential Completion Of Room impulse responses)[8] predicts the late reverberation from only the first 50 ms of a measured response. It is an encoder-decoder network that synthesizes the late part as filtered noise shaped by multi-exponential decay envelopes.
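
The signal model behind this kind of synthesis can be sketched with NumPy as below. This only illustrates shaping band-limited noise with multi-exponential envelopes; it contains no network, and the band edges, amplitudes, and decay times are made-up values standing in for whatever a model would predict.

```python
import numpy as np

def synth_late_reverb(fs, duration, bands_hz, amplitudes, t60s, seed=0):
    """Sum of band-limited noise signals, each shaped by a multi-exponential envelope."""
    rng = np.random.default_rng(seed)
    n = int(fs * duration)
    t = np.arange(n) / fs
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    out = np.zeros(n)
    for (lo, hi), band_amps, band_t60s in zip(bands_hz, amplitudes, t60s):
        # Band-limit white noise with a brick-wall mask in the frequency domain.
        spectrum = np.fft.rfft(rng.standard_normal(n))
        spectrum[(freqs < lo) | (freqs >= hi)] = 0.0
        band_noise = np.fft.irfft(spectrum, n=n)
        # Each term drops by 60 dB in amplitude after its T60.
        envelope = np.zeros(n)
        for a, t60 in zip(band_amps, band_t60s):
            envelope += a * np.exp(-np.log(1000.0) * t / t60)
        out += envelope * band_noise
    return out

fs = 16000
late = synth_late_reverb(
    fs, 1.0,
    bands_hz=[(0, 1000), (1000, 8001)],   # hypothetical band edges in Hz
    amplitudes=[[1.0, 0.3], [0.8, 0.1]],  # hypothetical amplitudes per band
    t60s=[[0.4, 0.9], [0.25, 0.6]],       # hypothetical decay times in seconds
)
```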

Demo: RIR completion project page — interactive examples.


Neural Decay Analysis

DecayFitNet[1] is a lightweight neural network that replaces brittle iterative fitting for multi-exponential energy decay estimation. Trained on synthetic data, it provides deterministic inference without manual tuning and has been validated on over 20,000 real acoustic measurements.
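
For context, the quantity being fitted can be sketched with NumPy: the energy decay curve (EDC) obtained by Schroeder backward integration, and a multi-exponential model with a linear term for integrated background noise. The exact parameterization below is an assumption for illustration and may differ from the one used in [1].

```python
import numpy as np

def schroeder_edc(rir):
    """Energy decay curve: energy remaining from each sample to the end."""
    return np.cumsum(rir[::-1] ** 2)[::-1]

def multi_exp_decay(t, amplitudes, t60s, noise_level, length):
    """Assumed model: exponentials that drop 60 dB in energy after each T60,
    plus the backward-integrated energy of stationary noise."""
    decay = np.zeros_like(t)
    for a, t60 in zip(amplitudes, t60s):
        decay += a * np.exp(-np.log(1e6) * t / t60)
    return decay + noise_level * (length - t)

# Synthetic check: a noiseless exponential decay with T60 = 0.5 s.
fs, t60, length = 8000, 0.5, 1.0
t = np.arange(int(fs * length)) / fs
rir = np.exp(-np.log(1000.0) * t / t60)  # amplitude drops 60 dB at t60
edc = schroeder_edc(rir)
edc_db = 10 * np.log10(edc / edc[0])
print(f"EDC at T60/2: {edc_db[int(fs * t60 / 2)]:.2f} dB")
```

Fitting the model parameters to a measured EDC is the step that iterative methods or a network such as DecayFitNet would perform; it is not shown here.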


Physical Modeling with Neural Operators

Fourier neural operators[2] learn to approximate PDE solutions for physical models of musical instruments, enabling real-time sound synthesis that captures the physics of vibrating strings and resonant bodies.
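
The core building block of a Fourier neural operator is a spectral convolution: transform to the frequency domain, apply learned complex weights to the lowest modes, and transform back. The 1D layer below is a generic sketch of that block in PyTorch; the class name, initialization scale, and sizes are illustrative choices and are not taken from [2].

```python
import torch
from torch import nn

class SpectralConv1d(nn.Module):
    """Learned linear map on the lowest `modes` Fourier coefficients."""

    def __init__(self, in_channels, out_channels, modes):
        super().__init__()
        self.out_channels = out_channels
        self.modes = modes
        scale = 1.0 / (in_channels * out_channels)
        self.weights = nn.Parameter(
            scale * torch.randn(in_channels, out_channels, modes, dtype=torch.cfloat)
        )

    def forward(self, x):
        # x: (batch, in_channels, n) with n // 2 + 1 >= modes
        n = x.shape[-1]
        x_ft = torch.fft.rfft(x)
        out_ft = torch.zeros(
            x.shape[0], self.out_channels, n // 2 + 1,
            dtype=torch.cfloat, device=x.device,
        )
        # Mix channels independently for each retained mode; higher modes stay zero.
        out_ft[:, :, : self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[:, :, : self.modes], self.weights
        )
        return torch.fft.irfft(out_ft, n=n)

layer = SpectralConv1d(in_channels=3, out_channels=5, modes=8)
y_coarse = layer(torch.randn(2, 3, 64))   # e.g. a string state on 64 grid points
y_fine = layer(torch.randn(2, 3, 128))    # the same weights applied on a finer grid
```

Since the weights live on Fourier modes rather than on grid points, the same layer can be evaluated at different spatial resolutions, as the two calls above show.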

Demo: FNO for physical modeling — Fourier neural operator examples.


KLANN: Knowledge-Leveraging Audio Networks

KLANN[7] integrates domain knowledge into neural network architectures for audio processing, combining the efficiency of classical signal processing structures with the flexibility of learned parameters.

Demo: KLANN examples — audio processing results.


References

Ref | Year | Authors | Article
[1] | 2022 | G. Götz, S. J. Schlecht & V. Pulkki | DecayFitNet: neural network for energy decay analysis
[2] | 2022 | J. D. Parker, S. J. Schlecht et al. | Physical modeling with Fourier neural operators
[3] | 2023 | G. Dal Santo, K. Prawda et al. | Differentiable feedback delay network for colorless reverberation
[4] | 2023 | L. Luoma, P. Fricker & S. J. Schlecht | Deep learning for loudspeaker digital twin creation
[5] | 2024 | G. M. De Bortoli, G. Dal Santo et al. | Differentiable active acoustics: optimizing stability via gradient descent
[6] | 2024 | G. Dal Santo et al. | RIR2FDN: Improved room impulse response analysis and synthesis
[7] | 2024 | V. Huhtala, L. Juvela & S. J. Schlecht | KLANN: Knowledge-leveraging artificial neural network
[8] | 2025 | J. Lin, G. Götz & S. J. Schlecht | Deep room impulse response completion
[9] | 2025 | G. Dal Santo et al. | FLAMO: Frequency-sampling library for audio-module optimization
[10] | 2025 | G. Dal Santo, K. Prawda et al. | Optimizing tiny colorless feedback delay networks
[11] | 2025 | M. Scerbo, S. J. Schlecht et al. | Modeling feedback delay network output equivalences

Sebastian J. Schlecht
Associate Professor for Signal Processing

I like to research audio and acoustics signal processing with and without ML.