End-to-End Probabilistic Inference for Nonstationary Audio Analysis

William Wilkinson, Michael Andersen, Joshua D. Reiss, Dan Stowell, Arno Solin
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6776-6785, 2019.

Abstract

A typical audio signal processing pipeline includes multiple disjoint analysis stages, including calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a spectral mixture Gaussian process model with nonstationary priors over the amplitude variance parameters. Further, we formulate this nonlinear model’s state space representation, making it amenable to infinite-horizon Gaussian process regression with approximate inference via expectation propagation, which scales linearly in the number of time steps and quadratically in the state dimensionality. By doing so, we are able to process audio signals with hundreds of thousands of data points. We demonstrate, on various tasks with empirical data, how this inference scheme outperforms more standard techniques that rely on extended Kalman filtering.
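The linear scaling in the number of time steps mentioned above comes from writing the Gaussian process prior in state space form and processing the data sequentially. The sketch below illustrates that idea only in its simplest setting and is not the paper's method: it assumes a scalar Matern-1/2 (Ornstein-Uhlenbeck) prior, a Gaussian likelihood and a regular sampling grid, and runs an ordinary Kalman filter rather than expectation propagation on the nonlinear spectral mixture model. The function name and all parameter names are hypothetical.

```python
import math


def ou_kalman_filter(y, dt, lengthscale, variance, noise_var):
    """Kalman filter for a Matern-1/2 (OU) GP prior with Gaussian observation noise.

    Illustrative assumption: scalar state, regular time step dt.
    Cost is one constant-time update per observation, i.e. linear in len(y).
    """
    a = math.exp(-dt / lengthscale)   # state transition over one time step
    q = variance * (1.0 - a * a)      # process noise that keeps the prior stationary
    mean, var = 0.0, variance         # stationary prior at the first time step
    means, variances = [], []
    for k, obs in enumerate(y):
        if k > 0:
            # Predict: propagate the previous filtered state one step forward.
            mean = a * mean
            var = a * a * var + q
        # Update: condition the predicted state on the new observation.
        gain = var / (var + noise_var)
        mean = mean + gain * (obs - mean)
        var = (1.0 - gain) * var
        means.append(mean)
        variances.append(var)
    return means, variances


# Example call on a short synthetic signal (values chosen arbitrarily).
signal = [math.sin(0.1 * k) for k in range(1000)]
filtered_means, filtered_vars = ou_kalman_filter(
    signal, dt=1.0, lengthscale=20.0, variance=1.0, noise_var=0.1
)
```

With a state of dimension d instead of a scalar, each step becomes a small matrix computation, so the total cost still grows linearly with the number of samples.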

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-wilkinson19a,
  title = {End-to-End Probabilistic Inference for Nonstationary Audio Analysis},
  author = {Wilkinson, William and Andersen, Michael and Reiss, Joshua D. and Stowell, Dan and Solin, Arno},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages = {6776--6785},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/wilkinson19a/wilkinson19a.pdf},
  url = {https://proceedings.mlr.press/v97/wilkinson19a.html},
  abstract = {A typical audio signal processing pipeline includes multiple disjoint analysis stages, including calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a spectral mixture Gaussian process model with nonstationary priors over the amplitude variance parameters. Further, we formulate this nonlinear model’s state space representation, making it amenable to infinite-horizon Gaussian process regression with approximate inference via expectation propagation, which scales linearly in the number of time steps and quadratically in the state dimensionality. By doing so, we are able to process audio signals with hundreds of thousands of data points. We demonstrate, on various tasks with empirical data, how this inference scheme outperforms more standard techniques that rely on extended Kalman filtering.}
}
Endnote
%0 Conference Paper
%T End-to-End Probabilistic Inference for Nonstationary Audio Analysis
%A William Wilkinson
%A Michael Andersen
%A Joshua D. Reiss
%A Dan Stowell
%A Arno Solin
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-wilkinson19a
%I PMLR
%P 6776--6785
%U https://proceedings.mlr.press/v97/wilkinson19a.html
%V 97
%X A typical audio signal processing pipeline includes multiple disjoint analysis stages, including calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a spectral mixture Gaussian process model with nonstationary priors over the amplitude variance parameters. Further, we formulate this nonlinear model’s state space representation, making it amenable to infinite-horizon Gaussian process regression with approximate inference via expectation propagation, which scales linearly in the number of time steps and quadratically in the state dimensionality. By doing so, we are able to process audio signals with hundreds of thousands of data points. We demonstrate, on various tasks with empirical data, how this inference scheme outperforms more standard techniques that rely on extended Kalman filtering.
APA
Wilkinson, W., Andersen, M., Reiss, J. D., Stowell, D., & Solin, A. (2019). End-to-End Probabilistic Inference for Nonstationary Audio Analysis. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:6776-6785. Available from https://proceedings.mlr.press/v97/wilkinson19a.html.
