Nonlinear Denoising, Linear Demixing

Rainer Kelz, Gerhard Widmer
Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops, PMLR 163:54-58, 2022.

Abstract

We cast the combinatorial problem of polyphonic piano transcription as a two stage process. A nonlinear denoising stage maps spectrogram representations of arbitrary piano music with unknown timbral characteristics onto a canonical spectrogram representation with known timbral characteristics. A subsequent linear demixing stage aims to exploit the knowledge about the canonical timbral characteristics. The idea behind this two stage process is to try to elegantly sidestep any musical bias inherent in the training dataset that is easily picked up by a single stage, nonlinear (neural) transcription system (with large capacity). The two stage process tries not to force the nonlinear system to solve a combinatorial problem, which is more amenable to being solved by a linear decomposition method that has the superposition property. Using the simplest setup we could think of, we obtain (rather mixed (pun intended)) results on a standard polyphonic piano transcription dataset - the two stage process still suffers from generalization problems after the first stage, which the second stage is unable to compensate.
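
To make the second stage concrete, the following is a minimal sketch of linear demixing against a fixed dictionary of note templates. It is not the authors' implementation: the abstract does not name the decomposition method, so nonnegative multiplicative updates with the dictionary held fixed are used here purely as an assumed stand-in. The names `demix`, `W`, `H_true` and all toy dimensions are hypothetical, and the "canonical" spectrogram is simulated with random templates rather than produced by a denoising network.

```python
import numpy as np


def demix(V, W, n_iter=2000, eps=1e-9):
    """Estimate nonnegative activations H with V ~= W @ H, keeping W fixed.

    V: (n_bins, n_frames) nonnegative spectrogram (assumed canonical timbre).
    W: (n_bins, n_notes) nonnegative note templates (assumed known).
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update for the Euclidean cost; H stays nonnegative.
        H *= WtV / (WtW @ H + eps)
    return H


# Toy data (hypothetical): each frame is a superposition of a few note templates.
rng = np.random.default_rng(42)
n_bins, n_notes, n_frames = 64, 8, 20
W = rng.random((n_bins, n_notes))
H_true = (rng.random((n_notes, n_frames)) > 0.7).astype(float)
V = W @ H_true  # superposition: the mixture is the sum of the active templates

H_est = demix(V, W)
rel_err = np.linalg.norm(W @ H_est - V) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because the mixture is a plain sum of templates, the number of simultaneously active notes does not change the form of the problem the demixing step has to solve, which is the superposition property the abstract refers to.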

Cite this Paper


BibTeX
@InProceedings{pmlr-v163-kelz22a,
  title     = {Nonlinear Denoising, Linear Demixing},
  author    = {Kelz, Rainer and Widmer, Gerhard},
  booktitle = {Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops},
  pages     = {54--58},
  year      = {2022},
  editor    = {Pradier, Melanie F. and Schein, Aaron and Hyland, Stephanie and Ruiz, Francisco J. R. and Forde, Jessica Z.},
  volume    = {163},
  series    = {Proceedings of Machine Learning Research},
  month     = {13 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v163/kelz22a/kelz22a.pdf},
  url       = {https://proceedings.mlr.press/v163/kelz22a.html},
  abstract  = {We cast the combinatorial problem of polyphonic piano transcription as a two stage process. A nonlinear denoising stage maps spectrogram representations of arbitrary piano music with unknown timbral characteristics onto a canonical spectrogram representation with known timbral characteristics. A subsequent linear demixing stage aims to exploit the knowledge about the canonical timbral characteristics. The idea behind this two stage process is to try to elegantly sidestep any musical bias inherent in the training dataset that is easily picked up by a single stage, nonlinear (neural) transcription system (with large capacity). The two stage process tries not to force the nonlinear system to solve a combinatorial problem, which is more amenable to being solved by a linear decomposition method that has the superposition property. Using the simplest setup we could think of, we obtain (rather mixed (pun intended)) results on a standard polyphonic piano transcription dataset - the two stage process still suffers from generalization problems after the first stage, which the second stage is unable to compensate.}
}
Endnote
%0 Conference Paper
%T Nonlinear Denoising, Linear Demixing
%A Rainer Kelz
%A Gerhard Widmer
%B Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops
%C Proceedings of Machine Learning Research
%D 2022
%E Melanie F. Pradier
%E Aaron Schein
%E Stephanie Hyland
%E Francisco J. R. Ruiz
%E Jessica Z. Forde
%F pmlr-v163-kelz22a
%I PMLR
%P 54--58
%U https://proceedings.mlr.press/v163/kelz22a.html
%V 163
%X We cast the combinatorial problem of polyphonic piano transcription as a two stage process. A nonlinear denoising stage maps spectrogram representations of arbitrary piano music with unknown timbral characteristics onto a canonical spectrogram representation with known timbral characteristics. A subsequent linear demixing stage aims to exploit the knowledge about the canonical timbral characteristics. The idea behind this two stage process is to try to elegantly sidestep any musical bias inherent in the training dataset that is easily picked up by a single stage, nonlinear (neural) transcription system (with large capacity). The two stage process tries not to force the nonlinear system to solve a combinatorial problem, which is more amenable to being solved by a linear decomposition method that has the superposition property. Using the simplest setup we could think of, we obtain (rather mixed (pun intended)) results on a standard polyphonic piano transcription dataset - the two stage process still suffers from generalization problems after the first stage, which the second stage is unable to compensate.
APA
Kelz, R. & Widmer, G. (2022). Nonlinear Denoising, Linear Demixing. Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops, in Proceedings of Machine Learning Research 163:54-58. Available from https://proceedings.mlr.press/v163/kelz22a.html.
