Learning Normal Patterns in Musical Loops

Shayan Dadman, Bernt Arild Bremdal, Børre Bang, Rune Dalmo
Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), PMLR 307:86-105, 2026.

Abstract

We propose an unsupervised framework for analyzing audio patterns in musical loops using deep feature extraction and anomaly detection. Unlike prior methods limited by fixed input lengths, handcrafted features, or domain constraints, our approach combines a pre-trained Hierarchical Token-semantic Audio Transformer (HTS-AT) and Feature Fusion Mechanism (FFM) to generate representations from variable-length audio. These embeddings are analyzed by Deep Support Vector Data Description (Deep SVDD), which models normative patterns in a compact latent space. Experiments on bass and guitar datasets show our Deep SVDD models—especially with residual autoencoders—outperform baselines like Isolation Forest and PCA, achieving better anomaly separation. Our work provides a flexible, unsupervised method for effective pattern discovery in diverse audio samples.

Cite this Paper


BibTeX
@InProceedings{pmlr-v307-dadman26a,
  title = {Learning Normal Patterns in Musical Loops},
  author = {Dadman, Shayan and Bremdal, Bernt Arild and Bang, B{\o}rre and Dalmo, Rune},
  booktitle = {Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)},
  pages = {86--105},
  year = {2026},
  editor = {Kim, Hyeongji and Ramírez Rivera, Adín and Ricaud, Benjamin},
  volume = {307},
  series = {Proceedings of Machine Learning Research},
  month = {06--08 Jan},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v307/main/assets/dadman26a/dadman26a.pdf},
  url = {https://proceedings.mlr.press/v307/dadman26a.html},
  abstract = {We propose an unsupervised framework for analyzing audio patterns in musical loops using deep feature extraction and anomaly detection. Unlike prior methods limited by fixed input lengths, handcrafted features, or domain constraints, our approach combines a pre-trained Hierarchical Token-semantic Audio Transformer (HTS-AT) and Feature Fusion Mechanism (FFM) to generate representations from variable-length audio. These embeddings are analyzed by Deep Support Vector Data Description (Deep SVDD), which models normative patterns in a compact latent space. Experiments on bass and guitar datasets show our Deep SVDD models—especially with residual autoencoders—outperform baselines like Isolation Forest and PCA, achieving better anomaly separation. Our work provides a flexible, unsupervised method for effective pattern discovery in diverse audio samples.}
}
Endnote
%0 Conference Paper
%T Learning Normal Patterns in Musical Loops
%A Shayan Dadman
%A Bernt Arild Bremdal
%A Børre Bang
%A Rune Dalmo
%B Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)
%C Proceedings of Machine Learning Research
%D 2026
%E Hyeongji Kim
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v307-dadman26a
%I PMLR
%P 86--105
%U https://proceedings.mlr.press/v307/dadman26a.html
%V 307
%X We propose an unsupervised framework for analyzing audio patterns in musical loops using deep feature extraction and anomaly detection. Unlike prior methods limited by fixed input lengths, handcrafted features, or domain constraints, our approach combines a pre-trained Hierarchical Token-semantic Audio Transformer (HTS-AT) and Feature Fusion Mechanism (FFM) to generate representations from variable-length audio. These embeddings are analyzed by Deep Support Vector Data Description (Deep SVDD), which models normative patterns in a compact latent space. Experiments on bass and guitar datasets show our Deep SVDD models—especially with residual autoencoders—outperform baselines like Isolation Forest and PCA, achieving better anomaly separation. Our work provides a flexible, unsupervised method for effective pattern discovery in diverse audio samples.
APA
Dadman, S., Bremdal, B.A., Bang, B. & Dalmo, R. (2026). Learning Normal Patterns in Musical Loops. Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 307:86-105. Available from https://proceedings.mlr.press/v307/dadman26a.html.
