TS-RaMIA: Membership Inference Attacks for Symbolic Music Generation Models
Proceedings of Machine Learning Research, PMLR 303:1-15, 2026.
Abstract
Artists and rights holders are increasingly concerned about the unauthorized use of their copyrighted works to train generative models. We introduce TS-RaMIA, a practical auditing framework that lets creators test whether their symbolic music has been used without authorization. Unlike existing likelihood-based approaches, which are confounded by piece length and density, TS-RaMIA exploits structural tokens (bar lines, positions, and tempo markers) that encode musical phrasing, using sample-level analysis and rigorous debiasing. Our method combines (i) length matching and conditional calibration to remove spurious confounders, (ii) tail-of-top-k aggregation on structural tokens to amplify sparse memorization signals, and (iii) a lightweight meta-attacker that fuses statistical cues via composer-stratified cross-validation. Evaluated on a 67M-parameter REMI Transformer trained on MAESTRO pieces, TS-RaMIA achieves AUC 0.826 and TPR@1%FPR 14.6%, whereas a debiased baseline drops to AUC 0.563. Cross-representation validation on NotaGen (ABC notation) yields comparable performance (AUC 0.73, TPR@1%FPR 8.9%), demonstrating transferability.
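For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how AUC and TPR@1%FPR can be computed from per-piece attack scores. It assumes that a higher score means "more likely a training member"; the function names `auc` and `tpr_at_fpr` and the toy scores are hypothetical illustrations, not taken from the paper or its code.

```python
import numpy as np


def auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member.

    Ties count as one half (the Mann-Whitney formulation of ROC AUC).
    """
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float(np.mean((m > n) + 0.5 * (m == n)))


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01):
    """True-positive rate at a threshold that flags at most `target_fpr`
    of the non-members (a piece is flagged when score > threshold)."""
    m = np.asarray(member_scores, dtype=float)
    n_desc = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]
    # Number of non-members we may flag without exceeding the target FPR.
    k = int(np.floor(target_fpr * len(n_desc)))
    # Using the (k+1)-th largest non-member score as the threshold means
    # at most k non-members lie strictly above it.
    threshold = n_desc[k] if k < len(n_desc) else -np.inf
    return float(np.mean(m > threshold))


# Toy scores (illustrative only, not results from the paper).
members = [0.9, 0.8, 0.4]
nonmembers = [0.7, 0.3, 0.2, 0.1]
print(auc(members, nonmembers))
print(tpr_at_fpr(members, nonmembers, target_fpr=0.01))
```

The low-FPR operating point matters for auditing because a rights holder needs few false accusations; a method can have a respectable AUC and still detect almost nothing at 1% FPR.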