Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion

Yujia Huang, Adishree Ghatare, Yuanzhe Liu, Ziniu Hu, Qinsheng Zhang, Chandramouli Shama Sastry, Siddharth Gururani, Sageev Oore, Yisong Yue
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:19772-19797, 2024.

Abstract

We study the problem of symbolic music generation (e.g., generating piano rolls), with a technical focus on non-differentiable rule guidance. Musical rules are often expressed in symbolic form over note characteristics, such as note density or chord progression, and many of them are non-differentiable, which poses a challenge for guided diffusion. We propose Stochastic Control Guidance (SCG), a novel guidance method that requires only forward evaluations of the rule functions and works with pre-trained diffusion models in a plug-and-play way, thus achieving training-free guidance for non-differentiable rules for the first time. Additionally, we introduce a latent diffusion architecture for symbolic music generation with high time resolution, which can be composed with SCG in a plug-and-play fashion. Compared to standard strong baselines in symbolic music generation, this framework demonstrates marked advancements in music quality and rule-based controllability, outperforming current state-of-the-art generators in a variety of settings. For detailed demonstrations, code, and model checkpoints, please visit our project website.
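The core idea behind SCG, as the abstract describes it, is that guidance needs only forward evaluations of a rule function, not its gradient. A minimal sketch of that sample-and-select idea is below, in plain Python with a toy 2-D piano roll. The names `scg_step` and `note_density` are illustrative, not from the paper, and the perturbation step is a simplified stand-in for the stochastic candidates the actual reverse-diffusion sampler would produce.

```python
import random

def note_density(roll):
    # Non-differentiable rule: fraction of "on" cells in a piano roll,
    # thresholded at 0.5. The threshold kills any useful gradient.
    flat = [v for row in roll for v in row]
    return sum(1 for v in flat if v > 0.5) / len(flat)

def scg_step(x_t, denoise, rule, target, n_candidates=8, rng=None):
    """One guided reverse-diffusion step (sketch of the sample-and-select idea).

    Draws several stochastic candidates for the next latent, evaluates the
    (possibly non-differentiable) rule on each, and keeps the candidate whose
    rule value is closest to the target. Only forward evaluations of `rule`
    are needed; no gradients flow through it.
    """
    rng = rng or random.Random()
    mean = denoise(x_t)  # model's denoised estimate of the clean sample
    candidates = [
        [[v + 0.1 * rng.gauss(0.0, 1.0) for v in row] for row in mean]
        for _ in range(n_candidates)
    ]
    losses = [abs(rule(c) - target) for c in candidates]
    return candidates[min(range(n_candidates), key=losses.__getitem__)]

# Usage: guide an identity "denoiser" toward a target note density.
roll = [[0.0] * 16 for _ in range(8)]
guided = scg_step(roll, lambda x: x, note_density, target=0.1,
                  rng=random.Random(0))
```

Because selection replaces backpropagation, the same loop accepts any black-box rule (chord-progression checkers, density counters) without retraining the diffusion model, which is what makes the guidance plug-and-play.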

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-huang24g,
  title     = {Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion},
  author    = {Huang, Yujia and Ghatare, Adishree and Liu, Yuanzhe and Hu, Ziniu and Zhang, Qinsheng and Shama Sastry, Chandramouli and Gururani, Siddharth and Oore, Sageev and Yue, Yisong},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {19772--19797},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/huang24g/huang24g.pdf},
  url       = {https://proceedings.mlr.press/v235/huang24g.html},
  abstract  = {We study the problem of symbolic music generation (e.g., generating piano rolls), with a technical focus on non-differentiable rule guidance. Musical rules are often expressed in symbolic form on note characteristics, such as note density or chord progression, many of which are non-differentiable which pose a challenge when using them for guided diffusion. We propose Stochastic Control Guidance (SCG), a novel guidance method that only requires forward evaluation of rule functions that can work with pre-trained diffusion models in a plug-and-play way, thus achieving training-free guidance for non-differentiable rules for the first time. Additionally, we introduce a latent diffusion architecture for symbolic music generation with high time resolution, which can be composed with SCG in a plug-and-play fashion. Compared to standard strong baselines in symbolic music generation, this framework demonstrates marked advancements in music quality and rule-based controllability, outperforming current state-of-the-art generators in a variety of settings. For detailed demonstrations, code and model checkpoints, please visit our project website.}
}
Endnote
%0 Conference Paper
%T Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
%A Yujia Huang
%A Adishree Ghatare
%A Yuanzhe Liu
%A Ziniu Hu
%A Qinsheng Zhang
%A Chandramouli Shama Sastry
%A Siddharth Gururani
%A Sageev Oore
%A Yisong Yue
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-huang24g
%I PMLR
%P 19772--19797
%U https://proceedings.mlr.press/v235/huang24g.html
%V 235
%X We study the problem of symbolic music generation (e.g., generating piano rolls), with a technical focus on non-differentiable rule guidance. Musical rules are often expressed in symbolic form on note characteristics, such as note density or chord progression, many of which are non-differentiable which pose a challenge when using them for guided diffusion. We propose Stochastic Control Guidance (SCG), a novel guidance method that only requires forward evaluation of rule functions that can work with pre-trained diffusion models in a plug-and-play way, thus achieving training-free guidance for non-differentiable rules for the first time. Additionally, we introduce a latent diffusion architecture for symbolic music generation with high time resolution, which can be composed with SCG in a plug-and-play fashion. Compared to standard strong baselines in symbolic music generation, this framework demonstrates marked advancements in music quality and rule-based controllability, outperforming current state-of-the-art generators in a variety of settings. For detailed demonstrations, code and model checkpoints, please visit our project website.
APA
Huang, Y., Ghatare, A., Liu, Y., Hu, Z., Zhang, Q., Shama Sastry, C., Gururani, S., Oore, S., & Yue, Y. (2024). Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:19772-19797. Available from https://proceedings.mlr.press/v235/huang24g.html.

Related Material