Wukong: Towards a Scaling Law for Large-Scale Recommendation

Buyun Zhang, Liang Luo, Yuxin Chen, Jade Nie, Xi Liu, Shen Li, Yanli Zhao, Yuchen Hao, Yantao Yao, Ellie Dingqiao Wen, Jongsoo Park, Maxim Naumov, Wenlin Chen
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:59421-59434, 2024.

Abstract

Scaling laws play an instrumental role in the sustainable improvement of model quality. Unfortunately, recommendation models to date do not exhibit scaling laws analogous to those observed in the domain of large language models, due to the inefficiencies of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly complex real-world datasets. In this paper, we propose an effective network architecture based purely on stacked factorization machines, and a synergistic upscaling strategy, collectively dubbed Wukong, to establish a scaling law in the domain of recommendation. Wukong’s unique design makes it possible to capture diverse interactions of any order simply through taller and wider layers. We conducted extensive evaluations on six public datasets, and our results demonstrate that Wukong consistently outperforms state-of-the-art models in quality. Further, we assessed Wukong’s scalability on an internal, large-scale dataset. The results show that Wukong retains its superiority in quality over state-of-the-art models while holding the scaling law across two orders of magnitude in model complexity, extending beyond 100 GFLOP/example, where prior art falls short.
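To make the "stacked factorization machines" idea concrete: a factorization-machine block captures second-order feature interactions via pairwise dot products of embedding vectors, and stacking such blocks with residual connections lets depth compound the interaction order. The sketch below is an illustrative NumPy toy under our own assumptions (layer shapes, the projection that compresses the interaction matrix, and the residual wiring are ours, not the paper's exact Wukong design):

```python
import numpy as np

def fm_layer(x, w):
    """One toy stacked-FM block.
    x: (n_emb, dim) embedding matrix; w: (n_emb*n_emb, n_emb*dim) projection.
    Computes all pairwise dot-product interactions, compresses them back to
    embedding shape, and adds a residual so depth compounds interaction order."""
    inter = x @ x.T                   # (n_emb, n_emb) pairwise interactions
    out = inter.reshape(-1) @ w       # compress flattened interaction matrix
    return x + out.reshape(x.shape)   # residual: layer k mixes up to 2^k-order terms

rng = np.random.default_rng(0)
n_emb, dim, depth = 8, 16, 3          # "wider" = larger n_emb/dim, "taller" = larger depth
x = rng.normal(size=(n_emb, dim))
for _ in range(depth):
    w = rng.normal(scale=0.01, size=(n_emb * n_emb, n_emb * dim))
    x = fm_layer(x, w)
print(x.shape)
```

The point of the sketch is only the scaling knob the abstract describes: model capacity grows by making the stack taller (more layers) and wider (more or larger embeddings), rather than by bolting on heterogeneous interaction modules.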

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-zhang24ao,
  title     = {Wukong: Towards a Scaling Law for Large-Scale Recommendation},
  author    = {Zhang, Buyun and Luo, Liang and Chen, Yuxin and Nie, Jade and Liu, Xi and Li, Shen and Zhao, Yanli and Hao, Yuchen and Yao, Yantao and Wen, Ellie Dingqiao and Park, Jongsoo and Naumov, Maxim and Chen, Wenlin},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {59421--59434},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/zhang24ao/zhang24ao.pdf},
  url       = {https://proceedings.mlr.press/v235/zhang24ao.html},
  abstract  = {Scaling laws play an instrumental role in the sustainable improvement in model quality. Unfortunately, recommendation models to date do not exhibit such laws similar to those observed in the domain of large language models, due to the inefficiencies of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly more complex real-world datasets. In this paper, we propose an effective network architecture based purely on stacked factorization machines, and a synergistic upscaling strategy, collectively dubbed Wukong, to establish a scaling law in the domain of recommendation. Wukong’s unique design makes it possible to capture diverse, any-order of interactions simply through taller and wider layers. We conducted extensive evaluations on six public datasets, and our results demonstrate that Wukong consistently outperforms state-of-the-art models quality-wise. Further, we assessed Wukong’s scalability on an internal, large-scale dataset. The results show that Wukong retains its superiority in quality over state-of-the-art models, while holding the scaling law across two orders of magnitude in model complexity, extending beyond 100 GFLOP/example, where prior arts fall short.}
}
Endnote
%0 Conference Paper
%T Wukong: Towards a Scaling Law for Large-Scale Recommendation
%A Buyun Zhang
%A Liang Luo
%A Yuxin Chen
%A Jade Nie
%A Xi Liu
%A Shen Li
%A Yanli Zhao
%A Yuchen Hao
%A Yantao Yao
%A Ellie Dingqiao Wen
%A Jongsoo Park
%A Maxim Naumov
%A Wenlin Chen
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-zhang24ao
%I PMLR
%P 59421--59434
%U https://proceedings.mlr.press/v235/zhang24ao.html
%V 235
%X Scaling laws play an instrumental role in the sustainable improvement in model quality. Unfortunately, recommendation models to date do not exhibit such laws similar to those observed in the domain of large language models, due to the inefficiencies of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly more complex real-world datasets. In this paper, we propose an effective network architecture based purely on stacked factorization machines, and a synergistic upscaling strategy, collectively dubbed Wukong, to establish a scaling law in the domain of recommendation. Wukong’s unique design makes it possible to capture diverse, any-order of interactions simply through taller and wider layers. We conducted extensive evaluations on six public datasets, and our results demonstrate that Wukong consistently outperforms state-of-the-art models quality-wise. Further, we assessed Wukong’s scalability on an internal, large-scale dataset. The results show that Wukong retains its superiority in quality over state-of-the-art models, while holding the scaling law across two orders of magnitude in model complexity, extending beyond 100 GFLOP/example, where prior arts fall short.
APA
Zhang, B., Luo, L., Chen, Y., Nie, J., Liu, X., Li, S., Zhao, Y., Hao, Y., Yao, Y., Wen, E.D., Park, J., Naumov, M. & Chen, W. (2024). Wukong: Towards a Scaling Law for Large-Scale Recommendation. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:59421-59434. Available from https://proceedings.mlr.press/v235/zhang24ao.html.
