Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams

Yuan Feng, Yukun Cao, Hairu Wang, Xike Xie, S Kevin Zhou
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:16589-16603, 2025.

Abstract

Sketches, probabilistic structures for estimating item frequencies in infinite data streams with limited space, are widely used across various domains. Recent studies have shifted the focus from handcrafted sketches to neural sketches, leveraging memory-augmented neural networks (MANNs) to enhance the streaming compression capabilities and achieve better space-accuracy trade-offs. However, existing neural sketches struggle to scale across different data domains and space budgets due to inflexible MANN configurations. In this paper, we introduce a scalable MANN architecture that brings to life the Lego sketch, a novel sketch with superior scalability and accuracy. Much like assembling creations with modular Lego bricks, the Lego sketch dynamically coordinates multiple memory bricks to adapt to various space budgets and diverse data domains. Theoretical analysis and empirical studies demonstrate its scalability and superior space-accuracy trade-offs, outperforming existing handcrafted and neural sketches.
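For readers unfamiliar with the term, the following is a minimal sketch of a classic handcrafted sketch, the Count-Min sketch, which illustrates the kind of fixed-space frequency estimation the abstract refers to. It is not the Lego sketch or any neural sketch from the paper. The class name, the `width` and `depth` parameters, and the choice of a salted BLAKE2b hash are assumptions made here for illustration.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch (a handcrafted baseline, not the paper's method)."""

    def __init__(self, width=256, depth=4):
        # Hypothetical defaults: depth rows of width counters each.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One salted hash per row approximates independent hash functions.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item, count=1):
        # Add the count to one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum is the tightest bound.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


# Feed a small stream and query the estimated frequencies.
sketch = CountMinSketch()
for item in ["a"] * 5 + ["b"] * 3 + ["c"]:
    sketch.update(item)

for item in ["a", "b", "c"]:
    print(item, sketch.estimate(item))
```

With non-negative updates, each estimate is never below the true count. Shrinking `width` or `depth` saves space at the cost of larger overestimates, which is the space-accuracy trade-off the abstract mentions.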

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-feng25a,
  title = {Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams},
  author = {Feng, Yuan and Cao, Yukun and Wang, Hairu and Xie, Xike and Zhou, S Kevin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {16589--16603},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/feng25a/feng25a.pdf},
  url = {https://proceedings.mlr.press/v267/feng25a.html},
  abstract = {Sketches, probabilistic structures for estimating item frequencies in infinite data streams with limited space, are widely used across various domains. Recent studies have shifted the focus from handcrafted sketches to neural sketches, leveraging memory-augmented neural networks (MANNs) to enhance the streaming compression capabilities and achieve better space-accuracy trade-offs. However, existing neural sketches struggle to scale across different data domains and space budgets due to inflexible MANN configurations. In this paper, we introduce a scalable MANN architecture that brings to life the Lego sketch, a novel sketch with superior scalability and accuracy. Much like assembling creations with modular Lego bricks, the Lego sketch dynamically coordinates multiple memory bricks to adapt to various space budgets and diverse data domains. Theoretical analysis and empirical studies demonstrate its scalability and superior space-accuracy trade-offs, outperforming existing handcrafted and neural sketches.}
}
Endnote
%0 Conference Paper
%T Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams
%A Yuan Feng
%A Yukun Cao
%A Hairu Wang
%A Xike Xie
%A S Kevin Zhou
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-feng25a
%I PMLR
%P 16589--16603
%U https://proceedings.mlr.press/v267/feng25a.html
%V 267
%X Sketches, probabilistic structures for estimating item frequencies in infinite data streams with limited space, are widely used across various domains. Recent studies have shifted the focus from handcrafted sketches to neural sketches, leveraging memory-augmented neural networks (MANNs) to enhance the streaming compression capabilities and achieve better space-accuracy trade-offs. However, existing neural sketches struggle to scale across different data domains and space budgets due to inflexible MANN configurations. In this paper, we introduce a scalable MANN architecture that brings to life the Lego sketch, a novel sketch with superior scalability and accuracy. Much like assembling creations with modular Lego bricks, the Lego sketch dynamically coordinates multiple memory bricks to adapt to various space budgets and diverse data domains. Theoretical analysis and empirical studies demonstrate its scalability and superior space-accuracy trade-offs, outperforming existing handcrafted and neural sketches.
APA
Feng, Y., Cao, Y., Wang, H., Xie, X. & Zhou, S.K. (2025). Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:16589-16603. Available from https://proceedings.mlr.press/v267/feng25a.html.
