Lempel-Ziv Networks

Rebecca Saul, Mohammad Mahmudul Alam, John Hurwitz, Edward Raff, Tim Oates, James Holt
Proceedings on "I Can't Believe It's Not Better! - Understanding Deep Learning Through Empirical Falsification" at NeurIPS 2022 Workshops, PMLR 187:1-11, 2023.

Abstract

Sequence processing has long been a central area of machine learning research. Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences. Compression-based methods have demonstrated more robustness when processing such sequences; in particular, an approach pairing the Lempel-Ziv Jaccard Distance (LZJD) with the k-Nearest Neighbor algorithm has shown promise on long sequence problems (up to T=200,000,000 steps) involving malware classification. Unfortunately, use of LZJD is limited to discrete domains. To extend the benefits of LZJD to a continuous domain, we investigate the effectiveness of a deep-learning analog of the algorithm, the Lempel-Ziv Network. While we achieve successful proof-of-concept, we are unable to meaningfully improve on the performance of a standard LSTM across a variety of datasets and sequence processing tasks. In addition to presenting this negative result, our work highlights the problem of sub-par baseline tuning in newer research areas.
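
For readers unfamiliar with LZJD, the following minimal Python sketch illustrates the idea the abstract builds on: parse each byte sequence into its set of distinct Lempel-Ziv phrases, then score dissimilarity with the Jaccard distance d(x, y) = 1 - |A ∩ B| / |A ∪ B| over the two phrase sets A and B. The function names below are hypothetical illustrations, not the authors' implementation; the production version of LZJD additionally min-hashes the phrase sets for scalability.

# Illustrative sketch of the Lempel-Ziv Jaccard Distance (LZJD).
# Identifiers are hypothetical, not taken from the paper's code.

def lz_set(data: bytes) -> set:
    """Set of distinct phrases from an LZ78-style parse of the sequence."""
    phrases = set()
    start = 0
    for end in range(1, len(data) + 1):
        candidate = data[start:end]
        if candidate not in phrases:
            phrases.add(candidate)  # novel phrase: record it...
            start = end             # ...and begin the next phrase here
    return phrases

def lzjd(x: bytes, y: bytes) -> float:
    """Jaccard distance between the Lempel-Ziv phrase sets of two sequences."""
    a, b = lz_set(x), lz_set(y)
    return 1.0 - len(a & b) / len(a | b)

# Classification then reduces to k-Nearest Neighbors under this distance:
# label a query sequence with the majority class of its k closest
# training sequences as measured by lzjd.
assert lzjd(b"abababab", b"abababab") == 0.0  # identical sequences
assert lzjd(b"abababab", b"qrstuvwx") == 1.0  # no shared phrases

Note that the phrase-set construction involves a hard, discrete set-membership test; roughly speaking, this is the step a continuous, deep-learning analog such as the Lempel-Ziv Network must soften.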

Cite this Paper

BibTeX
@InProceedings{pmlr-v187-saul23a,
  title     = {Lempel-Ziv Networks},
  author    = {Saul, Rebecca and Alam, Mohammad Mahmudul and Hurwitz, John and Raff, Edward and Oates, Tim and Holt, James},
  booktitle = {Proceedings on "I Can't Believe It's Not Better! - Understanding Deep Learning Through Empirical Falsification" at NeurIPS 2022 Workshops},
  pages     = {1--11},
  year      = {2023},
  editor    = {Antorán, Javier and Blaas, Arno and Feng, Fan and Ghalebikesabi, Sahra and Mason, Ian and Pradier, Melanie F. and Rohde, David and Ruiz, Francisco J. R. and Schein, Aaron},
  volume    = {187},
  series    = {Proceedings of Machine Learning Research},
  month     = {03 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v187/saul23a/saul23a.pdf},
  url       = {https://proceedings.mlr.press/v187/saul23a.html},
  abstract  = {Sequence processing has long been a central area of machine learning research. Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences. Compression-based methods have demonstrated more robustness when processing such sequences; in particular, an approach pairing the Lempel-Ziv Jaccard Distance (LZJD) with the k-Nearest Neighbor algorithm has shown promise on long sequence problems (up to $T=200,000,000$ steps) involving malware classification. Unfortunately, use of LZJD is limited to discrete domains. To extend the benefits of LZJD to a continuous domain, we investigate the effectiveness of a deep-learning analog of the algorithm, the Lempel-Ziv Network. While we achieve successful proof-of-concept, we are unable to meaningfully improve on the performance of a standard LSTM across a variety of datasets and sequence processing tasks. In addition to presenting this negative result, our work highlights the problem of sub-par baseline tuning in newer research areas.}
}
Endnote
%0 Conference Paper
%T Lempel-Ziv Networks
%A Rebecca Saul
%A Mohammad Mahmudul Alam
%A John Hurwitz
%A Edward Raff
%A Tim Oates
%A James Holt
%B Proceedings on "I Can't Believe It's Not Better! - Understanding Deep Learning Through Empirical Falsification" at NeurIPS 2022 Workshops
%C Proceedings of Machine Learning Research
%D 2023
%E Javier Antorán
%E Arno Blaas
%E Fan Feng
%E Sahra Ghalebikesabi
%E Ian Mason
%E Melanie F. Pradier
%E David Rohde
%E Francisco J. R. Ruiz
%E Aaron Schein
%F pmlr-v187-saul23a
%I PMLR
%P 1--11
%U https://proceedings.mlr.press/v187/saul23a.html
%V 187
%X Sequence processing has long been a central area of machine learning research. Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences. Compression-based methods have demonstrated more robustness when processing such sequences; in particular, an approach pairing the Lempel-Ziv Jaccard Distance (LZJD) with the k-Nearest Neighbor algorithm has shown promise on long sequence problems (up to T=200,000,000 steps) involving malware classification. Unfortunately, use of LZJD is limited to discrete domains. To extend the benefits of LZJD to a continuous domain, we investigate the effectiveness of a deep-learning analog of the algorithm, the Lempel-Ziv Network. While we achieve successful proof-of-concept, we are unable to meaningfully improve on the performance of a standard LSTM across a variety of datasets and sequence processing tasks. In addition to presenting this negative result, our work highlights the problem of sub-par baseline tuning in newer research areas.
APA
Saul, R., Alam, M.M., Hurwitz, J., Raff, E., Oates, T. & Holt, J. (2023). Lempel-Ziv Networks. Proceedings on "I Can't Believe It's Not Better! - Understanding Deep Learning Through Empirical Falsification" at NeurIPS 2022 Workshops, in Proceedings of Machine Learning Research 187:1-11. Available from https://proceedings.mlr.press/v187/saul23a.html.
