Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection

Mohammad Mahmudul Alam, Edward Raff, Stella R Biderman, Tim Oates, James Holt
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4042-4050, 2024.

Abstract

Malware detection is an interesting and valuable domain to work in because it has significant real-world impact and unique machine-learning challenges. We investigate existing long-range techniques and benchmarks and find that they’re not very suitable in this problem area. In this paper, we introduce Holographic Global Convolutional Networks (HGConv) that utilize the properties of Holographic Reduced Representations (HRR) to encode and decode features from sequence elements. Unlike other global convolutional methods, our method does not require any intricate kernel computation or crafted kernel design. HGConv kernels are defined as simple parameters learned through backpropagation. The proposed method has achieved new SOTA results on Microsoft Malware Classification Challenge, Drebin, and EMBER malware benchmarks. With log-linear complexity in sequence length, the empirical results demonstrate substantially faster run-time by HGConv compared to other methods achieving far more efficient scaling even with sequence length $\geq 100,000$.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-mahmudul-alam24a,
  title = {Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection},
  author = {Mahmudul Alam, Mohammad and Raff, Edward and R Biderman, Stella and Oates, Tim and Holt, James},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {4042--4050},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/mahmudul-alam24a/mahmudul-alam24a.pdf},
  url = {https://proceedings.mlr.press/v238/mahmudul-alam24a.html},
  abstract = {Malware detection is an interesting and valuable domain to work in because it has significant real-world impact and unique machine-learning challenges. We investigate existing long-range techniques and benchmarks and find that they’re not very suitable in this problem area. In this paper, we introduce Holographic Global Convolutional Networks (HGConv) that utilize the properties of Holographic Reduced Representations (HRR) to encode and decode features from sequence elements. Unlike other global convolutional methods, our method does not require any intricate kernel computation or crafted kernel design. HGConv kernels are defined as simple parameters learned through backpropagation. The proposed method has achieved new SOTA results on Microsoft Malware Classification Challenge, Drebin, and EMBER malware benchmarks. With log-linear complexity in sequence length, the empirical results demonstrate substantially faster run-time by HGConv compared to other methods achieving far more efficient scaling even with sequence length $\geq 100,000$.}
}
Endnote
%0 Conference Paper
%T Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection
%A Mohammad Mahmudul Alam
%A Edward Raff
%A Stella R Biderman
%A Tim Oates
%A James Holt
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-mahmudul-alam24a
%I PMLR
%P 4042--4050
%U https://proceedings.mlr.press/v238/mahmudul-alam24a.html
%V 238
%X Malware detection is an interesting and valuable domain to work in because it has significant real-world impact and unique machine-learning challenges. We investigate existing long-range techniques and benchmarks and find that they’re not very suitable in this problem area. In this paper, we introduce Holographic Global Convolutional Networks (HGConv) that utilize the properties of Holographic Reduced Representations (HRR) to encode and decode features from sequence elements. Unlike other global convolutional methods, our method does not require any intricate kernel computation or crafted kernel design. HGConv kernels are defined as simple parameters learned through backpropagation. The proposed method has achieved new SOTA results on Microsoft Malware Classification Challenge, Drebin, and EMBER malware benchmarks. With log-linear complexity in sequence length, the empirical results demonstrate substantially faster run-time by HGConv compared to other methods achieving far more efficient scaling even with sequence length $\geq 100,000$.
APA
Mahmudul Alam, M., Raff, E., R Biderman, S., Oates, T. & Holt, J. (2024). Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4042-4050. Available from https://proceedings.mlr.press/v238/mahmudul-alam24a.html.
