Centralised vs decentralised anomaly detection: when local and imbalanced data are beneficial

Mirko Nardi, Lorenzo Valerio, Andrea Passarella
Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 154:7-20, 2021.

Abstract

In this paper, we address the problem of anomaly detection in decentralised settings. We took inspiration from the current edge computing trend, pushing towards the development of decentralised ML algorithms, i.e., the devices that collected or generated data are in charge of collaborating to train the ML models without sharing raw data. The challenges connected to this scenario are (i) data distributions of local datasets might be different, (ii) data is very often unlabelled, and (iii) devices have limited computational resources. We address them by proposing an unsupervised ensemble method for decentralised anomaly detection where the base learners are lightweight autoencoders. We aim to investigate whether an ensemble of lightweight models trained in isolation on non-IID and unlabelled local data can compete with heavier models trained in centralised settings. In a task of multi-category anomaly detection, our results show that our method exploits the data imbalance successfully to make accurate predictions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v154-nardi21a,
  title = {Centralised vs decentralised anomaly detection: when local and imbalanced data are beneficial},
  author = {Nardi, Mirko and Valerio, Lorenzo and Passarella, Andrea},
  booktitle = {Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications},
  pages = {7--20},
  year = {2021},
  editor = {Moniz, Nuno and Branco, Paula and Torgo, Luis and Japkowicz, Nathalie and Woźniak, Michał and Wang, Shuo},
  volume = {154},
  series = {Proceedings of Machine Learning Research},
  month = {17 Sep},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v154/nardi21a/nardi21a.pdf},
  url = {https://proceedings.mlr.press/v154/nardi21a.html},
  abstract = {In this paper, we address the problem of anomaly detection in decentralised settings. We took inspiration from the current edge computing trend, pushing towards the development of decentralised ML algorithms, i.e., the devices that collected or generated data are in charge of collaborating to train the ML models without sharing raw data. The challenges connected to this scenario are (i) data distributions of local datasets might be different, (ii) data is very often unlabelled, and (iii) devices have limited computational resources. We address them by proposing an unsupervised ensemble method for decentralised anomaly detection where the base learners are lightweight autoencoders. We aim to investigate whether an ensemble of lightweight models trained in isolation on non-IID and unlabelled local data can compete with heavier models trained in centralised settings. In a task of multi-category anomaly detection, our results show that our method exploits the data imbalance successfully to make accurate predictions.}
}
Endnote
%0 Conference Paper
%T Centralised vs decentralised anomaly detection: when local and imbalanced data are beneficial
%A Mirko Nardi
%A Lorenzo Valerio
%A Andrea Passarella
%B Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications
%C Proceedings of Machine Learning Research
%D 2021
%E Nuno Moniz
%E Paula Branco
%E Luis Torgo
%E Nathalie Japkowicz
%E Michał Woźniak
%E Shuo Wang
%F pmlr-v154-nardi21a
%I PMLR
%P 7--20
%U https://proceedings.mlr.press/v154/nardi21a.html
%V 154
%X In this paper, we address the problem of anomaly detection in decentralised settings. We took inspiration from the current edge computing trend, pushing towards the development of decentralised ML algorithms, i.e., the devices that collected or generated data are in charge of collaborating to train the ML models without sharing raw data. The challenges connected to this scenario are (i) data distributions of local datasets might be different, (ii) data is very often unlabelled, and (iii) devices have limited computational resources. We address them by proposing an unsupervised ensemble method for decentralised anomaly detection where the base learners are lightweight autoencoders. We aim to investigate whether an ensemble of lightweight models trained in isolation on non-IID and unlabelled local data can compete with heavier models trained in centralised settings. In a task of multi-category anomaly detection, our results show that our method exploits the data imbalance successfully to make accurate predictions.
APA
Nardi, M., Valerio, L. & Passarella, A. (2021). Centralised vs decentralised anomaly detection: when local and imbalanced data are beneficial. Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications, in Proceedings of Machine Learning Research 154:7-20. Available from https://proceedings.mlr.press/v154/nardi21a.html.
