Robust Data Clustering with Outliers via Transformed Tensor Low-Rank Representation

Tong Wu
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1756-1764, 2024.

Abstract

Recently, tensor low-rank representation (TLRR) has become a popular tool for tensor data recovery and clustering, due to its empirical success and theoretical guarantees. However, existing TLRR methods consider Gaussian or gross sparse noise, inevitably leading to performance degradation when the tensor data are contaminated by outliers or sample-specific corruptions. This paper develops an outlier-robust tensor low-rank representation (OR-TLRR) method that provides outlier detection and tensor data clustering simultaneously based on the t-SVD framework. For tensor observations with arbitrary outlier corruptions, OR-TLRR has a provable performance guarantee for exactly recovering the row space of clean data and detecting outliers under mild conditions. Moreover, an extension of OR-TLRR is proposed to handle the case when parts of the data are missing. Finally, extensive experimental results on synthetic and real data demonstrate the effectiveness of the proposed algorithms. We release our code at https://github.com/twugithub/2024-AISTATS-ORTLRR.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-wu24c,
  title     = {Robust Data Clustering with Outliers via Transformed Tensor Low-Rank Representation},
  author    = {Wu, Tong},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {1756--1764},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/wu24c/wu24c.pdf},
  url       = {https://proceedings.mlr.press/v238/wu24c.html},
  abstract  = {Recently, tensor low-rank representation (TLRR) has become a popular tool for tensor data recovery and clustering, due to its empirical success and theoretical guarantees. However, existing TLRR methods consider Gaussian or gross sparse noise, inevitably leading to performance degradation when the tensor data are contaminated by outliers or sample-specific corruptions. This paper develops an outlier-robust tensor low-rank representation (OR-TLRR) method that provides outlier detection and tensor data clustering simultaneously based on the t-SVD framework. For tensor observations with arbitrary outlier corruptions, OR-TLRR has provable performance guarantee for exactly recovering the row space of clean data and detecting outliers under mild conditions. Moreover, an extension of OR-TLRR is proposed to handle the case when parts of the data are missing. Finally, extensive experimental results on synthetic and real data demonstrate the effectiveness of the proposed algorithms. We release our code at \url{https://github.com/twugithub/2024-AISTATS-ORTLRR}.}
}
Endnote
%0 Conference Paper
%T Robust Data Clustering with Outliers via Transformed Tensor Low-Rank Representation
%A Tong Wu
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-wu24c
%I PMLR
%P 1756--1764
%U https://proceedings.mlr.press/v238/wu24c.html
%V 238
%X Recently, tensor low-rank representation (TLRR) has become a popular tool for tensor data recovery and clustering, due to its empirical success and theoretical guarantees. However, existing TLRR methods consider Gaussian or gross sparse noise, inevitably leading to performance degradation when the tensor data are contaminated by outliers or sample-specific corruptions. This paper develops an outlier-robust tensor low-rank representation (OR-TLRR) method that provides outlier detection and tensor data clustering simultaneously based on the t-SVD framework. For tensor observations with arbitrary outlier corruptions, OR-TLRR has provable performance guarantee for exactly recovering the row space of clean data and detecting outliers under mild conditions. Moreover, an extension of OR-TLRR is proposed to handle the case when parts of the data are missing. Finally, extensive experimental results on synthetic and real data demonstrate the effectiveness of the proposed algorithms. We release our code at https://github.com/twugithub/2024-AISTATS-ORTLRR.
APA
Wu, T. (2024). Robust Data Clustering with Outliers via Transformed Tensor Low-Rank Representation. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1756-1764. Available from https://proceedings.mlr.press/v238/wu24c.html.
