Online Forgetting Process for Linear Regression Models

Yuantong Li, Chi-Hua Wang, Guang Cheng
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:217-225, 2021.

Abstract

Motivated by the EU’s "Right To Be Forgotten" regulation, we initiate a study of statistical data deletion problems in which users’ data are accessible only for a limited period of time. This setting is formulated as an online supervised learning task with a constant memory limit. We propose a deletion-aware algorithm, FIFD-OLS, for the low-dimensional case, and observe a catastrophic rank-swinging phenomenon caused by the data deletion operation, which leads to statistical inefficiency. As a remedy, we propose the FIFD-Adaptive Ridge algorithm with a novel online regularization scheme that effectively offsets the uncertainty introduced by deletion. In theory, we provide cumulative regret upper bounds for both online forgetting algorithms. In experiments, FIFD-Adaptive Ridge outperforms ridge regression with a fixed regularization level, and we hope this work sheds light on extensions to more complex statistical models.
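
For intuition, the sketch below illustrates the kind of first-in-first-deleted (FIFD) setting the abstract describes: observations are kept only within a constant-size memory window, the oldest pair is dropped when a new one arrives, and a ridge estimate is refit on the current window with a data-driven regularization level. This is a minimal illustration under assumptions, not the paper's FIFD-Adaptive Ridge algorithm; in particular, the regularization rule lam_fn (here scaled with the window's average trace) is a hypothetical stand-in for the paper's online regularization scheme.

import numpy as np
from collections import deque

def fifo_ridge_stream(stream, window_size, lam_fn):
    """Sliding-window ("first-in-first-deleted") ridge regression sketch.

    stream      : iterable of (x, y) pairs, x of shape (d,), y a scalar
    window_size : constant memory limit; only the most recent pairs are kept
    lam_fn      : callable mapping the current design matrix to a ridge level,
                  an assumed stand-in for an adaptive regularization rule
    Yields the ridge estimate fitted on the current window at every step.
    """
    window = deque(maxlen=window_size)   # the oldest pair is deleted automatically
    for x, y in stream:
        window.append((np.asarray(x, dtype=float), float(y)))
        X = np.stack([x_i for x_i, _ in window])   # (n, d) with n <= window_size
        Y = np.array([y_i for _, y_i in window])   # (n,)
        lam = lam_fn(X)                            # data-driven ridge level
        d = X.shape[1]
        # Ridge solution (X^T X + lam I)^{-1} X^T Y; lam > 0 keeps the system
        # invertible even when deletions leave the window rank-deficient.
        theta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
        yield theta

# Example: regularization scaled with the window's average trace, purely illustrative.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta_star = np.array([1.0, -2.0, 0.5])
    data = ((x, x @ theta_star + 0.1 * rng.standard_normal())
            for x in rng.standard_normal((200, 3)))
    estimates = list(fifo_ridge_stream(
        data, window_size=20,
        lam_fn=lambda X: 0.1 * np.trace(X.T @ X) / X.shape[0]))
    print(estimates[-1])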

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-li21a,
  title     = {Online Forgetting Process for Linear Regression Models},
  author    = {Li, Yuantong and Wang, Chi-Hua and Cheng, Guang},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {217--225},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/li21a/li21a.pdf},
  url       = {https://proceedings.mlr.press/v130/li21a.html},
  abstract  = {Motivated by the EU’s "Right To Be Forgotten" regulation, we initiate a study of statistical data deletion problems where users’ data are accessible only for a limited period of time. This setting is formulated as an online supervised learning task with \textit{constant memory limit}. We propose a deletion-aware algorithm \texttt{FIFD-OLS} for the low dimensional case, and witness a catastrophic rank swinging phenomenon due to the data deletion operation, which leads to statistical inefficiency. As a remedy, we propose the \texttt{FIFD-Adaptive Ridge} algorithm with a novel online regularization scheme, that effectively offsets the uncertainty from deletion. In theory, we provide the cumulative regret upper bound for both online forgetting algorithms. In the experiment, we showed \texttt{FIFD-Adaptive Ridge} outperforms the ridge regression algorithm with fixed regularization level, and hopefully sheds some light on more complex statistical models.}
}
Endnote
%0 Conference Paper
%T Online Forgetting Process for Linear Regression Models
%A Yuantong Li
%A Chi-Hua Wang
%A Guang Cheng
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-li21a
%I PMLR
%P 217--225
%U https://proceedings.mlr.press/v130/li21a.html
%V 130
%X Motivated by the EU’s "Right To Be Forgotten" regulation, we initiate a study of statistical data deletion problems where users’ data are accessible only for a limited period of time. This setting is formulated as an online supervised learning task with constant memory limit. We propose a deletion-aware algorithm FIFD-OLS for the low dimensional case, and witness a catastrophic rank swinging phenomenon due to the data deletion operation, which leads to statistical inefficiency. As a remedy, we propose the FIFD-Adaptive Ridge algorithm with a novel online regularization scheme, that effectively offsets the uncertainty from deletion. In theory, we provide the cumulative regret upper bound for both online forgetting algorithms. In the experiment, we showed FIFD-Adaptive Ridge outperforms the ridge regression algorithm with fixed regularization level, and hopefully sheds some light on more complex statistical models.
APA
Li, Y., Wang, C., & Cheng, G. (2021). Online Forgetting Process for Linear Regression Models. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:217-225. Available from https://proceedings.mlr.press/v130/li21a.html.
