Can SGD Select Good Fishermen? Local Convergence under Self-Selection Biases (Extended Abstract)

Alkis Kalavasis; Anay Mehrotra; Felix Zhou

Can SGD Select Good Fishermen? Local Convergence under Self-Selection Biases (Extended Abstract)

Alkis Kalavasis, Anay Mehrotra, Felix Zhou

Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:3904-3905, 2026.

Abstract

We revisit the problem of estimating $k$ linear regressors in $d$ dimensions from samples affected by self-selection bias under the maximum selection rule. Our main result is an algorithm with sample complexity $O(d)\cdot \operatorname{poly}(k,1/\varepsilon)$ and running time $\operatorname{poly}(d,k,1/\varepsilon)+(k\log k)^{O(k)}$ for recovering the regressors up to joint squared error $\varepsilon^2$. The key ingredient is the first local-convergence algorithm for the maximum self-selection model. Our approach reduces self-selection to estimation from coarse observations, where the learner observes only the cell of a partition containing the latent sample. The self-selection reduction induces a structured non-convex partition. We prove that this partition preserves enough information locally and that the resulting negative log-likelihood is locally convex around the true parameters. These two geometric properties allow projected stochastic gradient descent, initialized from an existing warm start, to obtain the stated end-to-end guarantee.

Cite this Paper

BibTeX

@InProceedings{pmlr-v336-kalavasis26a,
  title = 	 {Can SGD Select Good Fishermen? Local Convergence under Self-Selection Biases (Extended Abstract)},
  author =       {Kalavasis, Alkis and Mehrotra, Anay and Zhou, Felix},
  booktitle = 	 {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = 	 {3904--3905},
  year = 	 {2026},
  editor = 	 {Hanneke, Steve and Lattimore, Tor},
  volume = 	 {336},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {29 Jun--03 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v336/main/assets/kalavasis26a/kalavasis26a.pdf},
  url = 	 {https://proceedings.mlr.press/v336/kalavasis26a.html},
  abstract = 	 {We revisit the problem of estimating $k$ linear regressors in $d$ dimensions from samples affected by self-selection bias under the maximum selection rule. Our main result is an algorithm with sample complexity $O(d)\cdot \operatorname{poly}(k,1/\varepsilon)$ and running time $\operatorname{poly}(d,k,1/\varepsilon)+(k\log k)^{O(k)}$ for recovering the regressors up to joint squared error $\varepsilon^2$. The key ingredient is the first local-convergence algorithm for the maximum self-selection model. Our approach reduces self-selection to estimation from coarse observations, where the learner observes only the cell of a partition containing the latent sample. The self-selection reduction induces a structured non-convex partition. We prove that this partition preserves enough information locally and that the resulting negative log-likelihood is locally convex around the true parameters. These two geometric properties allow projected stochastic gradient descent, initialized from an existing warm start, to obtain the stated end-to-end guarantee.}
}

Endnote

%0 Conference Paper
%T Can SGD Select Good Fishermen? Local Convergence under Self-Selection Biases (Extended Abstract)
%A Alkis Kalavasis
%A Anay Mehrotra
%A Felix Zhou
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore	
%F pmlr-v336-kalavasis26a
%I PMLR
%P 3904--3905
%U https://proceedings.mlr.press/v336/kalavasis26a.html
%V 336
%X We revisit the problem of estimating $k$ linear regressors in $d$ dimensions from samples affected by self-selection bias under the maximum selection rule. Our main result is an algorithm with sample complexity $O(d)\cdot \operatorname{poly}(k,1/\varepsilon)$ and running time $\operatorname{poly}(d,k,1/\varepsilon)+(k\log k)^{O(k)}$ for recovering the regressors up to joint squared error $\varepsilon^2$. The key ingredient is the first local-convergence algorithm for the maximum self-selection model. Our approach reduces self-selection to estimation from coarse observations, where the learner observes only the cell of a partition containing the latent sample. The self-selection reduction induces a structured non-convex partition. We prove that this partition preserves enough information locally and that the resulting negative log-likelihood is locally convex around the true parameters. These two geometric properties allow projected stochastic gradient descent, initialized from an existing warm start, to obtain the stated end-to-end guarantee.

APA

Kalavasis, A., Mehrotra, A. & Zhou, F.. (2026). Can SGD Select Good Fishermen? Local Convergence under Self-Selection Biases (Extended Abstract). Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:3904-3905 Available from https://proceedings.mlr.press/v336/kalavasis26a.html.

Related Material

Download PDF