Balancing Gaussian vectors in high dimension
Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:3455-3486, 2020.
Abstract
Motivated by problems in controlled experiments, we study the discrepancy of random matrices with continuous entries where the number of columns $n$ is much larger than the number of rows $m$. Our first result shows that if $\omega(1) = m = o(n)$, a matrix with i.i.d. standard Gaussian entries has discrepancy $\Theta(\sqrt{n}\,2^{-n/m})$ with high probability. This provides sharp guarantees for Gaussian discrepancy in a regime that had not been considered before in the existing literature. Our results also apply to a more general family of random matrices with continuous i.i.d. entries, assuming that $m = O(n/\log n)$. The proof is non-constructive and is an application of the second moment method. Our second result is algorithmic and applies to random matrices whose entries are i.i.d. and have a Lipschitz density. We present a randomized polynomial-time algorithm that achieves discrepancy $e^{-\Omega(\log^2(n)/m)}$ with high probability, provided that $m = O(\sqrt{\log n})$. In the one-dimensional case, this matches the best known algorithmic guarantees due to Karmarkar–Karp. For higher dimensions $2 \le m = O(\sqrt{\log n})$, this establishes the first efficient algorithm achieving discrepancy smaller than $O(\sqrt{m})$.
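For readers outside the area, the quantity studied is not restated in the abstract; assuming the standard convention in the discrepancy literature, for a matrix $A \in \mathbb{R}^{m \times n}$ whose columns are the vectors to be balanced,
$$\operatorname{disc}(A) \;=\; \min_{x \in \{-1,+1\}^n} \|Ax\|_\infty,$$
so each of the $n$ columns receives a sign and the largest coordinate of the signed column sum is minimized.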
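The abstract compares against the Karmarkar–Karp guarantee in the one-dimensional case ($m = 1$). For reference, here is a minimal Python sketch of that classical differencing heuristic; it is the textbook method being cited, not the paper's higher-dimensional algorithm, and the function name and interface are illustrative.

```python
import heapq

def karmarkar_karp(values):
    """Largest differencing method for one-dimensional number balancing:
    repeatedly replace the two largest magnitudes by their difference.
    Returns the achieved discrepancy |sum of +1 part - sum of -1 part|
    of the implied signing (the signs themselves are not tracked here)."""
    # Signs are free, so only magnitudes matter; use a max-heap via negation.
    heap = [-abs(v) for v in values]
    heapq.heapify(heap)
    while len(heap) > 1:
        a = -heapq.heappop(heap)        # largest remaining magnitude
        b = -heapq.heappop(heap)        # second largest, so a >= b
        heapq.heappush(heap, -(a - b))  # commit them to opposite signs
    return -heap[0] if heap else 0.0
```

On $n$ i.i.d. draws from a smooth density, this heuristic is known to achieve discrepancy $n^{-\Theta(\log n)} = e^{-\Theta(\log^2 n)}$ with high probability, which is the $m = 1$ benchmark the abstract's $e^{-\Omega(\log^2(n)/m)}$ bound matches.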