Complete statistical theory of learning: learning using statistical invariants
Proceedings of the Ninth Symposium on Conformal and Probabilistic Prediction and Applications, PMLR 128:4-40, 2020.
Statistical theory of learning considers methods of constructing approximations that converge to the desired function as the number of observations increases. This theory studies mechanisms that provide convergence in the space of functions in the $L_2$ norm, i.e., it studies the so-called strong mode of convergence. However, in Hilbert space, along with convergence in the space of functions, there also exists the so-called weak mode of convergence, i.e., convergence in the space of functionals. Under some conditions, this weak mode of convergence also implies convergence of the approximations to the desired function in the $L_2$ norm, although such convergence is based on other mechanisms. The paper discusses new learning methods that use both modes of convergence (weak and strong) simultaneously. Such methods allow one to do the following: (1) select an admissible subset of functions (i.e., the set of appropriate approximation functions), and (2) find the desired approximation in this admissible subset. Since only two modes of convergence exist in Hilbert space, we call the theory that uses both modes the complete statistical theory of learning. Along with general reasoning, we describe new learning algorithms referred to as Learning Using Statistical Invariants (LUSI). LUSI algorithms were developed for sets of functions belonging to a Reproducing Kernel Hilbert Space (RKHS); they include a modified SVM method (the LUSI-SVM method). The paper also presents a LUSI modification of Neural Networks (LUSI-NN). LUSI methods require fewer training examples than standard approaches to achieve the same performance. In conclusion, the paper discusses the general (philosophical) framework of a new learning paradigm that includes the concept of intelligence.
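
For orientation, the two modes of convergence named above can be written out using the standard Hilbert-space definitions. The symbols $f_\ell$ (the approximation built from $\ell$ observations), $f_0$ (the desired function), and $\psi$ (a test function) are notation assumed here for illustration and do not appear in the abstract. The last line is only a hedged sketch of what an empirical statistical invariant might look like for chosen functions $\psi_s$; the abstract itself does not state this form.

```latex
% Notation (assumed, not from the abstract): f_\ell is the approximation
% obtained from \ell observations, f_0 is the desired function.
\begin{align*}
% Strong mode: convergence in the L_2 norm.
&\lim_{\ell \to \infty} \| f_\ell - f_0 \|_{L_2} = 0, \\
% Weak mode: convergence of every linear functional (inner product).
&\lim_{\ell \to \infty} \langle f_\ell - f_0, \psi \rangle = 0
  \quad \text{for all } \psi \in L_2, \\
% Strong implies weak, by the Cauchy--Schwarz inequality.
&|\langle f_\ell - f_0, \psi \rangle|
  \le \| f_\ell - f_0 \|_{L_2} \, \| \psi \|_{L_2}, \\
% Illustrative assumption: an empirical invariant for chosen \psi_s,
% given training pairs (x_i, y_i), i = 1, \dots, \ell.
&\frac{1}{\ell} \sum_{i=1}^{\ell} \psi_s(x_i) f(x_i)
  \approx \frac{1}{\ell} \sum_{i=1}^{\ell} \psi_s(x_i) y_i,
  \quad s = 1, \dots, m.
\end{align*}
```

The third line shows why the strong mode is the stronger requirement: norm convergence forces every inner product to converge. The converse does not hold in general in infinite-dimensional spaces, which is why the abstract qualifies the reverse implication with "under some conditions".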