U-statistics on network-structured data with kernels of degree larger than one
Proceedings of the Workshop on Statistically Sound Data Mining at ECML/PKDD, PMLR 47:37-48, 2015.
Most analysis of U-statistics assumes that data points are independent or stationary. However, when we analyze network data, these two assumptions do not hold any more. We first define the problem of weighted U-statistics on networked data by extending previous work. We analyze their variance using Hoeffding’s decomposition and also give exponential concentration inequalities. Two efficiently solvable linear programs are proposed to find estimators with minimum worst-case variance or with tighter concentration inequalities.