Exploring and measuring non-linear correlations: Copulas, Lightspeed Transportation and Clustering


Gautier Marti, Sébastien Andler, Frank Nielsen, Philippe Donnat
Proceedings of the Time Series Workshop at NIPS 2016, PMLR 55:59-69, 2017.


We propose a methodology to explore and measure the pairwise correlations that exist between variables in a dataset. The methodology leverages copulas to encode the dependence between two variables, state-of-the-art optimal transport to provide a relevant geometry for the copulas, and clustering to summarize the main dependence patterns found between the variables. Some of the cluster centers can be used to parameterize a novel dependence coefficient that can target or forget specific dependence patterns. Finally, we illustrate and benchmark the methodology on several datasets. Code and numerical experiments are available online at https://www.datagrapple.com/Tech for reproducible research.
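
As a rough illustration of the first step only (encoding the dependence between two variables as a copula), the sketch below builds a binned empirical copula from rank-transformed observations. It is not the authors' implementation: the function names `ranks_to_unit` and `empirical_copula_histogram`, the `(rank - 0.5) / n` normalization, and the choice of 4 bins are assumptions made for this example, and the optimal transport and clustering steps are not shown.

```python
def ranks_to_unit(values):
    """Map observations to normalized ranks in (0, 1).

    Hypothetical helper: uses (rank - 0.5) / n so that ranks are centered
    within bins; ties are broken by original order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = (rank - 0.5) / n
    return u


def empirical_copula_histogram(x, y, bins=4):
    """Bin rank-transformed pairs (u, v) into a bins x bins probability grid."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    u, v = ranks_to_unit(x), ranks_to_unit(y)
    n = len(x)
    hist = [[0.0] * bins for _ in range(bins)]
    for ui, vi in zip(u, v):
        # min() guards against an index equal to `bins` from rounding
        i = min(int(ui * bins), bins - 1)
        j = min(int(vi * bins), bins - 1)
        hist[i][j] += 1.0 / n
    return hist


# Assumed toy data: a linear and a cubic (non-linear but increasing) relation.
x = [1, 2, 3, 4, 5, 6, 7, 8]
linear = empirical_copula_histogram(x, [2 * v for v in x])
cubic = empirical_copula_histogram(x, [v ** 3 for v in x])
decreasing = empirical_copula_histogram(x, [-v for v in x])
```

Because a copula is invariant under strictly increasing transformations of each variable, `linear` and `cubic` are the same grid (mass on the diagonal), while `decreasing` places its mass on the anti-diagonal. This is the sense in which copulas capture the dependence pattern rather than the marginal scales.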
