Complexity-Regularized Tree-Structured Partition for Mutual Information Estimation

Abstract

A new histogram-based mutual information estimator using data-driven tree-structured partitions (TSP) is presented in this paper. The derived TSP is a solution to a complexity regularized empirical information maximization, with the objective of finding a good tradeoff between the known estimation and approximation errors. A distribution-free concentration inequality for this tree-structured learning problem as well as finite sample performance bounds for the proposed histogram-based solution is derived. It is shown that this solution is density-free strongly consistent and that it provides, with an arbitrary high probability, an optimal balance between the mentioned estimation and approximation errors. Finally, for the emblematic scenario of independence, $I(X;Y)=0$,it is shown that the TSP estimate converges to zero with $O(e^{-n^{13}+\log \log n})$.

Avatar
Jorge F. Silva
Professor of Electrical Engineering

My research interests include learning, information theory, estimation and detection, universal source coding, and compressed sensing.