Correlation and Mutual Information
Mutual information quantifies the dependence between random variables. It is sometimes contrasted with linear correlation, since mutual information also captures nonlinear dependence. In this short note I will discuss the relationship between these quantities in the case of a bivariate Gaussian distribution, and I will explore two implications of that relationship.
As shown below, in the case of $X$ and $Y$ having a bivariate Normal distribution, mutual information is a monotonic transformation of correlation,
$$I(X; Y) = -\frac{1}{2} \log\left(1 - \rho^2\right),$$
where $\rho$ is the correlation between $X$ and $Y$.
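To make the monotonicity concrete, here is a minimal Python sketch (the helper name `gaussian_mi` is my own, not from any library) that evaluates this closed form at a few correlation values: it is zero at $\rho = 0$ and grows as $|\rho|$ increases.

```python
import numpy as np

def gaussian_mi(rho):
    """Mutual information (in nats) of a bivariate Normal with correlation rho."""
    return -0.5 * np.log(1.0 - rho ** 2)

# MI is zero at rho = 0 and increases monotonically in |rho|.
for rho in [0.0, 0.25, 0.5, 0.75, 0.9, 0.99]:
    print(f"rho = {rho:4.2f}  ->  I(X; Y) = {gaussian_mi(rho):.4f} nats")
```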
This relationship has a couple of implications that I would like to highlight. First, it proves that lack of correlation in the bivariate Normal distribution implies independence, i.e. when the correlation between $X$ and $Y$ is zero, the mutual information will also be zero. More interestingly, if you are willing to assume that the marginal distributions of $X$ and $Y$ are Normal but you are not willing to assume joint normality, this result provides a lower bound on the mutual information between $X$ and $Y$. This lower bound follows from the maximum entropy property of the Normal distribution.
Formally, we have that the entropy of a univariate Gaussian random variable, $X \sim \mathcal{N}(\mu, \sigma^2)$, is
$$h(X) = \frac{1}{2} \log\left(2 \pi e \sigma^2\right).$$
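As a quick sanity check (a sketch, with an arbitrary $\sigma$), this closed form matches SciPy's differential entropy for the Normal distribution:

```python
import numpy as np
from scipy.stats import norm

sigma = 2.5  # arbitrary standard deviation
closed_form = 0.5 * np.log(2 * np.pi * np.e * sigma ** 2)
scipy_value = norm(loc=0.0, scale=sigma).entropy()  # differential entropy in nats
print(closed_form, scipy_value)  # the two values agree
```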
For a bivariate Gaussian random variable, $(X, Y) \sim \mathcal{N}(\mu, \Sigma)$,
$$h(X, Y) = \frac{1}{2} \log\left((2 \pi e)^2 |\Sigma|\right),$$
where $\Sigma$ is the covariance matrix of $X$ and $Y$.
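The same kind of check works for the bivariate formula, here with an arbitrary covariance matrix:

```python
import numpy as np
from scipy.stats import multivariate_normal

Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])  # arbitrary covariance matrix for (X, Y)
closed_form = 0.5 * np.log((2 * np.pi * np.e) ** 2 * np.linalg.det(Sigma))
scipy_value = multivariate_normal(mean=[0.0, 0.0], cov=Sigma).entropy()
print(closed_form, scipy_value)  # the two values agree
```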
Then the mutual information is
$$I(X; Y) = h(X) + h(Y) - h(X, Y) = \frac{1}{2} \log\left(2 \pi e \sigma_X^2\right) + \frac{1}{2} \log\left(2 \pi e \sigma_Y^2\right) - \frac{1}{2} \log\left((2 \pi e)^2 |\Sigma|\right) = -\frac{1}{2} \log\left(1 - \rho^2\right),$$
where the last equality uses $|\Sigma| = \sigma_X^2 \sigma_Y^2 (1 - \rho^2)$.
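Putting the pieces together numerically (a sketch with arbitrary choices of $\sigma_X$, $\sigma_Y$, and $\rho$), the entropy decomposition reproduces the closed form above:

```python
import numpy as np

sigma_x, sigma_y, rho = 1.3, 0.8, 0.6  # arbitrary parameters
Sigma = np.array([[sigma_x ** 2,            rho * sigma_x * sigma_y],
                  [rho * sigma_x * sigma_y, sigma_y ** 2]])

h_x  = 0.5 * np.log(2 * np.pi * np.e * sigma_x ** 2)
h_y  = 0.5 * np.log(2 * np.pi * np.e * sigma_y ** 2)
h_xy = 0.5 * np.log((2 * np.pi * np.e) ** 2 * np.linalg.det(Sigma))

mi_from_entropies = h_x + h_y - h_xy
mi_closed_form = -0.5 * np.log(1 - rho ** 2)
print(mi_from_entropies, mi_closed_form)  # both ~ 0.223 nats
```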
The lower bound follows from the following argument. Consider two other random variables, $X_g$ and $Y_g$, that have the same covariance as $X$ and $Y$ but are jointly normally distributed. Note that since we have assumed that the marginal distributions of $X$ and $Y$ are Normal, we have $h(X) = h(X_g)$ and $h(Y) = h(Y_g)$, and by the maximum entropy property of the Normal distribution, we have $h(X, Y) \le h(X_g, Y_g)$. The result is then straightforward:
$$I(X; Y) = h(X) + h(Y) - h(X, Y) \ge h(X_g) + h(Y_g) - h(X_g, Y_g) = -\frac{1}{2} \log\left(1 - \rho^2\right).$$
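To see the lower bound in action, here is a sketch of a toy construction (entirely my own choice of example, with arbitrary parameters $p$ and $\tau$) with Normal marginals that is not jointly Normal: $X \sim \mathcal{N}(0, 1)$ and $Y = WX + \varepsilon$, where $W$ is an independent random sign equal to $+1$ with probability $p$ and $\varepsilon \sim \mathcal{N}(0, \tau^2)$. Numerically integrating the mutual information of this joint distribution shows it sitting above the Gaussian bound computed from its correlation.

```python
import numpy as np
from scipy.stats import norm

# Toy joint distribution with Normal marginals that is NOT jointly Normal:
# X ~ N(0, 1), Y = W*X + eps, with W = +1 w.p. p, W = -1 w.p. 1 - p,
# and eps ~ N(0, tau^2) independent of X. The parameters are arbitrary.
p, tau = 0.9, 0.3

# Grid for numerical integration of the mutual information.
grid = np.linspace(-6, 6, 1201)
dx = grid[1] - grid[0]
X, Y = np.meshgrid(grid, grid, indexing="ij")

f_x = norm.pdf(X)                                # marginal of X: N(0, 1)
f_y = norm.pdf(Y, scale=np.sqrt(1 + tau ** 2))   # marginal of Y: N(0, 1 + tau^2)
f_y_given_x = p * norm.pdf(Y, loc=X, scale=tau) + (1 - p) * norm.pdf(Y, loc=-X, scale=tau)
f_xy = f_x * f_y_given_x                         # joint density

# I(X; Y) = integral of f(x, y) * log( f(x, y) / (f(x) f(y)) ) over the grid.
mask = f_xy > 0
integrand = np.zeros_like(f_xy)
integrand[mask] = f_xy[mask] * (np.log(f_xy[mask]) - np.log(f_x[mask]) - np.log(f_y[mask]))
mi_numeric = integrand.sum() * dx * dx

# Gaussian lower bound from the correlation: rho = (2p - 1) / sqrt(1 + tau^2).
rho = (2 * p - 1) / np.sqrt(1 + tau ** 2)
bound = -0.5 * np.log(1 - rho ** 2)

print(f"numerically integrated MI : {mi_numeric:.3f} nats")
print(f"Gaussian lower bound      : {bound:.3f} nats")  # the MI exceeds the bound
```

Because the marginals here are exactly Normal, the argument above guarantees that the integrated mutual information cannot fall below the bound; the gap reflects the strongly non-Gaussian dependence induced by the random sign.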
Thanks are due to Vikash Mansinghka for suggesting the first part of this exercise.