Mutual information quantifies the dependence between random variables. It is often contrasted with linear correlation, since mutual information also captures nonlinear dependence. In this short note I discuss the relationship between these two quantities in the case of a bivariate Gaussian distribution, and I explore two implications of that relationship.
As shown below, in the case of $X$ and $Y$ having a bivariate Normal distribution with correlation $\rho$, mutual information is a monotonic transformation of correlation,

$$I(X; Y) = -\frac{1}{2}\ln\left(1 - \rho^2\right).$$
This relationship has two implications that I would like to highlight. First, it proves that lack of correlation in the bivariate Normal distribution implies independence: when the correlation between $X$ and $Y$ is zero, the mutual information is also zero. More interestingly, if you are willing to assume that the marginal distributions of $X$ and $Y$ are Normal but you are not willing to assume joint normality, this result provides a lower bound on the mutual information between $X$ and $Y$. This lower bound follows from the maximum entropy property of the Normal distribution.
Formally, we have that the differential entropy of a univariate Gaussian random variable, $X \sim \mathcal{N}(\mu, \sigma^2)$, is

$$h(X) = \frac{1}{2}\ln\left(2\pi e \sigma^2\right).$$
For a bivariate Gaussian random variable, $(X, Y) \sim \mathcal{N}(\mu, \Sigma)$,

$$h(X, Y) = \frac{1}{2}\ln\left((2\pi e)^2 |\Sigma|\right), \qquad \Sigma = \begin{pmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{XY} & \sigma_Y^2 \end{pmatrix},$$

where $\sigma_{XY} = \rho\,\sigma_X \sigma_Y$ is the covariance of $X$ and $Y$, so that $|\Sigma| = \sigma_X^2 \sigma_Y^2 (1 - \rho^2)$.
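These entropy formulas can be checked against SciPy's built-in entropy methods; a small sketch (the values of $\sigma$, $\rho$, $\sigma_X$, and $\sigma_Y$ below are arbitrary choices for illustration):

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Univariate: h(X) = (1/2) ln(2*pi*e*sigma^2), with sigma = 1.5 (arbitrary)
sigma = 1.5
h_x = 0.5 * np.log(2 * np.pi * np.e * sigma**2)
print(h_x, norm(scale=sigma).entropy())  # the two values agree

# Bivariate: h(X, Y) = (1/2) ln((2*pi*e)^2 |Sigma|),
# with rho = 0.6, sigma_X = 1.5, sigma_Y = 2.0 (arbitrary)
rho, sx, sy = 0.6, 1.5, 2.0
cov = np.array([[sx**2, rho * sx * sy],
                [rho * sx * sy, sy**2]])
h_xy = 0.5 * np.log((2 * np.pi * np.e) ** 2 * np.linalg.det(cov))
print(h_xy, multivariate_normal(cov=cov).entropy())  # again agree
```

Note that both entropies are measured in nats, since the formulas use the natural logarithm.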
Then the mutual information,

$$
\begin{aligned}
I(X; Y) &= h(X) + h(Y) - h(X, Y) \\
&= \frac{1}{2}\ln\left(2\pi e \sigma_X^2\right) + \frac{1}{2}\ln\left(2\pi e \sigma_Y^2\right) - \frac{1}{2}\ln\left((2\pi e)^2 \sigma_X^2 \sigma_Y^2 (1 - \rho^2)\right) \\
&= -\frac{1}{2}\ln\left(1 - \rho^2\right).
\end{aligned}
$$
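The closed form can also be verified by Monte Carlo: since $I(X;Y) = \mathbb{E}[\ln p(x,y) - \ln p(x) - \ln p(y)]$, averaging the log density ratio over samples from the joint should recover $-\tfrac{1}{2}\ln(1-\rho^2)$. A sketch using SciPy ($\rho = 0.8$ is an arbitrary choice):

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
rng = np.random.default_rng(0)
samples = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

# Average of log p(x, y) - log p(x) - log p(y) over joint samples
joint = multivariate_normal(mean=[0.0, 0.0], cov=cov)
log_ratio = (joint.logpdf(samples)
             - norm.logpdf(samples[:, 0])
             - norm.logpdf(samples[:, 1]))
mi_mc = log_ratio.mean()
mi_closed = -0.5 * np.log(1 - rho**2)
print(mi_mc, mi_closed)  # both close to 0.51 nats
```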
The lower bound follows from the following argument. Consider two other random variables, $\tilde{X}$ and $\tilde{Y}$, that have the same covariance matrix as $X$ and $Y$ but are jointly Normally distributed. Note that since we have assumed that the marginal distributions of $X$ and $Y$ are Normal, we have $h(\tilde{X}) = h(X)$ and $h(\tilde{Y}) = h(Y)$, and by the maximum entropy property of the Normal distribution, we have $h(\tilde{X}, \tilde{Y}) \ge h(X, Y)$. The result is then straightforward:

$$I(X; Y) = h(X) + h(Y) - h(X, Y) \ge h(\tilde{X}) + h(\tilde{Y}) - h(\tilde{X}, \tilde{Y}) = -\frac{1}{2}\ln\left(1 - \rho^2\right).$$
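The bound can be strict. As a hypothetical construction, take an equal-weight mixture of two bivariate Gaussians with correlations $+0.9$ and $-0.9$: each marginal is standard Normal and the overall correlation is zero, so the lower bound is $-\tfrac{1}{2}\ln(1-0) = 0$, yet the variables are clearly dependent and the mutual information is positive. A Monte Carlo sketch:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(1)
n = 200_000
rho = 0.9
c1 = np.array([[1.0, rho], [rho, 1.0]])
c2 = np.array([[1.0, -rho], [-rho, 1.0]])

# Sample from the equal-weight mixture: pick a component, then draw
which = rng.random(n) < 0.5
s1 = rng.multivariate_normal([0.0, 0.0], c1, size=n)
s2 = rng.multivariate_normal([0.0, 0.0], c2, size=n)
samples = np.where(which[:, None], s1, s2)

# Monte Carlo estimate of I(X; Y) = E[log p(x,y) - log p(x) - log p(y)];
# the marginals of the mixture are standard Normal by symmetry
mvn1 = multivariate_normal([0.0, 0.0], c1)
mvn2 = multivariate_normal([0.0, 0.0], c2)
log_joint = np.log(0.5 * mvn1.pdf(samples) + 0.5 * mvn2.pdf(samples))
mi = (log_joint
      - norm.logpdf(samples[:, 0])
      - norm.logpdf(samples[:, 1])).mean()
corr = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]
print(corr, mi)  # correlation near 0, mutual information clearly positive
```

So the mutual information exceeds the correlation-based lower bound of zero, which is consistent with the argument above: the inequality $h(\tilde{X}, \tilde{Y}) \ge h(X, Y)$ is strict here because the joint distribution is not Gaussian.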
Thanks are due to Vikash Mansinghka for suggesting the first part of this exercise.