The Central Limit Theorem

Robert Nishihara Probability, Statistics

The proof and intuition presented here come from this excellent writeup by Yuval Filmus, which in turn draws upon ideas in this book by Fumio Hiai and Denes Petz. Suppose that we have a sequence of real-valued random variables
\begin{equation} X_1, X_2, \ldots . \end{equation}
Define the random variable
\begin{equation} A_N = \frac{X_1 + \cdots + X_N}{\sqrt{N}} \end{equation}
to be a scaled sum of the first $N$ variables in the sequence. Now, we would like to make interesting statements about the sequence
\begin{equation} A_1, A_2, \ldots . \end{equation}
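
To get a feel for the scaled sum $A_N$, here is a minimal simulation sketch. The excerpt does not say how the $X_i$ are distributed, so the code assumes they are independent random signs, $\pm 1$ with equal probability (mean zero, variance one). Under that assumption the central limit theorem says $A_N$ should look approximately standard normal for large $N$. The function names and the choice of distribution are illustrative, not taken from the post.

```python
import random
import statistics


def scaled_sum(n, rng):
    """Draw one sample of A_N = (X_1 + ... + X_N) / sqrt(N).

    Assumption: the X_i are i.i.d. uniform on {-1, +1}.
    """
    total = sum(rng.choice((-1, 1)) for _ in range(n))
    return total / n ** 0.5


def sample_scaled_sums(n, num_samples, seed=0):
    """Draw num_samples independent copies of A_N with a fixed seed."""
    rng = random.Random(seed)
    return [scaled_sum(n, rng) for _ in range(num_samples)]


samples = sample_scaled_sums(n=200, num_samples=2000)
clt_mean = statistics.fmean(samples)      # should be close to 0
clt_var = statistics.pvariance(samples)   # should be close to 1
print(f"mean of A_N: {clt_mean:.3f}, variance of A_N: {clt_var:.3f}")
```

Dividing by $\sqrt{N}$ rather than $N$ is what keeps the variance of $A_N$ fixed at one in this setting, so the samples neither collapse to a point nor spread out as $N$ grows.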

A Geometric Intuition for Markov’s Inequality

Peter Krafft Probability

As the title of the post suggests, this week I will discuss a geometric intuition for Markov’s inequality, which states that for a nonnegative random variable $X$ and any $a > 0$, $$ P(X \geq a) \leq E[X]/a. $$ This is a simple result in basic probability that still felt surprising every time I used it… until very recently. (Warning: Basic measure theoretic probability lies ahead. These notes look like they provide sufficient background if this post is confusing and you are sufficiently motivated!)
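
As a quick numerical sanity check of the inequality, the sketch below compares the empirical tail probability with the empirical bound $E[X]/a$. The exponential distribution with rate 1 and the helper name `markov_check` are assumptions made for illustration only; any nonnegative samples would do.

```python
import random


def markov_check(samples, a):
    """Return (empirical P(X >= a), empirical E[X] / a).

    Assumes every sample is nonnegative and a > 0.
    """
    if a <= 0:
        raise ValueError("a must be positive")
    n = len(samples)
    tail = sum(1 for x in samples if x >= a) / n
    bound = sum(samples) / (n * a)
    return tail, bound


rng = random.Random(0)
# Assumption: X ~ Exponential(1), which is nonnegative with E[X] = 1.
samples = [rng.expovariate(1.0) for _ in range(10000)]
results = {a: markov_check(samples, a) for a in (0.5, 1.0, 2.0, 4.0)}

for a, (tail, bound) in results.items():
    print(f"a = {a}: P(X >= a) ~ {tail:.4f} <= E[X]/a ~ {bound:.4f}")
```

Because the empirical distribution of nonnegative samples is itself a distribution on nonnegative values, the inequality holds exactly for these sample quantities, not just approximately.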

Correlation and Mutual Information

Peter Krafft Statistics

Mutual information is a quantification of the dependency between random variables. It is sometimes contrasted with linear correlation since mutual information captures nonlinear dependence. In this short note I will discuss the relationship between these quantities in the case of a bivariate Gaussian distribution, and I will explore two implications of that relationship.
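
For reference, the standard closed form for a bivariate Gaussian with correlation coefficient $\rho$ is $I(X;Y) = -\tfrac{1}{2}\log(1-\rho^2)$ (in nats when the logarithm is natural). The excerpt does not show which form the note itself uses, so treat the sketch below as an illustration of that textbook formula; the function name is hypothetical.

```python
import math


def gaussian_mutual_information(rho):
    """Mutual information (in nats) of a bivariate Gaussian with correlation rho.

    Uses the standard closed form I(X; Y) = -0.5 * log(1 - rho^2),
    which is finite only for |rho| < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho * rho)


for rho in (0.0, 0.5, 0.9, 0.99):
    print(f"rho = {rho}: I(X; Y) = {gaussian_mutual_information(rho):.4f} nats")
```

The formula depends on $\rho$ only through $\rho^2$, so positive and negative correlations of the same magnitude carry the same mutual information, and the value grows without bound as $|\rho|$ approaches one.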

Asymptotic Equipartition of Markov Chains

Peter Krafft Statistics

The Asymptotic Equipartition Property/Principle (AEP) is a well-known result that is likely covered in any introductory information theory class. Nevertheless, when I first learned about it in such a course, I did not appreciate the implications of its general form. In this post I will review this beautiful, classic result and offer the mental picture I have of its implications. I will frame my discussion in terms of Markov chains with discrete state spaces, but note that the AEP holds even more generally. My treatment will be relatively informal, and I will assume basic familiarity with Markov chains. See the references for more details.
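
One way to see the AEP in action is to simulate a small chain and compare $-\frac{1}{n}\log p(X_1, \ldots, X_n)$ along a sample path with the chain's entropy rate. The two-state transition matrix below, its stationary distribution, and the function names are all assumptions chosen for illustration; they do not come from the post.

```python
import math
import random

# Hypothetical two-state chain; each row is a conditional distribution.
P = [[0.9, 0.1],
     [0.4, 0.6]]
# Stationary distribution of P (one can check that PI times P equals PI).
PI = [0.8, 0.2]


def entropy_rate(P, pi):
    """Entropy rate in nats: -sum_i pi_i sum_j P_ij log P_ij."""
    k = len(P)
    return -sum(pi[i] * P[i][j] * math.log(P[i][j])
                for i in range(k) for j in range(k) if P[i][j] > 0)


def sample_path_rate(P, pi, n, seed=0):
    """Simulate n steps from stationarity and return -(1/n) log p(path)."""
    rng = random.Random(seed)
    k = len(P)
    state = rng.choices(range(k), weights=pi)[0]
    log_prob = math.log(pi[state])
    for _ in range(n - 1):
        nxt = rng.choices(range(k), weights=P[state])[0]
        log_prob += math.log(P[state][nxt])
        state = nxt
    return -log_prob / n


rate = entropy_rate(P, PI)
empirical = sample_path_rate(P, PI, n=20000)
print(f"entropy rate: {rate:.4f} nats, sample-path estimate: {empirical:.4f} nats")
```

For a long path the two numbers should be close, which is the sense in which almost every long sequence the chain produces has roughly the same probability, about $e^{-n H}$ where $H$ is the entropy rate.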