Chernoff’s bound

James Zou Probability

[latexpage] If you have some randomness in your life, chances are that you want to try Chernoff’s bound. The most common way to understand randomness is a 2-step combo: find the average behavior, and show that the reality is unlikely to differ too much from the expectation (via Chernoff’s bound or its cousins). My favorite form of Chernoff’s bound is: for $X_1, …, X_n$ independent binary random variables, and $X=\sum_{i=1}^n X_i$, $\mu = E[X]$ and $\delta > 2e-1$, then \[ P(X > (1+\delta)\mu) < 2^{-\delta \mu}. \] Note that $X_i’s$ are not necessarily identically distributed, they just have to be independent. In practice, we often care about significant deviation from the mean, so $\delta$ is typically larger than $2e-1$. In …

Method of moments

James Zou Machine Learning Leave a Comment

[latexpage] The method of moments is a simple idea for estimating the parameters of a model. Suppose the observed data are sampled iid from a distribution $p(x|\theta)$, our goal is to estimate the parameter $\theta$. If we have enough data, we can get very close to the true mean of the distribution, $E[x]$, which is a ¬†function of $\theta$: $E[x]=f_1(\theta)$. We know the form of $f_1$ since we know the form of $p(x|\theta)$. For simple distributions, knowing just the mean is enough to invert $f_1$ to obtain $\theta$. In general, we need to calculate the higher moments, which are also known functions of $\theta$: $E[x^2] = f_2(\theta), …,¬†E[x^n]=f_n(\theta)$. We then try to invert the systems of equations $f_1, …, f_n$ …