Chernoff’s bound

[latexpage]

If you have some randomness in your life, chances are that you want to try Chernoff’s bound. The most common way to understand randomness is a 2-step combo: find the average behavior, and show that the reality is unlikely to differ too much from the expectation (via Chernoff’s bound or its cousins).

My favorite form of Chernoff’s bound is: for $X_1, …, X_n$ independent binary random variables, and $X=\sum_{i=1}^n X_i$, $\mu = E[X]$ and $\delta > 2e-1$, then

\[

P(X > (1+\delta)\mu) < 2^{-\delta \mu}.

\]

Note that $X_i’s$ are not necessarily identically distributed, they just have to be independent. In practice, we often care about significant deviation from the mean, so $\delta$ is typically larger than $2e-1$.

In the standard applications, the stochastic system has size $N$ and an event of interest, $X$, has expectation $\mu=O(\log N)$. The probability that any one event deviates significantly is inverse polynomial in the size of the whole system

\[

P(X>(1+\delta)\mu) < \frac{1}{N^{\delta}}.

\]

This is important since the total number of events is polynomial in $N$ for many settings. By simple Union Bound,

\[

P(\mbox{any } X>(1+\delta)\mu) = O(\frac{1}{N}).

\]

So the chance of any deviation in any event is small!

 

 

Method of moments

[latexpage]

The method of moments is a simple idea for estimating the parameters of a model. Suppose the observed data are sampled iid from a distribution $p(x|\theta)$, our goal is to estimate the parameter $\theta$. If we have enough data, we can get very close to the true mean of the distribution, $E[x]$, which is a  function of $\theta$: $E[x]=f_1(\theta)$. We know the form of $f_1$ since we know the form of $p(x|\theta)$.

For simple distributions, knowing just the mean is enough to invert $f_1$ to obtain $\theta$. In general, we need to calculate the higher moments, which are also known functions of $\theta$: $E[x^2] = f_2(\theta), …,¬†E[x^n]=f_n(\theta)$. We then try to invert the systems of equations $f_1, …, f_n$ to obtain $\theta$. In practice, we typically only have enough data to accurately estimate the low order ($\leq 3$) moments. Continue reading “Method of moments”