Suppose we are modeling a spatial process (for instance, the amount of rainfall around the world, the distribution of natural resources, or the population density of an endangered species). We've measured the latent function $Z(s)$ at some locations $s_1, \ldots, s_n$, and we'd like to predict the function's value at some new location $s_0$. Kriging is a technique for extrapolating our measurements to arbitrary locations. For an in-depth discussion, see Cressie and Wikle (2011). Here I derive Kriging in a simplified case.
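I will assume that $Z$ is an intrinsically stationary process. In other words, there exists some semivariogram $\gamma(h)$ such that

$$\operatorname{Var}\left[Z(s + h) - Z(s)\right] = 2\gamma(h) \quad \text{for all } s \text{ and } h.$$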
Furthermore, I will assume that the process is isotropic (i.e., that $\gamma(h)$ is a function only of $\|h\|$). As Andy described here, the existence of a stationary covariance function implies intrinsic stationarity. In addition, I will assume that the process has a constant mean, $E[Z(s)] = \mu$. We would like to estimate $Z(s_0)$ with a linear combination of our current observations. Our estimator will be

$$\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i),$$
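where the weights $\lambda_i$ can be positive or negative. We further require that $\sum_{i=1}^{n} \lambda_i = 1$ so that our estimate is unbiased: the expectation of the estimator is the constant mean times the sum of the weights. We would like to choose the weights so as to minimize the mean-squared predictive error

$$E\left[\left(Z(s_0) - \hat{Z}(s_0)\right)^2\right] = E\left[\left(Z(s_0) - \sum_{i=1}^{n} \lambda_i Z(s_i)\right)^2\right].$$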
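Let $Z_i$ denote $Z(s_i)$. Expanding the expression for the mean-squared predictive error, we get

$$E[Z_0^2] - 2\sum_{i=1}^{n} \lambda_i E[Z_0 Z_i] + \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j E[Z_i Z_j].$$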
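Adding and subtracting $\sum_{i=1}^{n} \lambda_i E[Z_i^2]$, this expression breaks into $A + B$, where

$$A = E[Z_0^2] - 2\sum_{i=1}^{n} \lambda_i E[Z_0 Z_i] + \sum_{i=1}^{n} \lambda_i E[Z_i^2] = \sum_{i=1}^{n} \lambda_i E\left[(Z_0 - Z_i)^2\right] = 2\sum_{i=1}^{n} \lambda_i \gamma(s_0 - s_i)$$

and

$$B = \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j E[Z_i Z_j] - \sum_{i=1}^{n} \lambda_i E[Z_i^2] = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j E\left[(Z_i - Z_j)^2\right] = -\sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j \gamma(s_i - s_j).$$

Both rewrites use the fact that the weights sum to one, and the final equalities hold because, with a constant mean, intrinsic stationarity gives $E[(Z_i - Z_j)^2] = 2\gamma(s_i - s_j)$.

We minimize the quantity $A + B$ subject to the constraint $\sum_{i=1}^{n} \lambda_i = 1$ using Lagrange multipliers. To simplify the notation, define the matrix $\Gamma$ by $\Gamma_{ij} = \gamma(s_i - s_j)$, the vector $\gamma_0 = (\gamma(s_0 - s_1), \ldots, \gamma(s_0 - s_n))^T$, the vector $\lambda = (\lambda_1, \ldots, \lambda_n)^T$, and the vector $\mathbf{1} = (1, \ldots, 1)^T$. Then the mean-squared predictive error is given by

$$2\lambda^T \gamma_0 - \lambda^T \Gamma \lambda.$$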
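As a sanity check on the semivariogram form of the mean-squared predictive error, the sketch below compares it with the same error written in terms of a covariance function. It assumes a hypothetical exponential model $C(h) = e^{-\|h\|}$, so that $\gamma(h) = C(0) - C(h)$; the site locations, the weights, and all function names are illustrative and not part of the derivation.

```python
import numpy as np

def covariance(h):
    # Assumed (hypothetical) isotropic covariance model: C(h) = exp(-|h|).
    return np.exp(-h)

def semivariogram(h):
    # Matching semivariogram: gamma(h) = C(0) - C(h).
    return 1.0 - np.exp(-h)

rng = np.random.default_rng(0)
sites = rng.uniform(0.0, 10.0, size=(5, 2))   # illustrative observation sites
s0 = np.array([5.0, 5.0])                     # illustrative prediction site

# Arbitrary weights, shifted so that they sum to one.
lam = rng.normal(size=5)
lam += (1.0 - lam.sum()) / lam.size

# Pairwise distances between sites, and distances from each site to s0.
D = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
d0 = np.linalg.norm(sites - s0, axis=-1)

# Semivariogram form: 2 lam^T gamma_0 - lam^T Gamma lam.
Gamma = semivariogram(D)
gamma0 = semivariogram(d0)
mspe_variogram = 2.0 * lam @ gamma0 - lam @ Gamma @ lam

# Covariance form: C(0) - 2 lam^T c_0 + lam^T C lam.
C = covariance(D)
c0 = covariance(d0)
mspe_covariance = covariance(0.0) - 2.0 * lam @ c0 + lam @ C @ lam

print(mspe_variogram, mspe_covariance)
```

The two numbers agree only because the weights sum to one; dropping the shift on `lam` breaks the equality, which is one way to see where the constraint enters.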
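Incorporating the Lagrange multiplier constraint, we have the quantity

$$L(\lambda, m) = 2\lambda^T \gamma_0 - \lambda^T \Gamma \lambda - 2m\left(\lambda^T \mathbf{1} - 1\right).$$

Differentiating with respect to $m$ gives back our constraint. Differentiating with respect to each $\lambda_i$, setting the derivatives to zero, and concatenating the resulting equations into matrix form gives

$$\Gamma \lambda + m\mathbf{1} = \gamma_0.$$

Incorporating the constraint gives

$$\begin{pmatrix} \Gamma & \mathbf{1} \\ \mathbf{1}^T & 0 \end{pmatrix} \begin{pmatrix} \lambda \\ m \end{pmatrix} = \begin{pmatrix} \gamma_0 \\ 1 \end{pmatrix}.$$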
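The stationarity conditions and the constraint form a single linear system, which can be solved directly. The following is a minimal sketch, assuming a hypothetical exponential semivariogram and four made-up site locations; `m` stands for the Lagrange multiplier, and none of the names come from a particular library.

```python
import numpy as np

def semivariogram(h):
    # Hypothetical isotropic model, chosen only for illustration.
    return 1.0 - np.exp(-h)

sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
s0 = np.array([1.0, 1.0])
n = len(sites)

# Gamma_ij = gamma(s_i - s_j) and (gamma_0)_i = gamma(s_0 - s_i).
Gamma = semivariogram(np.linalg.norm(sites[:, None] - sites[None, :], axis=-1))
gamma0 = semivariogram(np.linalg.norm(sites - s0, axis=-1))

# Bordered system: [[Gamma, 1], [1^T, 0]] [lam; m] = [gamma_0; 1].
K = np.zeros((n + 1, n + 1))
K[:n, :n] = Gamma
K[:n, n] = 1.0
K[n, :n] = 1.0
rhs = np.append(gamma0, 1.0)

solution = np.linalg.solve(K, rhs)
lam, m = solution[:n], solution[n]
print(lam, m)
```

Solving the bordered system in one call avoids forming an explicit inverse of the semivariogram matrix, which is usually the more numerically stable choice.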
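Solving for $\lambda$ and plugging this back into our formula for $\hat{Z}(s_0)$, we find that

$$\hat{Z}(s_0) = \left(\gamma_0 + \mathbf{1}\,\frac{1 - \mathbf{1}^T \Gamma^{-1} \gamma_0}{\mathbf{1}^T \Gamma^{-1} \mathbf{1}}\right)^T \Gamma^{-1} Z,$$

where $Z = (Z_1, \ldots, Z_n)^T$ is the vector of observations.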
This gives us our optimal Kriging predictor.
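To make the closed-form predictor concrete, here is a small sketch that computes the weights $\lambda = \Gamma^{-1}\left(\gamma_0 + \mathbf{1}\,\frac{1 - \mathbf{1}^T \Gamma^{-1} \gamma_0}{\mathbf{1}^T \Gamma^{-1} \mathbf{1}}\right)$ and the resulting prediction. The exponential semivariogram, the `krige` helper, and the sample data are all assumptions made for illustration, not part of the derivation.

```python
import numpy as np

def semivariogram(h):
    # Hypothetical isotropic model, chosen only for illustration.
    return 1.0 - np.exp(-h)

def krige(sites, values, s0):
    """Return the predictor at s0 and the weights (illustrative helper)."""
    ones = np.ones(len(sites))
    Gamma = semivariogram(np.linalg.norm(sites[:, None] - sites[None, :], axis=-1))
    gamma0 = semivariogram(np.linalg.norm(sites - s0, axis=-1))

    # Gamma^{-1} gamma_0 and Gamma^{-1} 1, via linear solves rather than an inverse.
    Gi_gamma0 = np.linalg.solve(Gamma, gamma0)
    Gi_ones = np.linalg.solve(Gamma, ones)

    # lam = Gamma^{-1} (gamma_0 + 1 * (1 - 1^T Gamma^{-1} gamma_0) / (1^T Gamma^{-1} 1)).
    lam = Gi_gamma0 + Gi_ones * (1.0 - ones @ Gi_gamma0) / (ones @ Gi_ones)
    return lam @ values, lam

sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
values = np.array([1.2, 0.7, 2.1, 1.5])   # made-up observations

prediction, weights = krige(sites, values, np.array([1.0, 1.0]))
at_site, _ = krige(sites, values, sites[2])   # predict at an observed location
print(prediction, weights, at_site)
```

Predicting at an observed location puts all of the weight on that observation, so the predictor reproduces the measured value there. This is a quick way to check an implementation.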