Suppose we are modeling a spatial process (for instance, the amount of rainfall around the world, the distribution of natural resources, or the population density of an endangered species). We’ve measured the latent function $Z$ at some locations $s_1, \ldots, s_n$, and we’d like to predict the function’s value at some new location $s_0$. Kriging is a technique for extrapolating our measurements to arbitrary locations. For an in-depth discussion, see Cressie and Wikle (2011). Here I derive Kriging in a simplified case.

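I will assume that $Z$ is an intrinsically stationary process. In other words, there exists some semivariogram $\gamma$ such that, for all locations $s$ and displacements $h$,

$$\operatorname{Var}\left[Z(s + h) - Z(s)\right] = 2\gamma(h).$$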

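To make the definition concrete, here is a minimal sketch that assumes a process with an exponential covariance $C(h) = \sigma^2 \exp(-\|h\| / \rho)$. That covariance model, the parameter names `sigma2` and `rho`, and the two locations are illustrative assumptions, not part of the derivation. For such a process the semivariogram is $\gamma(h) = C(0) - C(h)$, and the code checks that the variance of a difference equals $2\gamma(h)$.

```python
import numpy as np

def exp_cov(h, sigma2=1.0, rho=2.0):
    """Exponential covariance C(h) = sigma2 * exp(-h / rho) (an assumed model)."""
    return sigma2 * np.exp(-np.asarray(h, dtype=float) / rho)

def semivariogram(h, sigma2=1.0, rho=2.0):
    """Semivariogram implied by the covariance: gamma(h) = C(0) - C(h)."""
    return exp_cov(0.0, sigma2, rho) - exp_cov(h, sigma2, rho)

# Two hypothetical locations in the plane.
s_i = np.array([0.0, 0.0])
s_j = np.array([3.0, 4.0])
h = np.linalg.norm(s_i - s_j)  # this assumed model depends only on distance

# Var[Z(s_i) - Z(s_j)] = C(0) + C(0) - 2 C(h)
var_diff = 2.0 * exp_cov(0.0) - 2.0 * exp_cov(h)
two_gamma = 2.0 * semivariogram(h)
```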
Furthermore, I will assume that the process is isotropic (i.e. that $\gamma(h)$ is a function only of $\|h\|$). As Andy described here, the existence of a covariance function implies intrinsic stationarity. In addition, I will assume that the process has a constant mean, $\mu$. We would like to estimate $Z(s_0)$ with a linear combination of our current observations. Our estimator will be

$$\hat{Z}(s_0) = \sum_{i=1}^n \lambda_i Z(s_i),$$

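where the weights $\lambda_i$ can be positive or negative. We further require that $\sum_{i=1}^n \lambda_i = 1$ so that our estimate is unbiased. We would like to choose the weights so as to minimize the mean-squared predictive error

$$E\left[\left(Z(s_0) - \sum_{i=1}^n \lambda_i Z(s_i)\right)^2\right].$$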

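Let $Z_i$ denote $Z(s_i)$ for $i = 0, 1, \ldots, n$. Expanding the expression for the mean-squared predictive error, we get

$$E[Z_0^2] - 2\sum_{i=1}^n \lambda_i E[Z_0 Z_i] + \sum_{i=1}^n \sum_{j=1}^n \lambda_i \lambda_j E[Z_i Z_j].$$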

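Adding and subtracting $\sum_{i=1}^n \lambda_i E[Z_i^2]$, and using $\sum_{i=1}^n \lambda_i = 1$ together with the constant mean (so that $E[(Z_i - Z_j)^2] = 2\gamma(s_i - s_j)$), this expression breaks into $A + B$, where

$$A = E[Z_0^2] - 2\sum_{i=1}^n \lambda_i E[Z_0 Z_i] + \sum_{i=1}^n \lambda_i E[Z_i^2] = \sum_{i=1}^n \lambda_i E\left[(Z_0 - Z_i)^2\right] = 2\sum_{i=1}^n \lambda_i \gamma(s_0 - s_i)$$

and

$$B = \sum_{i=1}^n \sum_{j=1}^n \lambda_i \lambda_j E[Z_i Z_j] - \sum_{i=1}^n \lambda_i E[Z_i^2] = -\frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n \lambda_i \lambda_j E\left[(Z_i - Z_j)^2\right] = -\sum_{i=1}^n \sum_{j=1}^n \lambda_i \lambda_j \gamma(s_i - s_j).$$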

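We minimize the mean-squared predictive error subject to the constraint $\sum_{i=1}^n \lambda_i = 1$ using Lagrange multipliers. To simplify the notation, define the matrix $\Gamma$ by $\Gamma_{ij} = \gamma(s_i - s_j)$, the vector $\gamma_0 = (\gamma(s_0 - s_1), \ldots, \gamma(s_0 - s_n))^T$, the vector $\lambda = (\lambda_1, \ldots, \lambda_n)^T$, and the vector $\mathbf{1} = (1, \ldots, 1)^T$. Then the mean-squared predictive error is given by

$$2\lambda^T \gamma_0 - \lambda^T \Gamma \lambda.$$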

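As a sanity check on the matrix form $2\lambda^T \gamma_0 - \lambda^T \Gamma \lambda$, the sketch below compares it with the covariance form of the same error, $C(0) - 2\lambda^T c_0 + \lambda^T C \lambda$. It assumes an exponential covariance (so that $\gamma(h) = C(0) - C(h)$); the locations, the weights, and the parameters `sigma2` and `rho` are made up for illustration. The two forms should agree for any weights that sum to one.

```python
import numpy as np

def exp_cov(d, sigma2=1.5, rho=2.0):
    """Assumed exponential covariance C(d) = sigma2 * exp(-d / rho)."""
    return sigma2 * np.exp(-np.asarray(d, dtype=float) / rho)

# Hypothetical observation sites s_1..s_n and target location s_0.
sites = np.array([[0.0, 0.0], [1.0, 3.0], [4.0, 1.0], [6.0, 5.0], [8.0, 2.0]])
s0 = np.array([4.0, 6.0])

# Pairwise distances between sites, and distances from each site to s_0.
D = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
d0 = np.linalg.norm(sites - s0, axis=1)

C, c0, C00 = exp_cov(D), exp_cov(d0), exp_cov(0.0)
Gamma = C00 - C    # Gamma_ij = gamma(s_i - s_j)
gamma0 = C00 - c0  # entries gamma(s_0 - s_i)

# Arbitrary weights that sum to one (a negative weight is allowed).
lam = np.array([0.5, 0.4, -0.2, 0.2, 0.1])

mspe_cov = C00 - 2.0 * lam @ c0 + lam @ C @ lam       # covariance form
mspe_var = 2.0 * lam @ gamma0 - lam @ Gamma @ lam     # semivariogram form
```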
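Incorporating the constraint with a Lagrange multiplier $m$, we have the quantity

$$2\lambda^T \gamma_0 - \lambda^T \Gamma \lambda - 2m\left(\lambda^T \mathbf{1} - 1\right).$$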

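Differentiating with respect to $m$ gives back our constraint. Differentiating with respect to each $\lambda_i$, setting the derivatives to zero, and concatenating the resulting equations into matrix form gives

$$\Gamma \lambda = \gamma_0 - m\mathbf{1}, \quad \text{i.e.} \quad \lambda = \Gamma^{-1}\left(\gamma_0 - m\mathbf{1}\right).$$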

Incorporating the constraint $\mathbf{1}^T \lambda = 1$ gives

$$\mathbf{1}^T \Gamma^{-1}\left(\gamma_0 - m\mathbf{1}\right) = 1.$$

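Solving for $m$ and plugging this back into our formula for $\lambda$, we find that

$$m = \frac{\mathbf{1}^T \Gamma^{-1} \gamma_0 - 1}{\mathbf{1}^T \Gamma^{-1} \mathbf{1}}, \qquad \lambda = \Gamma^{-1}\left(\gamma_0 - \mathbf{1}\,\frac{\mathbf{1}^T \Gamma^{-1} \gamma_0 - 1}{\mathbf{1}^T \Gamma^{-1} \mathbf{1}}\right).$$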

This gives us our optimal Kriging predictor.
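
The closed-form weights can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the exponential semivariogram, its parameters `sill` and `rho`, the function name `kriging_weights`, and the sample data are all assumptions made for the example. It also assumes that $\Gamma$ is invertible, which requires distinct observation locations.

```python
import numpy as np

def semivariogram(d, sill=1.0, rho=2.0):
    """Assumed exponential semivariogram gamma(d) = sill * (1 - exp(-d / rho))."""
    return sill * (1.0 - np.exp(-np.asarray(d, dtype=float) / rho))

def kriging_weights(sites, s0, gamma=semivariogram):
    """Weights lambda = Gamma^{-1} (gamma_0 - m * 1), with m chosen so they sum to one."""
    # Gamma_ij = gamma(|s_i - s_j|) and gamma0_i = gamma(|s_0 - s_i|).
    D = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    Gamma = gamma(D)
    gamma0 = gamma(np.linalg.norm(sites - s0, axis=1))
    ones = np.ones(len(sites))
    # Solve linear systems instead of forming Gamma^{-1} explicitly.
    Ginv_g0 = np.linalg.solve(Gamma, gamma0)
    Ginv_1 = np.linalg.solve(Gamma, ones)
    # Lagrange multiplier m = (1' Gamma^{-1} gamma0 - 1) / (1' Gamma^{-1} 1).
    m = (ones @ Ginv_g0 - 1.0) / (ones @ Ginv_1)
    return Ginv_g0 - m * Ginv_1

# Hypothetical observation sites, made-up measurements, and a target location.
sites = np.array([[0.0, 0.0], [1.0, 3.0], [4.0, 1.0], [6.0, 5.0], [8.0, 2.0]])
z = np.array([1.2, 0.7, 2.1, 1.5, 0.9])
s0 = np.array([3.0, 2.0])

lam = kriging_weights(sites, s0)
z_hat = lam @ z  # Kriging prediction at s0
```

A useful property to check is that predicting at an observed location returns that observation: when `s0` equals one of the rows of `sites`, the weight vector puts all of its mass on that site.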