\( \newcommand{\E}{{\rm E}} % E expectation operator \newcommand{\Var}{{\rm Var}} % Var variance operator \newcommand{\Cov}{{\rm Cov}} % Cov covariance operator \newcommand{\Cor}{{\rm Corr}} \)

Unknown, varying mean

For this, we need to know how the mean varies. Suppose we model the mean as a linear regression in \(p\) known predictors: \[Z(s_i) = \sum_{j=0}^p \beta_j X_j(s_i) + e(s_i)\] or, in matrix form, \[Z(s) = \sum_{j=0}^p \beta_j X_j(s) + e(s) = X(s)\beta + e(s)\] with \(X(s)\) the matrix of predictors, with row \(i\) and column \(j\) containing \(X_j(s_i)\), and with \(\beta = (\beta_0,…,\beta_p)\). Usually, the first column of \(X\) contains ones, in which case \(\beta_0\) is an intercept.

Predictor: \[\hat{Z}(s_0) = x(s_0)\hat{\beta} + v'V^{-1} (Z-X\hat{\beta}) \] with \(x(s_0) = (X_0(s_0),…,X_p(s_0))\) and \(\hat{\beta} = (X'V^{-1}X)^{-1} X'V^{-1}Z\). It has prediction error variance \[\sigma^2(s_0) = \sigma^2_0 - v'V^{-1}v + Q\] with \(Q = (x(s_0) - X'V^{-1}v)'(X'V^{-1}X)^{-1}(x(s_0) - X'V^{-1}v)\)

This form is called external drift kriging, universal kriging or sometimes regression kriging.

Example in the meuse data set: log(zinc) depending on sqrt(dist)

library(sp)
cov = function(h) exp(-h) # exponential covariance function, unit sill, distance parameter 1

# GLS estimate of beta: (X' V^{-1} X)^{-1} X' V^{-1} z
calc_beta = function(X, V, z) {
    XtVinv = t(solve(V, X))
    solve(XtVinv %*% X, XtVinv %*% z)
}

# universal kriging predictor: x(s0) beta + v' V^{-1} (z - X beta)
uk = function(data, newdata, X, x0, cov, beta) {
    library(sp) # spDists
    V = cov(spDists(data))          # covariances between observations
    v = cov(spDists(data, newdata)) # covariances between observations and prediction points
    z = data[[1]]
    if (missing(beta)) # when beta is not supplied, estimate it by GLS:
        beta = calc_beta(X, V, z)
    mu = X %*% beta
    x0 %*% beta + t(v) %*% solve(V, z - mu)
}
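The uk() function returns only the predictor. The prediction error variance \(\sigma^2(s_0)\), including the term \(Q\), can be sketched along the same lines; uk_var below is a hypothetical companion function, not part of the original code:

```r
library(sp) # spDists

# prediction error variance sigma^2(s_0) = sigma^2_0 - v' V^{-1} v + Q,
# with Q = (x(s_0) - X'V^{-1}v)' (X'V^{-1}X)^{-1} (x(s_0) - X'V^{-1}v)
uk_var = function(data, newdata, X, x0, cov) {
    V = cov(spDists(data))            # n x n: covariances between observations
    v = cov(spDists(data, newdata))   # n x m: observations vs prediction points
    VinvV = solve(V, v)
    B = t(x0) - t(X) %*% VinvV        # column j holds x(s_0) - X'V^{-1}v for point j
    XtVinvX = t(X) %*% solve(V, X)
    # evaluated per prediction location:
    cov(0) - colSums(v * VinvV) + colSums(B * solve(XtVinvX, B))
}
```

For the single-observation example below (one observation at distance 1, exponential covariance with unit sill), this gives \(2 - 2e^{-1} \approx 1.264\).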

# prediction location at (0,1):
newdata = SpatialPoints(cbind(0,1))

# observation location at (1,1), with attribute value (z) 3:
pts = SpatialPoints(cbind(1,1))
data = SpatialPointsDataFrame(pts, data.frame(z = 3))
x0 = matrix(1, 1, 1)
X = x0
uk(data, newdata, X, x0, cov)
##      [,1]
## [1,]    3
# three observation locations, with attribute values (z) 3, 2, 5:
pts = SpatialPoints(cbind(c(1,2,3),c(1,1,1)))
data = SpatialPointsDataFrame(pts, data.frame(z = c(3,2,5)))
newdata = SpatialPoints(cbind(.1 * 0:20, 1))
X = matrix(1,3,1)
x0 = matrix(1, length(newdata), 1)
uk(data, newdata, X, x0, cov) # mu = unknown
##        [,1]
##  [1,] 3.329
##  [2,] 3.308
##  [3,] 3.286
##  [4,] 3.262
##  [5,] 3.234
##  [6,] 3.204
##  [7,] 3.171
##  [8,] 3.135
##  [9,] 3.094
## [10,] 3.049
## [11,] 3.000
## [12,] 2.936
## [13,] 2.867
## [14,] 2.790
## [15,] 2.707
## [16,] 2.615
## [17,] 2.515
## [18,] 2.404
## [19,] 2.282
## [20,] 2.148
## [21,] 2.000

Plotting them:

newdata = SpatialPoints(cbind(seq(-4,6,by=.1), 1))
x0 = matrix(1, length(newdata), 1)
z_hat = uk(data, newdata, X, x0, cov) # mu unknown
plot(coordinates(newdata)[,1], z_hat, type='l', ylim = c(1,5))
points(as(data, "data.frame")[,c(2,1)], col='red', pch=16)
beta = calc_beta(X, cov(spDists(data)), data[[1]])
beta
##      [,1]
## [1,] 3.52
abline(beta, 0, col = 'blue', lty = 2)

plot of chunk unnamed-chunk-2

# linear trend
X = cbind(matrix(1,3,1), 1:3)
X
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    2
## [3,]    1    3
xcoord = seq(-4,6,by=.1)
newdata = SpatialPoints(cbind(xcoord, 1))
x0 = cbind(matrix(1, length(newdata), 1), xcoord)
z_hat = uk(data, newdata, X, x0, cov) # mu unknown
plot(coordinates(newdata)[,1], z_hat, type='l')
points(as(data, "data.frame")[,c(2,1)], col='red', pch=16)
beta = calc_beta(X, cov(spDists(data)), data[[1]])
beta
##      [,1]
## [1,] 1.52
## [2,] 1.00
abline(beta, col = 'blue', lty = 2)

plot of chunk unnamed-chunk-3

With Gaussian covariance:

cov = function(h) exp(-(h^2))
z_hat = uk(data, newdata, X, x0, cov) # mu unknown
plot(coordinates(newdata)[,1], z_hat, type='l')
points(as(data, "data.frame")[,c(2,1)], col='red', pch=16)
beta = calc_beta(X, cov(spDists(data)), data[[1]])
beta
##       [,1]
## [1,] 1.635
## [2,] 1.000
abline(beta, col = 'blue', lty = 2)

plot of chunk unnamed-chunk-4

Estimating spatial correlation under the UK model

As opposed to the ordinary kriging model, the universal kriging model needs knowledge of the mean vector in order to estimate the semivariance (or covariance) from the residual vector: \[\hat{e}(s) = Z(s) - X\hat\beta\] but how do we get \(\hat\beta\) without knowing \(V\)? This is a chicken-and-egg problem. The simplest, though not best, solution is to plug in \(\hat{\beta}_{OLS}\), and estimate \(V\) (i.e., the variogram of \(Z\)) from the OLS residuals \(e_{OLS}(s)\).
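In gstat, this plug-in approach is what variogram() does when the formula contains predictors: it computes the sample variogram of the OLS residuals. A sketch on the meuse data (the exponential model starting values are illustrative guesses):

```r
library(sp)
library(gstat)
data(meuse)
coordinates(meuse) = ~x+y
# sample variogram of the OLS residuals of log(zinc) ~ sqrt(dist):
v_res = variogram(log(zinc) ~ sqrt(dist), meuse)
v_res.fit = fit.variogram(v_res, vgm(1, "Exp", 300, 1))
plot(v_res, v_res.fit)
```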

Spatial Prediction

… involves errors, uncertainties

Kriging varieties

UK and linear regression

If \(Z\) has no spatial correlation, all covariances are zero and \(v=0\) and \(V=\mbox{diag}(\sigma^2)\). This implies that \[\hat{Z}(s_0) = x(s_0)\hat{\beta} + v'V^{-1} (Z-X\hat{\beta}) \] with \(\hat{\beta} = (X'V^{-1}X)^{-1} X'V^{-1}Z\) reduces to

\[\hat{Z}(s_0) = x(s_0)\hat{\beta}\] with \(\hat{\beta} = (X'X)^{-1} X'Z\), i.e., ordinary least squares regression.
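This reduction can be checked numerically: with a pure-nugget covariance (no spatial correlation), \(v = 0\) and \(V = I\), and the kriging predictor collapses to the OLS fit. A small sketch, reusing the three observations 3, 2, 5 from the example above:

```r
z = c(3, 2, 5)         # observations
X = matrix(1, 3, 1)    # intercept-only design matrix
x0 = matrix(1, 1, 1)
V = diag(3)            # pure nugget: V = sigma^2 I (with sigma^2 = 1)
v = matrix(0, 3, 1)    # no correlation with the prediction location
beta = solve(t(X) %*% solve(V, X), t(X) %*% solve(V, z)) # equals OLS here
pred = x0 %*% beta + t(v) %*% solve(V, z - X %*% beta)
pred                   # equals mean(z) = 3.333...
```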

Note that

Global vs. local predictors

In many cases, instead of using all data, the number of observations used for prediction is limited to a selection of the nearest observations, based on

The reason for this is usually either

Statistical arguments for local prediction

Practical arguments for local prediction
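In gstat, local prediction is controlled by arguments such as nmax (maximum number of nearest observations used) and maxdist (search radius); a sketch with illustrative values:

```r
library(sp)
library(gstat)
data(meuse)
coordinates(meuse) = ~x+y
data(meuse.grid)
coordinates(meuse.grid) = ~x+y
gridded(meuse.grid) = TRUE
v.fit = fit.variogram(variogram(log(zinc) ~ 1, meuse), vgm(1, "Sph", 800, 1))
# use at most the 20 nearest observations, within 1000 m:
out = krige(log(zinc) ~ 1, meuse, meuse.grid, v.fit, nmax = 20, maxdist = 1000)
```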

Predicting block means

Instead of predicting \(Z(s_0)\) for a “point” location, one might be interested in predicting the average of \(Z(s)\) over a block, \(B_0\), i.e. \[Z(B_0) = \frac{1}{|B_0|}\int_{B_0} Z(u)du\]
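In gstat, block mean prediction only requires a block argument to krige(); a sketch predicting 40 m × 40 m block means (the block size is an illustrative choice):

```r
library(sp)
library(gstat)
data(meuse)
coordinates(meuse) = ~x+y
data(meuse.grid)
coordinates(meuse.grid) = ~x+y
gridded(meuse.grid) = TRUE
v.fit = fit.variogram(variogram(log(zinc) ~ 1, meuse), vgm(1, "Sph", 800, 1))
# block = c(40, 40): predict block means over 40 m x 40 m blocks;
# block prediction variances are smaller than point prediction variances
bl = krige(log(zinc) ~ 1, meuse, meuse.grid, v.fit, block = c(40, 40))
```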

Reason why one wants block means

Examples

cokriging

Cokriging is the multivariate equivalent of kriging, which is univariate in terms of the number of dependent variables. Kriging: \[Z(s) = X(s)\beta + e(s)\] Cokriging: \[Z_1(s) = X_1(s)\beta_1 + e_1(s)\] \[Z_2(s) = X_2(s)\beta_2 + e_2(s)\] \[\vdots\] \[Z_k(s) = X_k(s)\beta_k + e_k(s)\] with \(V = \Cov(e_1,e_2,…,e_k)\)

Cases where this is useful: multiple spatially correlated variables, such as

Cokriging prediction

Cokriging prediction is not substantially different from kriging prediction; it is mostly a matter of book-keeping.

How to set up Z(s), X, beta, e(s), x(s_0), v, V?

Multivariable prediction involves the joint prediction of multiple variables that are both spatially correlated and cross-correlated. Consider \(m\) distinct variables, and let \(\{Z_i(s), X_i, \beta^i, e_i(s), x_i(s_0), v_i, V_i\}\) correspond to \(\{Z(s), X, \beta, e(s), x(s_0), v, V\}\) of the \(i\)-th variable. Next, let \({\bf Z}(s) = (Z_1(s)',…,Z_m(s)')'\), \({\bf B}=({\beta^1}',…,{\beta^m}')'\), \({\bf e}(s)=(e_1(s)',…,e_m(s)')'\),

\[ {\bf X} = \left[ \begin{array}{cccc} X_1 & 0 & … & 0 \\ 0 & X_2 & … & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & … & X_m \\ \end{array} \right], \ {\bf x}(s_0) = \left[ \begin{array}{cccc} x_1(s_0) & 0 & … & 0 \\ 0 & x_2(s_0) & … & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & … & x_m(s_0) \\ \end{array} \right] \]

with \(0\) conforming zero matrices, and

\[{\bf v} = \left[ \begin{array}{cccc} v_{1,1} & v_{1,2} & … & v_{1,m} \\ v_{2,1} & v_{2,2} & … & v_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ v_{m,1} & v_{m,2} & … & v_{m,m} \\ \end{array} \right], \ \ {\bf V} = \left[ \begin{array}{cccc} V_{1,1} & V_{1,2} & … & V_{1,m} \\ V_{2,1} & V_{2,2} & … & V_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ V_{m,1} & V_{m,2} & … & V_{m,m} \\ \end{array} \right] \]

where element \(i\) of \(v_{k,l}\) is \(\Cov(Z_k(s_i), Z_l(s_0))\), and where element \((i,j)\) of \(V_{k,l}\) is \(\Cov(Z_k(s_i),Z_l(s_j))\).

The multivariable prediction equations equal the previous UK equations when all matrices are substituted by their multivariable forms (see also Ver Hoef and Cressie, Math. Geol., 1993), and when \(\sigma^2_0\) is substituted by the matrix \(\Sigma\), with \(\Cov(Z_i(s_0),Z_j(s_0))\) in its \((i,j)\)-th element. Note that the prediction variance is now a prediction error covariance matrix.

What is needed?

The main tool for estimating semivariances between different variables is the cross variogram, defined for collocated data as \[\gamma_{ij}(h) = \mbox{E}[(Z_i(s)-Z_i(s+h))(Z_j(s)-Z_j(s+h))]\] and for non-collocated data as \[\gamma_{ij}(h) = \mbox{E}[(Z_i(s)-m_i)(Z_j(s+h)-m_j)]\] with \(m_i\) and \(m_j\) the means of the respective variables. Sample cross variograms are the obvious averages over the available pairs or cross pairs, as in one of \[\hat{\gamma}_{jk}(\tilde{h})=\frac{1}{N_h}\sum_{i=1}^{N_h}(Z_j(s_i)-Z_j(s_i+h))(Z_k(s_i)-Z_k(s_i+h))\] \[\hat{\gamma}_{jk}(\tilde{h})=\frac{1}{N_h}\sum_{i=1}^{N_h}(Z_j(s_i)-m_j)(Z_k(s_i+h)-m_k)\]

Permissible cross covariance functions

Two classes of permissible cross covariance (semivariance) functions are often used: the intrinsic correlation model, and the linear model of coregionalization (fitted below with fit.lmc).

How to do this?

As multivariable analysis may involve numerous variables, we need to start organising the available information. For that reason, we collect all the observation data specifications in a gstat object, created by the function gstat. This function does nothing more than order (and, in fact, copy) the information needed later into a single object. Consider the following definitions of four heavy metals:

library(sp)
data(meuse)
coordinates(meuse) = ~x+y
library(gstat)
g <- gstat(NULL, "logCd", log(cadmium)~1, meuse)
g <- gstat(g, "logCu", log(copper)~1, meuse)
g <- gstat(g, "logPb", log(lead)~1, meuse)
g <- gstat(g, "logZn", log(zinc)~1, meuse)
g 
## data:
## logCd : formula = log(cadmium)~1 ; data dim = 155 x 12
## logCu : formula = log(copper)~1 ; data dim = 155 x 12
## logPb : formula = log(lead)~1 ; data dim = 155 x 12
## logZn : formula = log(zinc)~1 ; data dim = 155 x 12
vm <- variogram(g)
vm.fit <- fit.lmc(vm, g, vgm(1, "Sph", 800, 1))
plot(vm, vm.fit)

plot of chunk unnamed-chunk-5
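The object returned by fit.lmc is again a gstat object, so predict() on it performs cokriging directly. A reduced two-variable sketch, assuming meuse.grid as the target grid and reusing the same illustrative variogram starting values as above:

```r
library(sp)
library(gstat)
data(meuse)
coordinates(meuse) = ~x+y
g = gstat(NULL, "logCd", log(cadmium) ~ 1, meuse)
g = gstat(g, "logZn", log(zinc) ~ 1, meuse)
vm.fit = fit.lmc(variogram(g), g, vgm(1, "Sph", 800, 1))
data(meuse.grid)
coordinates(meuse.grid) = ~x+y
gridded(meuse.grid) = TRUE
ck = predict(vm.fit, meuse.grid) # multivariable (co)kriging
# ck holds per-variable predictions and variances, plus prediction
# error covariances between the variables:
names(ck)
```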

Kriging predictions and errors – how good are they?

Cross validation can be used to assess the quality of any interpolation, including kriging. We split the data set into \(n\) parts (folds). For each part, we leave out its observations, predict at their locations using the remaining observations, and compare the predictions with the left-out values.

Cross validation: what does it yield?

In case the interpolation method yields a prediction error \(\sigma(s_i)\), we can compute z-scores \(r(s_i)/\sigma(s_i)\), with \(r(s_i)\) the cross validation residual at \(s_i\).

The z-score allows the validation of the kriging error, as the z-score should have mean close to zero and variance close to 1. If the variance of the z-score is larger (smaller) than 1, the kriging standard error is underestimating (overestimating) the true interpolation error, on average.
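With gstat, n-fold (or leave-one-out) cross validation and the resulting z-scores are available through krige.cv(); a sketch, where the 5 folds and the spherical model starting values are illustrative choices:

```r
library(sp)
library(gstat)
data(meuse)
coordinates(meuse) = ~x+y
v.fit = fit.variogram(variogram(log(zinc) ~ 1, meuse), vgm(1, "Sph", 800, 1))
set.seed(1) # folds are assigned randomly
cv = krige.cv(log(zinc) ~ 1, meuse, v.fit, nfold = 5)
mean(cv$zscore) # should be close to 0
var(cv$zscore)  # should be close to 1
```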

Kriging errors – what do they mean?

Suppose legislation prescribes remediation in case zinc exceeds 500 ppm. Where does the zinc level exceed 500 ppm?

Conditional probability

How?

# assumes meuse.grid and a fitted variogram v.fit are available; if not, e.g.:
# data(meuse.grid); coordinates(meuse.grid) = ~x+y; gridded(meuse.grid) = TRUE
# v.fit = fit.variogram(variogram(log(zinc)~1, meuse), vgm(1, "Sph", 800, 1))
out = krige(log(zinc)~1, meuse, meuse.grid, v.fit)
## [using ordinary kriging]
# P(Z > 500), assuming a Gaussian predictive distribution on the log scale:
out$p500 = 1 - pnorm(log(500), out$var1.pred, sqrt(out$var1.var))
spplot(out["p500"], col.regions = bpy.colors())

plot of chunk unnamed-chunk-6