\( \newcommand{\E}{{\rm E}} % E expectation operator \newcommand{\Var}{{\rm Var}} % Var variance operator \newcommand{\Cov}{{\rm Cov}} % Cov covariance operator \newcommand{\Cor}{{\rm Corr}} \)
For this, we need to know how the mean varies. Suppose we model this as a linear regression model in \(p\) known predictors: \[Z(s_i) = \sum_{j=0}^p \beta_j X_j(s_i) + e(s_i)\] \[Z(s) = \sum_{j=0}^p \beta_j X_j(s) + e(s) = X(s)\beta + e(s)\] with \(X(s)\) the matrix with predictors, with row \(i\) and column \(j\) containing \(X_j(s_i)\), and with \(\beta = (\beta_0,\ldots,\beta_p)\). Usually, the first column of \(X\) contains ones, in which case \(\beta_0\) is an intercept.
Predictor: \[\hat{Z}(s_0) = x(s_0)\hat{\beta} + v'V^{-1} (Z-X\hat{\beta}) \] with \(x(s_0) = (X_0(s_0),\ldots,X_p(s_0))\) and \(\hat{\beta} = (X'V^{-1}X)^{-1} X'V^{-1}Z\). It has prediction error variance \[\sigma^2(s_0) = \sigma^2_0 - v'V^{-1}v + Q\] with \(Q = (x(s_0) - X'V^{-1}v)'(X'V^{-1}X)^{-1}(x(s_0) - X'V^{-1}v)\)
This form is called external drift kriging, universal kriging or sometimes regression kriging.
Example in the meuse data set: log(zinc) depending on sqrt(dist).
library(sp)
cov = function(h) exp(-h) # exponential covariance function, unit sill and range
calc_beta = function(X, V, z) {
  # GLS estimate: beta = (X'V^{-1}X)^{-1} X'V^{-1}z
  XtVinv = t(solve(V, X))
  solve(XtVinv %*% X, XtVinv %*% z)
}
uk = function(data, newdata, X, x0, cov, beta) {
  library(sp) # spDists
  V = cov(spDists(data))          # covariance matrix of the observations
  v = cov(spDists(data, newdata)) # covariances between observations and prediction locations
  z = data[[1]]
  if (missing(beta)) # estimate beta by GLS if not supplied
    beta = calc_beta(X, V, z)
  mu = X %*% beta
  x0 %*% beta + t(v) %*% solve(V, z - mu)
}
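The function above returns only the predictor; the prediction error variance \(\sigma^2(s_0) = \sigma^2_0 - v'V^{-1}v + Q\) can be computed along the same lines. The following uk_var is a minimal sketch (a helper added here for illustration, not part of the original example), assuming cov(0) returns the sill \(\sigma^2_0\):
uk_var = function(data, newdata, X, x0, cov) {
  V = cov(spDists(data))
  v = cov(spDists(data, newdata))
  sigma0 = cov(0) # sigma^2_0: variance at distance zero
  # columns of b hold x(s_0)' - X'V^{-1}v, one column per prediction location:
  b = t(x0) - t(X) %*% solve(V, v)
  Q = colSums(b * solve(t(X) %*% solve(V, X), b)) # diagonal of b'(X'V^{-1}X)^{-1}b
  sigma0 - colSums(v * solve(V, v)) + Q           # one variance per prediction location
}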
# prediction location at (0,1):
newdata = SpatialPoints(cbind(0,1))
# observation location at (1,1), with attribute value (z) 3:
pts = SpatialPoints(cbind(1,1))
data = SpatialPointsDataFrame(pts, data.frame(z = 3))
x0 = matrix(1, 1, 1)
X = x0
uk(data, newdata, X, x0, cov)
## [,1]
## [1,] 3
# note: with a single observation, the estimated beta equals that observation, so the prediction is 3 everywhere
# three observation locations, with attribute values (z) 3, 2, 5:
pts = SpatialPoints(cbind(c(1,2,3),c(1,1,1)))
data = SpatialPointsDataFrame(pts, data.frame(z = c(3,2,5)))
newdata = SpatialPoints(cbind(.1 * 0:20, 1))
X = matrix(1,3,1)
x0 = matrix(1, length(newdata), 1)
uk(data, newdata, X, x0, cov) # mu = unknown
## [,1]
## [1,] 3.329
## [2,] 3.308
## [3,] 3.286
## [4,] 3.262
## [5,] 3.234
## [6,] 3.204
## [7,] 3.171
## [8,] 3.135
## [9,] 3.094
## [10,] 3.049
## [11,] 3.000
## [12,] 2.936
## [13,] 2.867
## [14,] 2.790
## [15,] 2.707
## [16,] 2.615
## [17,] 2.515
## [18,] 2.404
## [19,] 2.282
## [20,] 2.148
## [21,] 2.000
Plotting them:
newdata = SpatialPoints(cbind(seq(-4,6,by=.1), 1))
x0 = matrix(1, length(newdata), 1)
z_hat = uk(data, newdata, X, x0, cov) # mu unknown
plot(coordinates(newdata)[,1], z_hat, type='l', ylim = c(1,5))
points(as(data, "data.frame")[,c(2,1)], col='red', pch=16)
beta = calc_beta(X, cov(spDists(data)), data[[1]])
beta
## [,1]
## [1,] 3.52
abline(beta, 0, col = 'blue', lty = 2)
# linear trend
X = cbind(matrix(1,3,1), 1:3)
X
## [,1] [,2]
## [1,] 1 1
## [2,] 1 2
## [3,] 1 3
xcoord = seq(-4,6,by=.1)
newdata = SpatialPoints(cbind(xcoord, 1))
x0 = cbind(matrix(1, length(newdata), 1), xcoord)
z_hat = uk(data, newdata, X, x0, cov) # mu unknown
plot(coordinates(newdata)[,1], z_hat, type='l')
points(as(data, "data.frame")[,c(2,1)], col='red', pch=16)
beta = calc_beta(X, cov(spDists(data)), data[[1]])
beta
## [,1]
## [1,] 1.52
## [2,] 1.00
abline(beta, col = 'blue', lty = 2)
With Gaussian covariance:
cov = function(h) exp(-(h^2))
z_hat = uk(data, newdata, X, x0, cov) # mu unknown
plot(coordinates(newdata)[,1], z_hat, type='l')
points(as(data, "data.frame")[,c(2,1)], col='red', pch=16)
beta = calc_beta(X, cov(spDists(data)), data[[1]])
beta
## [,1]
## [1,] 1.635
## [2,] 1.000
abline(beta, col = 'blue', lty = 2)
As opposed to the ordinary kriging model, the universal kriging model needs knowledge of the mean vector in order to estimate the semivariance (or covariance) from the residual vector: \[\hat{e}(s) = Z(s) - X\hat\beta\] But how do we get \(\hat\beta\) without knowing \(V\)? This is a chicken-and-egg problem. The simplest, but not best, solution is to plug in \(\hat{\beta}_{OLS}\), and estimate \(V\) (i.e., the variogram of the residual) from \(\hat{e}_{OLS}(s)\). This plug-in approach, of course, involves errors and uncertainties.
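As a sketch of this plug-in approach with gstat: when variogram() is given a formula with predictors, it computes the sample variogram of the OLS residuals (using sqrt(dist), as in the meuse example mentioned above):
library(sp)
library(gstat)
data(meuse)
coordinates(meuse) = ~x+y
v_res = variogram(log(zinc) ~ sqrt(dist), meuse) # variogram of OLS residuals
plot(v_res)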
OK: linear predictor \(\lambda'Z\) with \(\lambda\) chosen such that the predictor is unbiased and has minimum prediction error variance.
UK: \[\hat{Z}(s_0) = x(s_0)\hat{\beta} + v'V^{-1} (Z-X\hat{\beta}) \] with \(x(s_0) = (X_0(s_0),\ldots,X_p(s_0))\) and \(\hat{\beta} = (X'V^{-1}X)^{-1} X'V^{-1}Z\), \[\sigma^2(s_0) = \sigma^2_0 - v'V^{-1}v + Q\] with \(Q = (x(s_0) - X'V^{-1}v)'(X'V^{-1}X)^{-1}(x(s_0) - X'V^{-1}v)\)
OK: fill in a column vector with ones for \(X\): \(X=(1,1,\ldots,1)'\) and \(X_0=1\)
SK: take out the trend/mean, which is assumed known
If \(Z\) has no spatial correlation, all covariances are zero and \(v=0\) and \(V=\mbox{diag}(\sigma^2)\). This implies that \[\hat{Z}(s_0) = x(s_0)\hat{\beta} + v'V^{-1} (Z-X\hat{\beta}) \] with \(\hat{\beta} = (X'V^{-1}X)^{-1} X'V^{-1}Z\) reduces to
\[\hat{Z}(s_0) = x(s_0)\hat{\beta}\] with \(\hat{\beta} = (X'X)^{-1} X'Z\), i.e., ordinary least squares regression.
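A quick numerical check of this reduction, reusing data and X (the linear trend design matrix) from the example above; cov0 is a hypothetical pure nugget covariance, giving \(V=I\) and \(v=0\) away from the data points:
cov0 = function(h) ifelse(h == 0, 1, 0)     # no spatial correlation
newd = SpatialPoints(cbind(c(0.5, 1.7), 1)) # prediction locations off the data points
x0d = cbind(1, coordinates(newd)[,1])
uk(data, newd, X, x0d, cov0)
# identical to ordinary least squares regression:
x0d %*% solve(t(X) %*% X, t(X) %*% data[[1]])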
Note that in many cases, instead of using all data, the observations used for prediction are limited to a selection of nearest observations, based on their distance to the prediction location and/or a maximum number of neighbours. The reason for this is usually either statistical (the assumption of a constant or linear trend is more plausible locally) or computational (solving many small systems is faster than solving one very large system).
For fast neighbourhood selection, gstat uses quadtrees/octtrees, inspired by http://donar.umiacs.umd.edu/quadtree/index.html
Instead of predicting \(Z(s_0)\) for a "point" location, one might be interested in predicting the average of \(Z(s)\) over a block, \(B_0\), i.e. \[Z(B_0) = \frac{1}{|B_0|}\int_{B_0} Z(u)du\]
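In gstat, a local neighbourhood is selected with arguments such as nmax (maximum number of nearest observations) and maxdist (search radius), and block means are requested with the block argument. A minimal sketch, assuming the meuse data, the prediction grid meuse.grid and the fitted variogram v.fit used in the final sections below:
out_local = krige(log(zinc) ~ 1, meuse, meuse.grid, v.fit, nmax = 20, maxdist = 1000)
out_block = krige(log(zinc) ~ 1, meuse, meuse.grid, v.fit, block = c(40, 40)) # 40 m x 40 m block means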
Examples
Cokriging is the multivariate equivalent of kriging, which is, in terms of the number of dependent variables, univariate. Kriging: \[Z(s) = X(s)\beta + e(s)\] Cokriging: \[Z_1(s) = X_1(s)\beta_1 + e_1(s)\] \[Z_2(s) = X_2(s)\beta_2 + e_2(s)\] \[\vdots\] \[Z_k(s) = X_k(s)\beta_k + e_k(s)\] with \(V = \Cov(e_1,e_2,\ldots,e_k)\)
Cases where this is useful: multiple spatially correlated variables, such as the concentrations of several heavy metals (see the meuse example below).
Cokriging prediction is not substantially different from kriging prediction; it is just a lot of book-keeping.
How to set up \(Z(s)\), \(X\), \(\beta\), \(e(s)\), \(x(s_0)\), \(v\), and \(V\)?
Multivariable prediction involves the joint prediction of multiple, both spatially and cross-variable correlated variables. Consider \(m\) distinct variables, and let \(\{Z_i(s), X_i, \beta^i, e_i(s), x_i(s_0), v_i, V_i\}\) correspond to \(\{Z(s), X, \beta, e(s), x(s_0), v, V\}\) of the \(i\)-th variable. Next, let \({\bf Z}(s) = (Z_1(s)',\ldots,Z_m(s)')'\), \({\bf B}=({\beta^1}',\ldots,{\beta^m}')'\), \({\bf e}(s)=(e_1(s)',\ldots,e_m(s)')'\),
\[ {\bf X} = \left[ \begin{array}{cccc} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_m \\ \end{array} \right], \ {\bf x}(s_0) = \left[ \begin{array}{cccc} x_1(s_0) & 0 & \cdots & 0 \\ 0 & x_2(s_0) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_m(s_0) \\ \end{array} \right] \]
with \(0\) conforming zero matrices, and
\[{\bf v} = \left[ \begin{array}{cccc} v_{1,1} & v_{1,2} & \cdots & v_{1,m} \\ v_{2,1} & v_{2,2} & \cdots & v_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ v_{m,1} & v_{m,2} & \cdots & v_{m,m} \\ \end{array} \right], \ \ {\bf V} = \left[ \begin{array}{cccc} V_{1,1} & V_{1,2} & \cdots & V_{1,m} \\ V_{2,1} & V_{2,2} & \cdots & V_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ V_{m,1} & V_{m,2} & \cdots & V_{m,m} \\ \end{array} \right] \]
where element \(i\) of \(v_{k,l}\) is \(\Cov(Z_k(s_i), Z_l(s_0))\), and where element \((i,j)\) of \(V_{k,l}\) is \(\Cov(Z_k(s_i),Z_l(s_j))\).
The multivariable prediction equations equal the previous UK equations when all matrices are substituted by their multivariable forms (see also Ver Hoef and Cressie, Math. Geol., 1993), and when for \(\sigma^2_0\) the matrix \(\Sigma\) is substituted, with \(\Cov(Z_i(s_0),Z_j(s_0))\) in its \((i,j)\)-th element. Note that the prediction variance is now a prediction error covariance matrix.
The main tool for estimating semivariances between different variables is the cross variogram, defined for collocated data as \[\gamma_{ij}(h) = \mbox{E}[(Z_i(s)-Z_i(s+h))(Z_j(s)-Z_j(s+h))]\] and for non-collocated data as \[\gamma_{ij}(h) = \mbox{E}[(Z_i(s)-m_i)(Z_j(s+h)-m_j)]\] with \(m_i\) and \(m_j\) the means of the respective variables. Sample cross variograms are the obvious sums over the available pairs or cross pairs, as in one of \[\hat{\gamma}_{jk}(\tilde{h})=\frac{1}{N_h}\sum_{i=1}^{N_h}(Z_j(s_i)-Z_j(s_i+h))(Z_k(s_i)-Z_k(s_i+h))\] \[\hat{\gamma}_{jk}(\tilde{h})=\frac{1}{N_h}\sum_{i=1}^{N_h}(Z_j(s_i)-m_j)(Z_k(s_i+h)-m_k)\]
Two classes of permissible cross covariance (semivariance) functions are often used: intrinsic correlation, and the linear model of coregionalization (the latter is fitted below with fit.lmc).
As multivariable analysis may involve numerous variables, we need to start organising the available information. For that reason, we collect all the observation data specifications in a gstat object, created by the function gstat. This function does nothing other than order (and actually, copy) the information needed later into a single object.
Consider the following definitions of four heavy metals:
library(sp)
data(meuse)
coordinates(meuse) = ~x+y
library(gstat)
g <- gstat(NULL, "logCd", log(cadmium)~1, meuse)
g <- gstat(g, "logCu", log(copper)~1, meuse)
g <- gstat(g, "logPb", log(lead)~1, meuse)
g <- gstat(g, "logZn", log(zinc)~1, meuse)
g
## data:
## logCd : formula = log(cadmium)`~`1 ; data dim = 155 x 12
## logCu : formula = log(copper)`~`1 ; data dim = 155 x 12
## logPb : formula = log(lead)`~`1 ; data dim = 155 x 12
## logZn : formula = log(zinc)`~`1 ; data dim = 155 x 12
vm <- variogram(g)
vm.fit <- fit.lmc(vm, g, vgm(1, "Sph", 800, 1))
plot(vm, vm.fit)
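With the fitted coregionalization model, cokriging predictions (and prediction error variances and cross covariances) for all four variables are obtained with predict. A sketch, assuming meuse.grid as prediction grid:
data(meuse.grid)
coordinates(meuse.grid) = ~x+y
gridded(meuse.grid) = TRUE
cok = predict(vm.fit, meuse.grid) # joint (co)kriging of the four metals
spplot(cok, c("logCd.pred", "logCu.pred", "logPb.pred", "logZn.pred"))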
Cross validation can be used to assess the quality of any interpolation, including kriging. We split the data set into \(n\) parts (folds). For each part, we leave out its observations, predict at their locations using the observations in the remaining parts, and compare the predictions \(\hat{z}(s_i)\) with the observed values \(z(s_i)\), e.g. through the residual \(r(s_i) = z(s_i) - \hat{z}(s_i)\).
In case the interpolation method also yields a prediction error variance \(\sigma^2(s_i)\), we can compute z-scores: \(r(s_i)/\sigma(s_i)\)
The z-score allows the validation of the kriging error, as the z-score should have mean close to zero and variance close to 1. If the variance of the z-score is larger (smaller) than 1, the kriging standard error is underestimating (overestimating) the true interpolation error, on average.
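A sketch of this with gstat's krige.cv, first fitting an illustrative spherical variogram (the initial model parameters below are assumptions):
v.fit = fit.variogram(variogram(log(zinc) ~ 1, meuse), vgm(1, "Sph", 800, 1))
cv = krige.cv(log(zinc) ~ 1, meuse, v.fit, nfold = 5) # 5-fold cross validation
mean(cv$residual) # should be near 0
mean(cv$zscore^2) # should be near 1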
Suppose legislation prescribes remediation in case zinc exceeds 500 ppm. Where does the zinc level exceed 500 ppm?
We can use e.g. the normal distribution (on the log scale) to assess the conditional probability \(\Pr(Z(s_0) > 500 \mid z(s_1),\ldots,z(s_n))\).
The additional assumption underlying this is multivariate normality: in addition to having a stationary mean and covariance, the field \(Z\) is now assumed to follow a stationary, multivariate normal distribution. This means that any single \(Z(s_i)\) follows a normal distribution, and any pair \(Z(s_i), Z(s_j)\) follows a bivariate normal distribution, with known variances and covariance.
How?
out = krige(log(zinc)~1, meuse, meuse.grid, v.fit) # meuse.grid and v.fit as defined above
## [using ordinary kriging]
out$p500 = 1 - pnorm(log(500), out$var1.pred, sqrt(out$var1.var))
spplot(out["p500"], col.regions = bpy.colors())