# Comments on “Reconstructing the NH mean temperature: Can underestimation of trends and variability be avoided?”

Martin P. Tingley∗

National Center for Atmospheric Research, Boulder, Colorado

Bo Li

Department of Statistics, Purdue University, West Lafayette, Indiana

∗ Corresponding author address: Martin P. Tingley, National Center for Atmospheric Research, 1850 Table Mesa Drive, Boulder, CO 80305 and Department of Earth and Planetary Sciences, Harvard University, 12 Oxford Street, Cambridge, MA 02138.

E-mail: tingley@fas.harvard.edu

ABSTRACT

In a recent paper, Bo Christiansen presents and discusses ‘LOC,’ a methodology for reconstructing past climate that is based on local regressions between climate proxy time series and instrumental time series (Christiansen 2010, hereafter C2010). LOC respects two important scientific facts about proxy data which are often overlooked: 1) many proxies are likely influenced by strictly local temperature, and 2) to reflect causality, the proxies must be written as functions of climate, not vice versa. There are, however, several shortcomings to the LOC method: uncertainty is not propagated through the multiple stages of the analysis, the effects of observational errors in the instrumental observations are not considered, and as the proxies become uninformative of climate, the variance of a reconstruction produced by LOC becomes unbounded, a result which is clearly unphysical.

This comment interprets the LOC method in the context of recently proposed Bayesian hierarchical reconstruction methods, and describes how LOC can be derived as a frequentist implementation of a special case of two previously published methods: that described in Li et al. (2010) and the BARCAST approach of Tingley and Huybers (2010a). Recasting LOC within a Bayesian hierarchical framework allows for an inference scheme that propagates uncertainty through the multiple stages of the model. In addition, prior information about the target temperature process can be included in the hierarchical model, and doing so ensures that the variance of the reconstruction remains bounded in the limit as the proxies become uninformative of the target process.

## 1. Introduction

The key modeling assumption subsumed under the LOC heading is

$$P_{ij} = \alpha_i + \lambda_i \cdot T_{ij} + \epsilon_{ij}, \tag{1}$$

where $P_{ij}$ and $T_{ij}$ are the proxy and true temperature values at location $i$ and time $j$, and the errors $\epsilon_{ij}$ are assumed to be independent and identically distributed (IID) normal for each location, with variance $\sigma_i$. In other words, there is a linear relation between each proxy series and the local temperature series, with the latter treated as the independent variable.

LOC makes the additional assumption that the instrumental series are free of observational errors.

In terms of inference tools, the parameters $\alpha_i$ and $\lambda_i$ are inferred using ordinary least squares regression over a calibration interval. The resulting maximum likelihood estimates of the parameters are then used in the second stage of the analysis to impute past temperatures via [cf. Eq. (A6) from C2010]

$$\hat{T}_{ij} = \frac{P_{ij} - \hat{\alpha}_i}{\hat{\lambda}_i}. \tag{2}$$

In the third and final stage of the analysis, the inferred local temperature series are combined into a regional or global average by taking a weighted mean of the reconstructed time series, with weights proportional to the cosines of the latitudes of the proxies.
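The three stages described above can be sketched in a few lines of NumPy. This is only an illustrative outline, not the C2010 implementation: the function name `loc_reconstruct` and its arguments (`temp_calib`, `calib_idx`, `lats_deg`) are hypothetical, and the sketch assumes that every proxy has a co-located instrumental series over a common calibration window.

```python
import numpy as np

def loc_reconstruct(proxy, temp_calib, calib_idx, lats_deg):
    """Illustrative LOC-style reconstruction.

    proxy      : (n_sites, n_times) proxy series
    temp_calib : (n_sites, n_calib) instrumental temperatures (assumed error free)
    calib_idx  : indices of the calibration times within the proxy record
    lats_deg   : (n_sites,) proxy latitudes in degrees
    """
    proxy = np.asarray(proxy, dtype=float)
    temp_calib = np.asarray(temp_calib, dtype=float)
    n_sites, n_times = proxy.shape
    local = np.empty((n_sites, n_times))
    for i in range(n_sites):
        # Stage 1: OLS regression of the proxy on local temperature (Eq. 1).
        lam_hat, alpha_hat = np.polyfit(temp_calib[i], proxy[i, calib_idx], deg=1)
        # Stage 2: invert the fitted relation to impute temperature at all times.
        local[i] = (proxy[i] - alpha_hat) / lam_hat
    # Stage 3: weighted mean with weights proportional to cos(latitude).
    w = np.cos(np.deg2rad(np.asarray(lats_deg, dtype=float)))
    mean_series = (w[:, None] * local).sum(axis=0) / w.sum()
    return local, mean_series
```

Note that nothing in this outline carries the sampling error of `lam_hat` and `alpha_hat` forward into `mean_series`, and the weights ignore how well each proxy tracks temperature.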

In this comment, we describe a number of shortcomings to the LOC method (Section 2), interpret LOC within a Bayesian framework and describe how it can be derived as a reduced form of two previously published methods (Section 3), discuss the variance of the reconstructed series (Section 4) and the influence of measurement error (Section 5), and end with a few concluding remarks (Section 6).

## 2. Shortcomings of LOC

The most obvious shortcoming of the LOC method is the behavior of the local reconstructions in the limiting case of the proxy series being independent of the corresponding true temperature series. As the proxy series becomes less informative of the temperature

is well defined from a mathematical standpoint, but the variance of the predictions will still be inflated, and the temperature reconstructions unreliable. Bayesian inference (discussed below) provides a formalism for regularizing the inference and avoiding unphysical variance inflation in the presence of weakly informative predictors.
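A short calculation illustrates where the inflation comes from. As a simplifying assumption made only for this sketch, take $\alpha_i$ and $\lambda_i$ as known exactly (LOC estimates them, which adds further variability), write the LOC prediction as the inversion of Eq. (1), and keep the notation of the text, in which $\sigma_i$ denotes the error variance:

```latex
\begin{align}
\hat{T}_{ij} &= \frac{P_{ij} - \alpha_i}{\lambda_i}
              = T_{ij} + \frac{\epsilon_{ij}}{\lambda_i}, \\
\operatorname{Var}\!\left(\hat{T}_{ij} - T_{ij}\right)
             &= \frac{\sigma_i}{\lambda_i^{2}}
              \longrightarrow \infty
              \quad \text{as } \lambda_i \to 0 .
\end{align}
```

The reconstruction error variance therefore scales with the inverse square of the slope, so a proxy that responds only weakly to temperature yields an arbitrarily noisy local reconstruction.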

Another shortcoming of the LOC method concerns the treatment of measurement errors. It is well known that when both the predictor and response variables in a regression are subject to measurement errors, inference on the model parameters is ill defined without additional information on the variance of the error terms (e.g., Fuller 1987). If such information is unavailable, it is possible to establish bounds on model parameters, but precise parameter estimation is not possible. The indirect regression underlying LOC bypasses the estimation of the measurement errors in the proxy and instrumental observations, and the resulting $1/\hat{\lambda}_i$ is in fact an estimate of the upper bound of $\beta_i$, rather than of $\beta_i$ itself, in the model $T_{ij} = \gamma_i + \beta_i P_{ij} + \epsilon_{ij}$ with $P_{ij}$ contaminated by errors of unknown variance. This upper bound corresponds to the case when the $P_{ij}$ are observed subject to measurement errors, but the observations of $T_{ij}$ are free of such errors (Fuller 1987, Ch. 1.1.3). If the $T_{ij}$

biased high, and the variance of the reconstruction artificially inflated. Given that the instrumental temperature observations are contaminated with substantial noise (Brohan et al. 2006), inverse regression must be used cautiously to avoid overcorrecting the attenuation

be mitigated by specifying a model that explicitly takes into account measurement errors in both Pij and Tij, or by specifying a parametric model for the target process within a hierarchical model; both of these possibilities are discussed below.
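The bounding behavior described above can be checked with a small synthetic experiment. The sketch below is an illustration under assumed settings rather than an analysis of real proxy data: the Gaussian series, the noise levels, the true slope `beta`, and the helper name `slope_bounds` are all hypothetical choices. It compares the direct regression slope of $T$ on $P$ with the inverse estimate $1/\hat{\lambda}$ obtained by regressing $P$ on $T$.

```python
import numpy as np

def slope_bounds(p, t):
    """Return (direct, inverse) estimates of beta in t = gamma + beta * p."""
    c = np.cov(p, t)               # 2x2 sample covariance matrix of (p, t)
    direct = c[0, 1] / c[0, 0]     # OLS of t on p: attenuated by noise in p
    inverse = c[1, 1] / c[0, 1]    # 1 / lambda_hat from OLS of p on t
    return direct, inverse

rng = np.random.default_rng(1)
n, beta = 200_000, 0.8
p_true = rng.normal(size=n)
t_true = beta * p_true                           # exact relation, error-free series
p_obs = p_true + rng.normal(scale=0.7, size=n)   # proxy observed with error
t_obs = t_true + rng.normal(scale=0.5, size=n)   # instrumental series observed with error

# Errors in both series: the two estimates bracket the true slope.
direct, inverse = slope_bounds(p_obs, t_obs)

# Errors in the proxy only: the inverse estimate recovers the true slope.
_, inverse_clean_t = slope_bounds(p_obs, t_true)

print(f"direct={direct:.3f}  true={beta:.3f}  inverse={inverse:.3f}")
print(f"inverse with error-free T: {inverse_clean_t:.3f}")
```

In this setup the inverse estimate matches the true slope only when the temperature series is error free. Once noise is added to the temperatures it overshoots, which is the overcorrection the text warns about.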

The methodology used to combine the local reconstructions into a regional or hemispheric average is likewise unsatisfying. The fact that certain proxy series may have weaker relationships with local temperatures than others, and thus that the inferred temperatures at those locations are less reliable, is not taken into account. In addition, the spatial covariance of the underlying temperature process is not exploited (cf. Tingley and Huybers 2010a).

These two issues can both be resolved via hierarchical modeling, as discussed below. Given that C2010 implicitly assumes that the local reconstructed temperatures at a given year are independent of one another (no spatial covariance), and that the target of the analysis is to infer Northern Hemisphere (NH) mean temperature, the advantage of first reconstructing local temperatures is unclear. As an alternative, C2010 could use each proxy series to directly reconstruct the hemispheric average temperature, and then take the average over these reconstructions (cf. Li et al. 2010).

A final concern with LOC is that the uncertainty introduced at the various stages of the analysis is not propagated through to the final estimate of the NH mean temperature. Indeed, C2010 provides no estimates of the uncertainty in the reconstructed temperature series (regional or local) and the word ‘uncertainty’ does not appear anywhere in the text. Given the increased focus on uncertainty quantification throughout the climate sciences (e.g., Solomon et al. 2007), a method that does not produce internally consistent uncertainty estimates is not satisfying. Bayesian inference (discussed below) is the obvious tool to rectify this lack of uncertainty quantification.

## 3. LOC in a Bayesian Framework

Many of the weaknesses in the LOC method can be avoided by placing the key underlying assumptions into a Bayesian framework. Indeed, the indirect regression that forms the basis of the LOC method [Eq. (1)] has the form of a linear data-level model in a Bayesian hierarchical model. From a Bayesian perspective, the local temperature prediction, conditional on the proxies and model parameters, takes the form

$$[T_{ij} \mid P_{ij}, \lambda_i, \alpha_i, \sigma_i] \propto [P_{ij} \mid T_{ij}, \lambda_i, \alpha_i, \sigma_i] \cdot [T_{ij}], \tag{3}$$

where $[A]$ denotes the probability density function (PDF) of the random variable $A$ and $[A \mid B]$ the distribution of the random variable $A$ conditional on the random variable $B$. Note that the distribution for $P_{ij} \mid T_{ij}, \lambda_i, \alpha_i, \sigma_i$ is equivalent to the form of Eq. (1): the indirect regression is the natural linear data-level model within the Bayesian framework. As discussed below, even a weakly informative prior for $[T_{ij}]$ regularizes the inference and thus avoids the problem of unbounded variance in the presence of uninformative proxies. It is worth emphasizing that Bayesian models treat temperature as a random variable rather than an unknown, but fixed, quantity.
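The regularizing effect of a prior on $[T_{ij}]$ can be made concrete with a conjugate example. The sketch below assumes, purely for illustration, a normal prior on the temperature with mean `prior_mean` and variance `prior_var`. The function name and arguments are hypothetical, and `noise_var` plays the role of the error variance in Eq. (1). Under these assumptions the posterior for the temperature is normal, with a variance that can never exceed the prior variance.

```python
import numpy as np

def temperature_posterior(p, alpha, lam, noise_var, prior_mean, prior_var):
    """Posterior mean and variance of T given P, for the linear data-level
    model of Eq. (1) combined with an assumed normal prior on T."""
    # Precisions add: prior precision plus the information carried by the proxy.
    precision = 1.0 / prior_var + lam**2 / noise_var
    mean = (prior_mean / prior_var + lam * (p - alpha) / noise_var) / precision
    return mean, 1.0 / precision

# As the slope shrinks, the posterior falls back on the prior instead of diverging.
lams = np.array([0.0, 0.1, 1.0, 10.0])
means, variances = temperature_posterior(2.0, 0.0, lams, 1.0, 0.0, 1.0)
print(variances)
```

With a slope of zero the posterior equals the prior. This is the bounded behavior that the LOC inversion lacks.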

Estimation of model parameters in LOC is conducted using co-located instrumental and proxy observations during a calibration interval. The calibration stage can be expressed as a Bayesian regression,

$$[\lambda_i, \alpha_i, \sigma_i \mid P_{i\cdot}, T_{i\cdot}] \propto [P_{i\cdot} \mid T_{i\cdot}, \lambda_i, \alpha_i, \sigma_i] \cdot [\lambda_i, \alpha_i, \sigma_i], \tag{4}$$

where a subscripted dot refers to all time points in the calibration interval. The likelihood, $[P_{i\cdot} \mid T_{i\cdot}, \lambda_i, \alpha_i, \sigma_i]$, is a product of normals of the form in Eq. (3), while the prior for the model parameters, $[\lambda_i, \alpha_i, \sigma_i]$, may be more or less informative. See standard texts (e.g., Gelman et al. 2003) for details on Bayesian regression models.

The key advantage of putting LOC into a Bayesian framework is that a draw from the posterior of the parameters in Eq. (4) can be plugged into Eq. (3), which results in a draw of the temperature at each time and location for which there is no instrumental observation. The end result is an estimate of the posterior distribution of the temperature at these times and locations. Likewise, draws from various locations can be averaged at each year to produce a posterior distribution of the global or regional mean time series. Posterior draws of the local temperature time series are less variable at locations where the corresponding proxy series have strong relationships with local temperature (small $\sigma_i$). As a result, the mean of the posterior distribution of the global or regional time series is more influenced (regardless of the weights used to calculate the regional mean) by those proxies with strong relationships with local temperature. The Bayesian inference scheme provides a framework for propagating uncertainty through the analysis in an internally consistent manner, as uncertainty in the estimation of regression parameters is included in the final estimate of uncertainty in the global or regional mean time series. In contrast, uncertainty introduced at numerous stages of the LOC method is not propagated, and the resulting reconstructions (Figs. 3, 5, 7, 9, 10, 11 from C2010) include no estimates of uncertainty.
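The two-step sampling scheme described above can be sketched for a single location as follows. This is a minimal illustration under stated assumptions, not the scheme of any published method. It assumes the standard noninformative prior for the regression parameters (density proportional to the inverse of the error variance), for which the posterior draws have the closed form given in standard texts. It also assumes a normal prior on the unobserved temperature. The function name `draw_temperature` and all argument names are hypothetical.

```python
import numpy as np

def draw_temperature(p_calib, t_calib, p_new, prior_mean, prior_var, n_draws, rng):
    """Posterior draws of T at one non-instrumental time, propagating the
    uncertainty in (alpha, lambda, error variance) from the calibration stage."""
    n = len(t_calib)
    X = np.column_stack([np.ones(n), t_calib])    # design matrix for Eq. (1)
    xtx_inv = np.linalg.inv(X.T @ X)
    coef_hat = xtx_inv @ X.T @ p_calib            # OLS estimate of (alpha, lambda)
    resid = p_calib - X @ coef_hat
    s2 = resid @ resid / (n - 2)

    # Calibration stage, Eq. (4): error variance from a scaled inverse
    # chi-square, then (alpha, lambda) from a normal given that variance.
    noise_var = (n - 2) * s2 / rng.chisquare(n - 2, size=n_draws)
    chol = np.linalg.cholesky(xtx_inv)
    z = rng.standard_normal((n_draws, 2))
    coef = coef_hat + np.sqrt(noise_var)[:, None] * (z @ chol.T)
    alpha, lam = coef[:, 0], coef[:, 1]

    # Prediction stage, Eq. (3): conjugate normal update of the prior on T,
    # carried out separately for every parameter draw.
    post_var = 1.0 / (1.0 / prior_var + lam**2 / noise_var)
    post_mean = post_var * (prior_mean / prior_var + lam * (p_new - alpha) / noise_var)
    return post_mean + np.sqrt(post_var) * rng.standard_normal(n_draws)
```

Repeating this for each location and year, and averaging the draws across locations within each posterior sample, would give draws of the regional mean whose spread reflects both proxy noise and parameter uncertainty.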

The Bayesian interpretation of LOC can be understood as a special case of two previously published methods for reconstructing past climate.

### a. LOC as a special case of Li et al. (2010)

Li et al. (2010) describe a method for reconstructing the NH mean temperature time series using both proxy time series and estimates of solar, volcanic, and greenhouse gas forcings. The three representative proxies considered are tree rings, pollen assemblages and borehole temperature profiles, which are modeled as reflecting temperature variability at three different temporal scales. A forward model for each type of proxy links the true temperatures to the proxy observations, and a second order autoregressive [AR(2)] model is assumed for the errors.

The indirect regression model in LOC is essentially a special case of the data models [Eqs. (4.1)-(4.3)] in Li et al. (2010). More specifically, if we consider only one type of proxy, say, the tree rings in Eq. (4.1) of Li et al. (2010), replacing $M_D$ with the identity matrix and the AR(2) errors $\epsilon_{iD}$ with IID normal random errors results in the LOC model [Eq. (1)], with the caveat that Li et al. (2010) calibrate using NH mean temperatures rather than local temperatures. While both LOC and Li et al. (2010) explicitly model the errors in the proxies, LOC does not consider the effects of errors in the model parameter estimates. In addition, LOC does not include a model for the temperature processes or include priors, both of which provide regularization when the proxies' signal-to-noise ratio is low.

The Bayesian model of Li et al. (2010) explicitly accounts for the fact that different proxies reflect temperature at different time scales, includes a simple energy balance model at the process level, and specifies a more flexible error structure that allows for temporal correlation. In contrast to LOC, the Bayesian model of Li et al. (2010) propagates uncertainty introduced at each level of the hierarchy, and quantifies the uncertainty in the final estimate of the NH mean temperature time series.

### b. LOC as a special case of BARCAST

The BARCAST approach to reconstructing past climate, described in Tingley and Huybers (2010a), is based on the data-level assumption that each proxy observation reflects strictly local (in space and time) information about the target process. Each type of proxy observation is assumed to share a common, linear relation of the form Eq. (1) with the target climate field. The instrumental observations are modeled as the true underlying value of the field perturbed by additive white noise with constant variance. At the process level, the target field is assumed to be a multivariate AR(1) process, with a common AR(1) parameter for all locations. Spatial structure enters through the innovations which drive the AR(1) process, which are assumed to be draws from a mean-zero multivariate normal distribution, with an exponential spatial covariance function (see, for example, Banerjee et al. 2004).
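The process-level structure just described can be simulated directly. The sketch below is not the BARCAST code. The function name `simulate_ar1_field` and its parameters (`sill`, `range_par`, `mean`) are hypothetical, and it makes three simplifying assumptions: Euclidean distances between planar coordinates, a constant field mean, and an initial state drawn from the innovation distribution rather than the stationary one.

```python
import numpy as np

def simulate_ar1_field(coords, n_times, ar_coef, sill, range_par, rng, mean=0.0):
    """Simulate a multivariate AR(1) field whose innovations are mean-zero
    multivariate normal with an exponential spatial covariance."""
    coords = np.asarray(coords, dtype=float)      # shape (n_sites, n_dims)
    # Pairwise Euclidean distances between sites (an assumption of this sketch).
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = sill * np.exp(-dist / range_par)        # exponential covariance function
    chol = np.linalg.cholesky(cov)
    n_sites = coords.shape[0]
    field = np.empty((n_times, n_sites))
    field[0] = mean + chol @ rng.standard_normal(n_sites)
    for t in range(1, n_times):
        innovation = chol @ rng.standard_normal(n_sites)
        # A single AR(1) coefficient is shared by all locations.
        field[t] = mean + ar_coef * (field[t - 1] - mean) + innovation
    return field
```

Nearby sites then share much of their variability while distant sites are nearly independent. This shared spatial information is what a method that treats local reconstructions as independent leaves unused.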

In order to recover LOC from BARCAST, the data level must be altered to specify that each proxy series (as opposed to each proxy type) has a different linear relationship of the form Eq. (1) with the local temperature. In addition, the error variance of the instrumental observations must be set to zero to reflect the fact that LOC does not consider errors in the instrumental temperature observations. In the limit as the priors on the regression parameters and error variances become uninformative, the data level of BARCAST then reduces to the LOC model.