Alternative Process for Developing a Tier 2 SSRO

Stantec Consulting Ltd.

September 9, 2013

 

Executive Summary

The current Tier 2 pass/fail process described for in situ contamination in the Alberta remediation guidelines involves a post-remediation eco-toxicological assessment to demonstrate minimal risk to ecological receptors exposed via the soil direct contact pathway. Tier 2 site-specific remedial objectives (SSROs) may be established to assist in site management, but are not accepted for regulatory closure unless supported by additional lines of evidence. The objective of this project was to investigate the feasibility of four statistical approaches that might form the basis of an SSRO derivation procedure acceptable for regulatory closure. The impetus for the project came from conducting a carefully designed and robust ecotoxicological assessment of a site where some soil samples failed to satisfy the Tier 2 pass/fail criteria and some passed. The assessment generated many toxicological data (36 biological endpoints) that could be used to derive an SSRO. The initial challenge was to develop an acceptable approach for the SSRO derivation. A subsequent challenge was to determine if models describing the relationships among pedological characteristics, contaminant concentrations, and biological responses could be used successfully to predict “effects” in soil at other sites when the contaminant concentrations and pedological characteristics were known.

Four approaches were investigated. The first, the GMR approach, entailed calculating the geometric mean of the no-observable-adverse-effects concentration (NOAEC) and the lowest-observable-adverse-effects concentration (LOAEC) for each biological response variable (i.e., measurement endpoint). The distribution of the geometric means was used to determine threshold effect concentrations. The 25th percentile of the distribution of the ranked geometric means would provide fraction-specific remedial objectives for agricultural/residential land uses for these soils that would protect 75% of the “species”, while the 50th percentile would provide the remedial objective for commercial/industrial land uses that would protect 50% of the “species”. Although there are different ways to generate the distributional data, one was selected in consideration of reliability, repeatability, uncertainty, and the degree of conservatism.
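
The GMR calculation described above can be illustrated with a minimal sketch. The endpoint names and the NOAEC/LOAEC values are hypothetical, and the percentile rule (linear interpolation between ranked values) is an assumption; the report notes that several ways of generating the distribution exist and does not specify which was selected.

```python
import math

def geometric_mean(noaec, loaec):
    """Geometric mean of a bounded NOAEC/LOAEC pair (same units, both > 0)."""
    return math.sqrt(noaec * loaec)

def percentile(values, p):
    """p-th percentile (0-100), linear interpolation between ranked values.

    Assumption: the report does not state which percentile rule was used.
    """
    ranked = sorted(values)
    pos = (len(ranked) - 1) * p / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ranked[lo] + (ranked[hi] - ranked[lo]) * (pos - lo)

# Hypothetical (NOAEC, LOAEC) pairs in mg/kg, one per measurement endpoint.
endpoints = {
    "earthworm_survival": (1000.0, 4000.0),
    "collembola_reproduction": (500.0, 2000.0),
    "wheatgrass_shoot_length": (2000.0, 8000.0),
    "wheatgrass_root_mass": (250.0, 1000.0),
    "earthworm_reproduction": (4000.0, 16000.0),
}

gmeans = [geometric_mean(n, l) for n, l in endpoints.values()]

# 25th percentile: agricultural/residential; 50th: commercial/industrial.
agri_residential = percentile(gmeans, 25)
commercial_industrial = percentile(gmeans, 50)
print(agri_residential, commercial_industrial)
```

With more endpoints (the site assessment produced 36), the same two percentiles would be read from a longer ranked list.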

The remedial objectives derived using a distribution of the bounded geometric means for soils contaminated with residual PHCs (F3) were lower than current Tier 1 standards for F3 in soil, despite several of these soils passing a Tier 2 Pass/Fail Assessment. Thus, this approach for developing Tier 2 SSROs was not pursued further because the degree of conservatism in the approach remained high.

Some of the challenges to constructing an alternative Tier 2 process are due to the interactions between the physical and chemical characteristics of the site soils and the biological responses, as well as among the site physical and chemical characteristics themselves. Therefore, a second approach (the data reduction and model averaging, or DRAMA, approach) involved the exploration and critical evaluation of the data using a series of established statistical procedures to assess the relative importance of potential explanatory variables as well as the interaction and potential redundancy between and among site physical and chemical characteristics. The latter was addressed through the creation of synthetic variables using ordination. Correlations between site physical/chemical characteristics, synthetic variables, contaminants and toxicity test responses were explored using a suite of ecotoxicologically plausible model structures. Due to the nature of the exposure data, mixed effects models were used to account for subsample variation, and the non-Gaussian distribution of many of the responses was addressed using generalized linear models. Rather than select a single “best” model, contributions from individual models were combined into a single model using model averaging. The advantages of this approach were that undue reliance was not placed on a single model and that the true model uncertainty was acknowledged rather than ignored.

For the DRAMA approach, the dataset was explored and redundancy, as measured by covariation among the non-PHC variables, was examined using principal components analysis of the correlation matrix. Some variables were excluded and some retained. The matrix of retained explanatory variables was multiplied by the ordination eigenvectors to create site-specific scores for each principal component. The scores were considered for use as synthetic explanatory variables in lieu of some directly measured explanatory variables. The first, second and third principal components of the non-PHC variables measured consistently across three different studies comprising the preliminary dataset represented 85% of the total variability in the dataset. The ordination suggested using only PC1 scores as a heuristic for soil particle size and concomitants such as nutrients, with the additional individual variables pH, soil moisture and clay as potential explanatory variables in subsequent modeling. Models were constructed for 17 toxicological endpoints. Three sets of additive linear models were considered, allowing for site effects, for site and variable interactions, and for soil physico-chemical variables only. The model structures were driven by the following questions:

  1. What is the relative importance of contaminant and non-contaminant heuristics as descriptors of toxicity, i.e., the biological responses?
  2. In addition to the soil texture heuristic (PC1), clay, pH and moisture were flagged as major sources of variability in the studies examined. Are these three variables important descriptors of toxicity?
  3. Does the relative importance of PHCs, clay and non-contaminant heuristics vary by study?
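
The ordination step that produced the PC1 heuristic (principal components of the correlation matrix, with scores obtained by multiplying the variable matrix by an eigenvector) can be sketched as follows. This is an illustration only: the soil values are hypothetical, the variables are assumed to be standardized before multiplication, and power iteration is used here simply to avoid external dependencies, so only the first component is extracted. A linear algebra library would normally be used instead.

```python
import math

def standardize(rows):
    """Column-wise z-scores (mean 0, sample SD 1)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix computed from standardized rows."""
    n, m = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)]
            for i in range(m)]

def leading_component(mat, iterations=200):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    m = len(mat)
    v = [float(j + 1) for j in range(m)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(mat[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(mat[i][j] * v[j] for j in range(m)) for i in range(m)]
    return sum(a * b for a, b in zip(v, mv)), v

# Hypothetical soils; columns: sand (%), silt (%), organic carbon (%).
soils = [
    [70.0, 20.0, 1.0],
    [60.0, 25.0, 1.5],
    [50.0, 30.0, 1.8],
    [40.0, 35.0, 2.6],
    [30.0, 42.0, 3.1],
    [20.0, 48.0, 3.9],
]

z = standardize(soils)
corr = correlation_matrix(z)
eigenvalue, loadings = leading_component(corr)
explained = eigenvalue / len(corr)  # share of total variance carried by PC1
pc1_scores = [sum(a * b for a, b in zip(row, loadings)) for row in z]
```

Each entry of `pc1_scores` is one synthetic value per soil that could stand in for the correlated texture and nutrient variables in later models.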

The relative importance of variables was assessed using model-averaged coefficients and values for Wald tests of significance for each parameter within a model. The relative importance of variables was also assessed by summing the AICc (second-order Akaike’s Information Criterion for small sample sizes) weights of the models in which a parameter occurs.

The parameters of the models were averaged by weighting the parameters of multiple models fit to the same data using the model-specific AICc, rather than by shrinkage estimators, which include zeros for parameters in models where the parameter does not appear. This latter procedure was avoided because the degree of shrinkage is a function of the model structures considered for model averaging. Unconditional variance estimates were used in Wald tests (H0: parameter = 0) of model-averaged parameter estimates. Relative variable importance was measured as the sum of AICc weights over all models including the explanatory variable.
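
A minimal sketch of this averaging scheme is given below. The log-likelihoods, parameter counts, coefficients and variances are hypothetical, and the unconditional standard error uses the commonly cited Burnham and Anderson form, which is an assumption since the report does not state the exact estimator. Averaging is restricted to models that contain the parameter (no zeros are substituted), matching the non-shrinkage choice described above.

```python
import math

def aicc(log_lik, k, n):
    """Second-order AIC; k = number of estimated parameters, n = sample size."""
    return -2.0 * log_lik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

def akaike_weights(scores):
    """Convert AICc values to model weights that sum to 1."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]

def average_parameter(models, weights, name):
    """Average over models containing `name`; return (estimate, SE, importance)."""
    subset = [(w, m["coef"][name], m["var"][name])
              for m, w in zip(models, weights) if name in m["coef"]]
    importance = sum(w for w, _, _ in subset)  # sum of AICc weights
    estimate = sum(w * b for w, b, _ in subset) / importance
    # Unconditional SE: within-model variance plus between-model spread.
    std_err = sum(w * math.sqrt(v + (b - estimate) ** 2)
                  for w, b, v in subset) / importance
    return estimate, std_err, importance

# Hypothetical candidate models fit to the same 30 observations.
n_obs = 30
models = [
    {"log_lik": -40.0, "k": 4, "coef": {"F3": -0.8, "PC1": 0.3},
     "var": {"F3": 0.04, "PC1": 0.02}},
    {"log_lik": -41.0, "k": 3, "coef": {"F3": -0.9}, "var": {"F3": 0.05}},
    {"log_lik": -45.0, "k": 3, "coef": {"PC1": 0.4}, "var": {"PC1": 0.03}},
]

weights = akaike_weights([aicc(m["log_lik"], m["k"], n_obs) for m in models])
f3_est, f3_se, f3_importance = average_parameter(models, weights, "F3")
pc1_est, pc1_se, pc1_importance = average_parameter(models, weights, "PC1")
f3_wald_z = f3_est / f3_se  # Wald statistic for H0: parameter = 0
```

In this made-up example the F3 term appears in the two best-supported models, so its summed weight is higher than that of PC1.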

Because the studies used for this feasibility assessment were conducted for other purposes, test species changed from study to study and the pedological variables that were measured were inconsistent among studies. This reduced the number of pedological variables and test species that could be evaluated. Generally, it was concluded that the dataset would benefit from improvement to the study and experimental design.

A third approach, the Partial Least Squares (PLS) Regression approach, was investigated because it had been used previously with success to investigate large “noisy” datasets in order to identify and assess the existing signals. Models predicting various ecotoxicological endpoints resulting from exposure of ecological receptors to contaminated soil were developed by combining multiple predictors in the same model through the application of multivariate statistical methods. Since multivariate statistics consider many variables simultaneously, they can detect meaningful trends that might not be identified by traditional univariate analyses. PLS regression is a multivariate statistical method that was used to model the relationship between a multivariate predictor matrix (X) and a response matrix (Y), which could include either single or multiple responses. Analogous to simple linear regression models, PLS provides an assessment of the strength of the relationship between X and Y (i.e., the percent of variation in Y that can be explained in terms of the variation of X), and can also be used as a foundation for predicting the “Y values” of future unknown observations based on their known X data (which can be measured). By using soil physico-chemical properties, non-exhaustive chemical extraction results, and measured bioaccumulation values in the X matrix and either individual toxicity endpoints or a matrix of multiple toxicological endpoints as the Y matrix, a multivariate model was constructed that is capable of predicting the relative toxicity of various soils to key ecological receptors based on purely physico-chemical measurements. The SSROs for a site could then be derived based on the distribution of the predicted relative toxicities.

The matrix analyses, comprising data from only one site with PHC and metals contamination, used pedological characteristics as the ‘X’ matrix of multiple predictors and each of the biological responses (endpoints) as a separate ‘Y’ matrix. PLS models were cross-validated using leave-one-class-out cross-validation (LOCOCV), and the number of components that maximized the internally cross-validated R2Y value (reported as Q2Y) was selected as the number of components for each final PLS model. For each PLS model, the explained variation of X and Y (R2X and R2Y) was reported to indicate how well the model fit the training data, and Q2Y was reported as a preliminary measure of the predictive ability of the model. In addition, the significance of each PLS model was estimated through response permutation testing. Statistical significance was assessed at α ≤ 0.05.
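
The cross-validation idea can be sketched with a deliberately small example: a one-component PLS fit for a single response, evaluated by leaving out one class (here, one hypothetical sampling location) at a time. The data, the class labels, the restriction to one component and the use of mean-centering without unit-variance scaling are all assumptions made for brevity. Q2Y is computed here as 1 - PRESS/SS_total, a common definition that the report does not spell out.

```python
import math

def fit_pls1(X, y):
    """One-component PLS for a single response, on mean-centered data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[r[j] - x_mean[j] for j in range(m)] for r in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)   # y loading
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}

def predict(model, x):
    """Predict y for one new observation from its X values."""
    t = sum((x[j] - model["x_mean"][j]) * model["w"][j] for j in range(len(x)))
    return model["y_mean"] + model["q"] * t

def q2_leave_one_class_out(X, y, classes):
    """Refit without each class in turn and accumulate prediction error."""
    press = 0.0
    for held_out in sorted(set(classes)):
        train = [i for i, c in enumerate(classes) if c != held_out]
        test = [i for i, c in enumerate(classes) if c == held_out]
        model = fit_pls1([X[i] for i in train], [y[i] for i in train])
        press += sum((y[i] - predict(model, X[i])) ** 2 for i in test)
    y_mean = sum(y) / len(y)
    ss_total = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_total

# Hypothetical data; X columns: log10 F3 (mg/kg), clay fraction.
X = [[2.0, 0.20], [2.0, 0.22], [2.5, 0.25], [2.5, 0.24],
     [3.0, 0.30], [3.0, 0.31], [3.5, 0.35], [3.5, 0.36]]
y = [90.0, 88.0, 75.0, 72.0, 55.0, 58.0, 40.0, 38.0]  # e.g. % of control
classes = ["A", "A", "B", "B", "C", "C", "D", "D"]

q2 = q2_leave_one_class_out(X, y, classes)
```

Holding out whole classes rather than single observations keeps replicate subsamples of the same soil from appearing in both the training and the test sets.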

Significant models were created for several endpoints; however, the significance of a number of these models was fairly low. That said, it was clear that the non-contaminant variables were as important as, or more important than, the contaminant variables as explanatory variables, and that PHCs were important explanatory variables in all of these models, which corroborated the results of the DRAMA and SEM approaches.

The results of the PLS approach demonstrated that it was possible to link multivariate soil properties to certain ecotoxicity endpoints. However, the analyses also highlighted that the predictive power of these models is likely to be inadequate for soils with properties that vary substantially from those of the soils used to build the initial model. The small sample size might be responsible. Predictive power of the models might be improved by increasing the number of site soils in the model-building exercise, and model averaging might strengthen the cross-site predictive applicability.

Structural Equation Modeling (SEM) was the fourth approach investigated as a potential alternative Tier 2 process. Data for multiple species and endpoints from toxicity tests were incorporated into a single analysis through use of a latent variable, and EC/IC25 and EC/IC50 values were subsequently estimated; models were developed for predicting toxicological responses that incorporated both contaminant levels and environmental covariates. The investigation focused on the relationships among endpoint responses, as represented by an “aggregate response” latent variable, and on constructing structural models to describe the causal relationships among the non-contaminant and contaminant variables in the model. Once the models had been constructed on the basis of these relationships, covariate models were derived in order to predict “effects” or “impacts” to ecological receptors for sites for which toxicity data were either minimal or lacking. Cross-site models were investigated with the intent to implement them as predictive tools.

SEM has two components, the measurement model and the structural model. The structural model consists of the paths between variables, while the measurement model consists of a latent variable and its associated observed indicator variable(s). Latent variable modeling has two major advantages: 1) it can be used to estimate the general species response across a range of toxicant concentrations; and 2) it allows estimates of measurement error to be incorporated into the model and measurement error is implicitly included as imperfect correlations among the indicator variables. Measurement error is rarely explicitly considered, yet is nearly always present to some extent in data. In the model fitting process unacknowledged measurement error can cause problems in the estimation of path coefficients. For example, if measurement error is present in an explanatory variable, the residual error variance will contain both prediction error and measurement error, and as a result the true strength of the relationship between the response and explanatory variables will be underestimated. This underestimation of the true strength of the relationship can cause a downward bias in both the unstandardized and standardized estimates of path coefficients in the structural model. The structural (path) model describes the causal relationships among the variables in a model. The structural model consists of either the paths between latent variables or, in an observed variable model, direct relationships among observed variables.
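
The downward bias caused by unacknowledged measurement error in an explanatory variable can be illustrated with a small simulation. This is an ordinary least-squares illustration with made-up values, not an SEM fit; it only shows the attenuation that latent variable modeling is intended to address.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000
true_slope = -2.0

# True explanatory variable (variance 1) and a response that depends on it.
x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [true_slope * x + rng.gauss(0.0, 0.5) for x in x_true]

# Observed explanatory variable = truth + measurement error (variance 1).
x_observed = [x + rng.gauss(0.0, 1.0) for x in x_true]

slope_clean = ols_slope(x_true, y)      # close to the true slope
slope_noisy = ols_slope(x_observed, y)  # shrunk toward zero

# Expected shrinkage: var(true) / (var(true) + var(error)) = 0.5 here.
print(round(slope_clean, 2), round(slope_noisy, 2))
```

With equal true and error variances, the slope estimated from the error-laden variable is expected to be roughly half the true value in magnitude.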

The SEM approach showed promise; most of the challenges encountered were related to the size of the preliminary dataset. This was not surprising given the “data hungry” nature of the approach. The major outcome of this investigation was the prospective use of confirmatory factor analysis to aggregate multiple endpoints into a single latent variable that could then be incorporated into standard non-linear modeling procedures to estimate IC25 values. This provides a direct solution to the problem of reconciling divergent ICp estimates from individual endpoints. In particular, the confirmatory factor analysis was uniquely able to identify endpoints that may not be responding in the same manner as the majority (variables with weak and/or non-significant loadings on the latent variable). With this approach the toxicologist can determine with confidence whether all of the endpoints are providing equivalent information and, if so, develop a single IC25 estimate from the latent variable using standard procedures.

The overall goal of this project was to develop analytical methods that could incorporate environmental covariates into analyses of toxicological responses and to develop cross-site predictive models that could be used to estimate provisional remediation targets based on readily measured environmental variables. Models were constructed that linked an aggregate species response variable based on two earthworm endpoints, two collembolan endpoints, and four northern wheatgrass endpoints to toxicant concentrations and measures of soil quality. These cross-site models are promising, but not ready for implementation in a predictive mode. The models successfully explained the aggregate species responses (R² > 0.7), but failed many tests of model adequacy (significant χ², low CFI, etc.). A small sample size relative to the complexity of the models was a major impediment to the implementation of these models.

Recommendations to improve the validity of three of the Tier 2 alternative approaches were made. Recommendations for standardization of the choice of toxicity endpoints and environmental co-variates were made that would improve the utility of these data for these types of assessments. Modifications were recommended to optimize sampling designs to improve the utility of the predictive models.

# 09-9193-50