Thursday, April 8, 2010

First post: general protocol for estimating critical threshold values

In our first phone conversation we decided that there has been good progress on data-rich cases, but thus far we really haven't done much with the data-poor cases. The priority tasks I came away with from that conversation were to begin thinking about the data-poor or data-free cases and to begin working on a case study for the book chapter. Afterward, Mark and I spoke and decided that "square one" is really outlining a general procedure which encompasses the whole spectrum of cases from data-free to data-rich. Once that is decided I can work up a data-poor case (we had kicked around the idea of doing desert sand for Glen Canyon).


So here is my first crack at a general procedure. Please let me know what you think, either in the comments section or with posts of your own.

Also, please note for archival purposes I have added a link to the blog that Travis started last year.


General Protocol:

1. Based on past experience, existing knowledge within the group, and a review of the literature, develop a general list of syndromes. For this and the following steps, we should not constrain ourselves only to high-quality peer-reviewed literature; we should also exploit grey literature such as agency reports and unpublished theses and dissertations, of which there are several pertaining to the various regional parks and monuments. Syndromes are general classes of common transitions. For example, the shrub-encroachment and annualization syndromes are transitions widely documented from a variety of ecosystems. The specific shrub or annual species which may be increasing in dominance differ from site to site, but the syndrome is common to many sites.

2. Again based on past experience, existing knowledge within the group, and a review of the literature, determine a list of ecosystem states, which should minimally include a reference state and at least one degraded state. Since the vast majority of ecosites are not pristine, a reference state may never have been observed in the ecosite in question and may be a theoretical construct: a best guess based upon general ecological principles and knowledge of similar ecosystems. At the other extreme, because these ecosites primarily occur in protected parks, a highly degraded state may also be a theoretical construct based upon the same principles and knowledge. The number of states will differ among ecosites, and the level of detail in describing states will also vary with the level of knowledge available.

3. Based on past experience and existing knowledge within the group, determine which syndromes are known or likely to occur in the ecosite in question, and how they may drive or be symptomatic of transitions among states. Construct a state-and-transition model, specifying which transitions are discontinuous (i.e., thresholds) and which are continuous.

4. At this point, the process for calibrating our models, estimating our level of confidence, and making the best possible estimate of the critical values governing threshold behavior diverges depending on data availability. In the data-rich case, existing data are used to determine empirical support for the existence of at least some of the states, to determine whether transitions exhibit continuous or threshold behavior, and possibly to make a rough estimate of cutoffs in threshold dynamics. Data-poor cases, and cases where there simply are no useful data, will rely more heavily or exclusively on surveys of expert opinion.

Data-rich cases

1. Collect all data which might represent characteristics of states (e.g., abundance of dominant species) and key variables related to transitions (e.g., proportional abundance of exotics, soil stability, etc.). Use a multivariate technique (e.g., cluster analysis, MRPP, MANOVA, ANOSIM, PERMANOVA, Mantel tests) to determine whether the data fall into groups consistent with at least some of the states described in the model. Ordination techniques such as PCA or NMDS are useful for visualization. Since some states may never have been observed, or may have been missed in sampling, it is not necessary to validate the existence of all of them. This step may, however, identify states which should be added to the S&T model.

2. Under a random or gradient sampling scenario it should be possible to distinguish linear, gradual transitions from discontinuous thresholds. States which transition continuously into other states should exhibit a degree of overlap, whereas those that follow threshold dynamics should exhibit separation. Targeted sampling of “pure” examples of various states may result in an overestimation of the separation between states. Since p-values depend upon sample size, and sample sizes will vary among datasets, it may be best to use a variance-explained metric to evaluate the separation of clusters.
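As a concrete sketch of the variance-explained idea, here is a minimal Python example using made-up data and a single indicator for simplicity; PERMANOVA and related methods apply the same between/total sum-of-squares logic to multivariate distance matrices:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical plot data: one indicator (say, perennial grass cover) sampled
# across plots assigned to two candidate states.
grass = np.concatenate([rng.normal(40, 5, 30),   # putative reference state
                        rng.normal(12, 5, 30)])  # putative degraded state
state = np.array([0] * 30 + [1] * 30)

def variance_explained(x, groups):
    """Eta-squared: between-group sum of squares over total sum of squares.
    Unlike a p-value, this separation measure is comparable across datasets
    with different sample sizes."""
    grand = x.mean()
    ss_total = ((x - grand) ** 2).sum()
    ss_between = sum(len(x[groups == g]) * (x[groups == g].mean() - grand) ** 2
                     for g in np.unique(groups))
    return ss_between / ss_total

print(f"variance explained by state membership: {variance_explained(grass, state):.2f}")
```

For well-separated states the metric approaches 1; states that grade continuously into one another will leave most of the variance within groups.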

3. Revise the S&T model if needed. If all a priori constructs were validated, we have “very high” confidence in this model. If most a priori constructs were validated, but the data identified some new revisions needing confirmation in future work, we have “high” confidence in the model. If most a priori constructs were either not observable in the data, or observable but in need of revision, we have “moderate” confidence in the model. Since we are dealing with ideas that are not certainties, it is important that we provide some measure of our degree of belief in the model. Alternatively, we may want to describe separate degrees of belief for various components of the model.

4. Finally, whenever possible in cases of apparent threshold behavior, an appropriate univariate model should be applied to determine a first approximation of critical values in indicators closely related to specific syndromes or transitions. For example, suppose a transition between two states is driven by a syndrome of trampling of BSCs and increased mobility of surface soils. If the data contain indicators of BSC development (e.g., cover) and soil stability (e.g., the Herrick soil stability test), these variables can be used as continuous or semi-quantitative predictors of state membership, a binary response. These sorts of questions are well handled by logistic or probit regression (or structural equation modeling with binary responses, if we also wish to model relationships among more than one predictor). The point at which the response curve begins the shift from one state to another identifies our estimate of the critical value. If this is possible in more than one dataset, the average may be used. Alternatively, Bayesian approaches might be useful for integrating prior knowledge and iteratively improving the estimate with new knowledge.
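A minimal sketch of the logistic-regression approach, using simulated (not real) BSC cover data and a hand-rolled fitting routine; in practice any standard statistics package would do the fitting. The candidate critical value is taken, for illustration, as the cover at which the fitted probability of the degraded state crosses 0.5:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data: BSC cover (%) as a predictor of state membership
# (0 = reference, 1 = degraded); low crust cover makes the degraded state likely.
n = 200
bsc_cover = rng.uniform(0, 60, n)
p_degraded = 1 / (1 + np.exp(-(3.0 - 0.2 * bsc_cover)))   # assumed "true" curve
state = rng.binomial(1, p_degraded)

def fit_logistic(x, y, iters=5000, lr=0.5):
    """Fit p(y=1|x) = logistic(b0 + b1*x) by gradient ascent on the
    log-likelihood, standardizing x for stable steps."""
    xm, xs = x.mean(), x.std()
    z = (x - xm) / xs
    a0 = a1 = 0.0
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(a0 + a1 * z)))
        a0 += lr * np.mean(y - p)
        a1 += lr * np.mean((y - p) * z)
    return a0 - a1 * xm / xs, a1 / xs      # back-transform to the original scale

b0, b1 = fit_logistic(bsc_cover, state)
# p = 0.5 where b0 + b1*x = 0: a first approximation of the critical value
critical_cover = -b0 / b1
print(f"estimated critical BSC cover: {critical_cover:.1f}%")
```

The simulation's true crossover is at 15% cover, and the estimate lands nearby; with real data the same fitted curve would give our first empirical estimate of the threshold.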

Data-poor and Data-free cases:

When considering cases where data are sparse or lacking, we must remember that S&T models are merely a method of organizing our hypotheses; they are not “truth.” A model by definition is never “truth”; it merely allows us to approximate truth to varying degrees in a manageable format. While many of our models might not be very detailed or accurate, a model is useful as long as it is more useful than having no model at all. Our goal is simply to do the best job we can with the information we have in order to produce useful models.

1. Compile a list of regional experts who may have ecological knowledge of the ecosites in question, or at least similar ones. It is not necessary for a given expert to have knowledge of all ecosites, or even more than one. It is also not necessary for them to have intimate knowledge of, or long-term experience with, a given ecosite, only some applicable knowledge or experience. Our goal should be at least 3 experts per ecosite; more is even better. Examples for the various Pinyon-Juniper ecosystems: Lisa Floyd-Hanna, Neil Cobb, David Huffman, David Breshears, Kitty Gehring, Tom Whitham, and various NAU graduate students in the Forestry or Biology departments working in Pinyon-Juniper ecosystems. Engage the experts to ascertain their willingness to participate. Try not to overrepresent one particular working group, as its members may have somewhat redundant opinions.

2. Construct the first iteration of an S&T model using our past experience, knowledge, and the literature, as in steps 1-3 of the data-rich protocol.

3. Incorporation of expert opinion will have two phases. In the first phase, willing experts will be asked to view either an oral or written presentation of our draft S&T models. They will also be briefed on the specific ecosite type for which the S&T model has been prepared. They will be asked the following questions:

a. Does the model lack any important states? If so, please describe them and also explain relevant transitions to and from each state.

b. Does the model contain states which should be removed? If so, which ones?

c. Does the model lack key transitions, or represent transitions incorrectly (e.g., a continuous transition is really a threshold transition, or vice-versa)? If so, please describe.

Remembering that we never have complete confidence in a hypothesis until we see the data, but that we are often much more confident about some hypotheses than others prior to data collection, please answer the following questions as well and as honestly as you can:

d. On a scale of 0-100%, please estimate your a priori (pre-data) confidence that an S&T model which takes into account any alterations you proposed above is a good, reasonable representation of dynamics in the ecosite.

e. On the same scale, please estimate your a priori confidence that ANY scientist, yourself included, could propose such a model which is a good, reasonable representation of dynamics in the ecosite.

4. Use the answers to a-c to calibrate and produce a second iteration of the S&T model. Identify key threshold behaviors in the model, the processes involved, and commonly used metrics that might provide some information on the outcomes of those processes. Preferably these are metrics currently in use by the NPS I&M program (e.g., Vital Signs), though in some cases this may not be possible. For example, susceptibility to fire might be related to the connectivity of grass patches, which might in turn be well described by average interspace length.
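As a toy illustration of the interspace-length metric, here is a sketch assuming line-intercept style records of grass-patch start/end positions along a transect (the data format and values are hypothetical):

```python
# Hypothetical line-intercept record: (start, end) positions in meters of
# grass patches along a single transect.
patches = [(0.0, 1.2), (3.5, 4.0), (4.8, 7.1), (9.0, 9.6)]

def mean_interspace(patches):
    """Average gap length between successive grass patches: a simple
    stand-in for grass-patch connectivity."""
    gaps = [nxt_start - prev_end
            for (_, prev_end), (nxt_start, _) in zip(patches, patches[1:])]
    return sum(gaps) / len(gaps)

print(f"mean interspace length: {mean_interspace(patches):.2f} m")
```

For this toy transect the gaps are 2.3, 0.8, and 1.9 m, giving a mean interspace of about 1.67 m; longer mean interspaces would suggest poorer fuel connectivity.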

5. Estimate the overall degree of confidence in the second iteration of the model. Responses to question e will be used to adjust responses to question d, to account for the tendency of some people to be more optimistic or pessimistic than others. For example, if a respondent’s answer to e was 20% less than the mean response to e, their response to d will be up-adjusted proportionally. After adjustment, confidence values from question d will be averaged to estimate overall confidence in the model.

6. Redistribute the second version of the model to the willing experts. This time we will ask them to focus on the key threshold behaviors in the model and the indicators which we feel are most directly related. We will explain the measurement scale of the indicators. Then we will ask them, for each threshold:

aa. To the best of your ability, estimate the value of the indicator beyond which a transition from one state to another is imminent (e.g., if BSC cover drops below 15% there will be insufficient protection from erosion, and a state transition will occur). Your answer should be a value, rather than a qualitative response.

Again we will ask them to assess their confidence with the tandem questions:

bb. On a scale of 0-100%, please estimate your a priori (pre-data) confidence that this value is a good, reasonable estimate of the critical value in this indicator signaling that a state transition is imminent.

cc. On the same scale, please estimate your a priori confidence that ANY scientist, yourself included, could propose a good, reasonable estimate of the critical value in this indicator.

7. We will make degree-of-confidence estimates for each estimate of a threshold value using the same protocol outlined in step 5 above. The actual final estimate for each threshold will be produced using an averaging procedure weighted by the respondents’ adjusted confidence. A weight for each response will be calculated by dividing the adjusted confidence value by the sum of all adjusted confidence values. The value provided by each respondent will be multiplied by the corresponding weight, and all of the resulting products summed. Example: asked how much crust cover signals a threshold beyond which state transition is imminent, respondent 1 answers 20%, with confidence in their own estimate of 70% and confidence in any scientist’s estimate of 70%; respondent 2 answers 50%, with confidence in their own estimate of 15% and confidence in any scientist’s estimate of 50%. First we calculate the mean confidence in any scientist’s estimate (AVE(70, 50) = 60). We find that respondent 1 is 10% more optimistic than the mean, and respondent 2 is 10% more pessimistic. The respondents’ own confidence estimates are adjusted accordingly: respondent 1, 70 – 0.10(70) = 63; respondent 2, 15 + 0.10(15) = 16.5. The mean degree of adjusted confidence for this transition is the average of 63 and 16.5, or 39.75. To calculate the weights for each respondent’s estimate, we divide the adjusted confidence values by the sum of all adjusted confidence values: respondent 1, 63/(63 + 16.5) = 0.79; respondent 2, 16.5/(63 + 16.5) = 0.21. We then compute a weighted average of threshold estimates using these weights: 20(0.79) + 50(0.21) = 26.3. In the end we can say, with a confidence level of about 40% (moderate confidence), that the critical threshold in crust cover is about 26%. As data become available, they should be used to validate the model itself and the threshold estimates to the greatest degree possible, similar to steps 2 and 4 under the data-rich scenario.
This is another scenario where simple Bayesian techniques may be useful for integrating an expert-opinion prior with information from data to obtain a better estimate.
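The step-7 weighting procedure is simple enough to script. Here is a sketch in Python that reproduces the worked example above (note that carrying full precision in the weights gives 26.2 rather than the 26.3 obtained with weights rounded to two decimals):

```python
def weighted_threshold(estimates, own_conf, any_conf):
    """Combine expert threshold estimates (step 7): adjust each expert's
    self-confidence for optimism/pessimism relative to the group, then
    average the estimates weighted by adjusted confidence.
    All confidences are on a 0-100 scale."""
    mean_any = sum(any_conf) / len(any_conf)
    # An expert whose "any scientist" confidence (question cc) sits 10 points
    # above the group mean is treated as 10% optimistic, and their
    # self-confidence is shrunk by that proportion (inflated for pessimists).
    adjusted = [own * (1 - (anyc - mean_any) / 100)
                for own, anyc in zip(own_conf, any_conf)]
    weights = [a / sum(adjusted) for a in adjusted]
    threshold = sum(w * e for w, e in zip(weights, estimates))
    overall_conf = sum(adjusted) / len(adjusted)
    return threshold, overall_conf

# Worked example from step 7: respondent 1 (20%, 70, 70), respondent 2 (50%, 15, 50)
t, c = weighted_threshold([20, 50], [70, 15], [70, 50])
print(round(t, 1), round(c, 2))   # 26.2 (threshold), 39.75 (overall confidence)
```

Scripting the rule this way also makes it trivial to rerun the combination as experts revise their answers, or to extend it to more than two respondents.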

2 comments:

  1. Mark's comments on protocol via email: my only comment is that transitions that appear linear / gradual on the basis of measured structural attributes may be quite nonlinear with respect to unmeasured processes. Just need to keep the process perspective in mind when deriving classification thresholds on the basis of structural attributes commonly measured in assessment and monitoring programs (I'm reminding myself here....).

    My favorite ref re the distinctions and relationships among different 'types' of thresholds is Brandon's paper (Bestelmeyer 2006)

  2. There was an error in the numbers in point 7. It has been corrected.

    previously read "....are adjusted accordingly; respondent 1 (70 – 0.10 (50)) = 63"

    now reads "....are adjusted accordingly; respondent 1 (70 – 0.10 (70)) = 63"

    hopefully it makes more sense.
