CADDIS Volume 3: Examples & Applications

Analytical Examples

Verified Prediction: Predicting Environmental Conditions from Biological Observations


We would like to determine whether observed changes in the macroinvertebrate assemblage composition at a test site in Oregon are consistent with the hypothesis that temperature has increased at the site. That is, if increased temperature is a stressor at the test site, we predict that the temperature inferred from the impaired macroinvertebrate assemblage will be higher than expected. For this example, we establish our expectations for the inferred temperature using a set of regional reference sites.


Macroinvertebrate samples and temperature measurements were collected from small streams across the western United States by the U.S. EPA Environmental Monitoring and Assessment Program (EMAP).

The Oregon Department of Environmental Quality (ORDEQ) deployed continuous temperature monitors in streams from 1997 to 2002. These monitors recorded hourly temperature measurements that were summarized as seven-day average maximum temperatures (7DAMT). Macroinvertebrate samples were also collected from these sites. Sites were characterized by geographic location (latitude and longitude), elevation, and catchment area. Reference sites were designated in Oregon based on land use characteristics.

Analysis and results

Figure 1. Temperature inferred from macroinvertebrate data versus measured mean temperature (7-day average maximum temperature). Dashed line shows a 1:1 correspondence.

Inference model development and validation

Relationships between the probability of capture of different macroinvertebrate taxa and stream temperature were estimated in the EMAP data set using logistic regression (Yuan 2007). These models were then combined with observations of different taxa in the Oregon data to predict stream temperature at each of the Oregon sites (see page on Predicting Environmental Conditions from Biological Observations for details regarding these calculations). The accuracy with which the EMAP models predicted Oregon stream temperatures was assessed by plotting temperature inferred from the macroinvertebrate assemblage versus directly measured mean temperature (Figure 1). Agreement between inferred and directly measured temperatures was strong.
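
The inference step described above can be sketched in a few lines. The sketch below assumes a maximum likelihood approach: each taxon has a logistic regression model for its capture probability as a function of temperature, and the inferred temperature is the value on a grid that makes the observed presences and absences most likely. The taxon names, the coefficients, and the function names are hypothetical and chosen only for illustration. They are not the EMAP models from Yuan (2007), and the CADStat PECBO tool may implement the calculation differently.

```python
import math

# Hypothetical logistic-regression coefficients (intercept, slope per deg C)
# for three made-up taxa. These are NOT values from Yuan (2007) or EMAP.
TAXON_MODELS = {
    "cold_taxon": (4.0, -0.35),   # capture probability falls as streams warm
    "cool_taxon": (1.0, -0.10),
    "warm_taxon": (-5.0, 0.30),   # capture probability rises as streams warm
}


def capture_probability(intercept, slope, temp_c):
    """Logistic model: probability of capturing a taxon at temp_c."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * temp_c)))


def log_likelihood(observed, temp_c):
    """Log-likelihood of the observed presences/absences at temp_c."""
    total = 0.0
    for taxon, (b0, b1) in TAXON_MODELS.items():
        p = capture_probability(b0, b1, temp_c)
        total += math.log(p) if observed[taxon] else math.log(1.0 - p)
    return total


def infer_temperature(observed, lo=5.0, hi=30.0, step=0.1):
    """Return the grid temperature (deg C) that maximizes the likelihood."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda t: log_likelihood(observed, t))


if __name__ == "__main__":
    # A hypothetical sample in which all three taxa were collected.
    sample = {"cold_taxon": True, "cool_taxon": True, "warm_taxon": True}
    print(f"Inferred temperature: {infer_temperature(sample):.1f} deg C")
```

With these made-up models, a sample containing only the warm-water taxon yields a higher inferred temperature than a sample containing only the cold- and cool-water taxa, which is the behavior the inference relies on.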

Controlling for natural variability

As with directly measured temperature (see spatial co-occurrence example), establishing expectations for inferred temperatures requires that we control for natural variability. Scatterplots were first used to examine how inferred temperature varies with different natural factors. The factors chosen for the predictive model (e.g., elevation, geographic location) must not be associated with human activities. This initial data exploration suggested that inferred stream temperature at reference sites varies with both elevation and latitude (Figure 2).

Figure 2. Relationships between inferred temperature and elevation (top) and latitude (bottom).

A multiple linear regression model was used to quantify the relationship between inferred temperature and latitude and elevation at reference sites. Both elevation and latitude are statistically significant (p < 0.05) predictors of stream temperature, and the model explains approximately half of the overall variability in inferred stream temperature. This model can be used to predict the reference expectations for inferred stream temperature at other sites. That is, the reference expectation for inferred temperature can be calculated as follows:

$$t_i = 50.3 - 0.0013\,E - 0.82\,L$$

where $t_i$ is the expected inferred stream temperature (°C), $E$ is the elevation of the site in feet, and $L$ is the latitude of the site in decimal degrees.
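
As a minimal sketch, the regression relationship can be wrapped in a small function so that the reference expectation can be evaluated for any site. The coefficients are the ones given above. The function name and the two example sites are hypothetical and serve only to show how the expectation changes with elevation and latitude.

```python
def expected_inferred_temp(elevation_ft, latitude_deg):
    """Reference expectation for inferred stream temperature (deg C).

    Uses the coefficients reported in the text; elevation is in feet and
    latitude is in decimal degrees.
    """
    return 50.3 - 0.0013 * elevation_ft - 0.82 * latitude_deg


if __name__ == "__main__":
    # Hypothetical sites, chosen only to illustrate the direction of change.
    low_south = expected_inferred_temp(elevation_ft=500, latitude_deg=42.0)
    high_north = expected_inferred_temp(elevation_ft=3000, latitude_deg=45.0)
    print(f"Low-elevation southern site:  {low_south:.1f} deg C")
    print(f"High-elevation northern site: {high_north:.1f} deg C")
```

Because both coefficients are negative, the expected inferred temperature drops by about 1.3°C for every 1000 feet of elevation gained and by about 0.82°C for every degree of latitude northward.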

Site assessment

Since the inference model seemed to provide accurate predictions of stream temperature, inferred temperature can be used to inform the verified prediction type of evidence. That is, we hypothesize that if temperature is the cause of impairment then temperatures inferred from the impaired macroinvertebrate assemblage will be higher than expected.

At the biologically impaired test site of interest we collected a macroinvertebrate sample and used the EMAP inference models to infer temperature at the test site as 21°C based on the macroinvertebrate assemblage. The biologically impaired site is located at an elevation of 1000 feet and latitude of 43° North. The expected inferred stream temperature at the site is predicted using the regression relationship developed from regional reference conditions,

$$t_i = 50.3 - 0.0013(1000) - 0.82(43)$$

which gives a predicted reference inferred temperature of 13.7°C. The 95% prediction interval around this mean value is 10.5°C to 17.2°C, so the EMAP inferred temperature of 21°C, based on the collected macroinvertebrate assemblage, is well outside the range expected for 95% of inferred temperatures at similar reference sites. This finding suggests that inferred stream temperature is indeed elevated at the test site. Hence, the macroinvertebrate assemblage at the test site is characteristic of much warmer streams than we would expect at this elevation and latitude. At this test site, we have verified our prediction that the observed macroinvertebrate assemblage is consistent with temperatures being higher than expected.
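
The comparison above can be reproduced with a short check. This is a sketch rather than the CADStat implementation. The prediction interval endpoints are copied from the text and not recomputed, because the residual variance of the regression is not reported here. The function name `assess_site` and the category labels are hypothetical.

```python
# Values reported in the text for the test site.
ELEVATION_FT = 1000
LATITUDE_DEG = 43.0
INFERRED_TEMP_C = 21.0                # inferred from the macroinvertebrate sample
PREDICTION_INTERVAL_C = (10.5, 17.2)  # 95% prediction interval, taken as given


def assess_site(inferred_c, interval_c):
    """Compare an inferred temperature with the reference prediction interval."""
    lower, upper = interval_c
    if inferred_c > upper:
        return "higher than expected"
    if inferred_c < lower:
        return "lower than expected"
    return "within expected range"


# Reference expectation from the regression coefficients given in the text.
expected_c = 50.3 - 0.0013 * ELEVATION_FT - 0.82 * LATITUDE_DEG
result = assess_site(INFERRED_TEMP_C, PREDICTION_INTERVAL_C)

if __name__ == "__main__":
    print(f"Expected inferred temperature: {expected_c:.1f} deg C")
    print(f"Inferred temperature is {result}")
```

The inferred temperature exceeds the upper bound of the interval by 3.8°C, which is why the prediction is treated as verified.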

The CADStat PECBO and Regression Prediction tools perform all the calculations described in this example.

How do I score this evidence?

Predictions of increased biologically-inferred temperatures have been verified (+).

