CADDIS Volume 4: Data Analysis
Predicting Environmental Conditions from Biological Observations (PECBO) Appendix
Download Scripts and Sample Data
This section is provided for users who are very comfortable with R and who wish to download scripts directly. For novice R users, please note that the web page associated with each script has useful information that will help you successfully run the script.
R scripts from this section are provided as ".R" files that can be saved directly on your hard drive. Each script can be run by executing the following command in R:
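In base R, a saved script is typically run with `source()`. The sketch below assumes that approach. The file name `example.script.R` is a hypothetical placeholder, and a one-line stand-in script is written to a temporary directory only so the example runs on its own.

```r
# Hypothetical file name; substitute the name of the script you saved.
script_file <- file.path(tempdir(), "example.script.R")

# Stand-in script so this sketch is self-contained.
writeLines("demo_value <- 1 + 1", script_file)

# Run the saved script. By default, objects it creates are placed
# in the global environment.
source(script_file)
print(demo_value)
```

If the script is saved in the working directory, the file name alone is enough; otherwise give the full path.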
The scripts here assume that data have been downloaded and stored in the working directory. Before running any of the other analysis programs, the first script listed should be run to set up R data files.
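Before running a script, it can help to confirm that R is pointed at the directory holding the downloaded files. The following is a minimal sketch: `check_files()` is a hypothetical helper written for this illustration, and the file names passed to it are placeholders, not the names of the actual downloads.

```r
# Report which of the expected files are present in a directory.
# `files` is a character vector of file names; `dir` defaults to the
# current working directory.
check_files <- function(files, dir = getwd()) {
  present <- file.exists(file.path(dir, files))
  names(present) <- files
  present
}

# Show the current working directory.
print(getwd())

# To change it, call setwd() with the folder where you saved the data, e.g.
# setwd("path/to/your/folder")   # placeholder path

# Placeholder file names; replace with the files you downloaded.
print(check_files(c("site.species.txt", "env.data.txt")))
```

Any entry reported as `FALSE` needs to be downloaded or moved into the working directory before the set-up script is run.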
(To save the files to your hard drive, right click on the script names below and choose "Save Target As...")
- Set up variables
- Weighted Averages
- Cumulative Percentiles
- Parametric regressions
- Nonparametric regressions
- Chi-square tests
- Area under ROC curve
- Curve shape classification
- Identify taxa found in calibration and test data
- Weighted average inference
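As a rough illustration of what the weighted-average scripts in this list address, the sketch below computes abundance-weighted taxon optima and then a simple weighted-average inference for each site. It is not the downloadable script: the data are made-up toy values, the object names are hypothetical, and refinements such as tolerance weighting or deshrinking are omitted.

```r
# Toy data: stream temperature (degrees C) at three sites.
temp <- c(s1 = 10, s2 = 15, s3 = 20)

# Toy abundances: rows are sites, columns are taxa.
abund <- matrix(c(5, 3, 0,    # taxonA at s1, s2, s3
                  0, 2, 6),   # taxonB at s1, s2, s3
                nrow = 3,
                dimnames = list(names(temp), c("taxonA", "taxonB")))

# Weighted-average optimum for each taxon:
# sum over sites of (abundance * temperature) / total abundance.
optima <- colSums(abund * temp) / colSums(abund)

# Weighted-average inference for each site:
# sum over taxa of (abundance * optimum) / total abundance at the site.
inferred <- drop(abund %*% optima) / rowSums(abund)

print(optima)
print(inferred)
```

In this toy case the inferred values are pulled toward the middle of the temperature range, which is the shrinkage that deshrinking regressions are meant to correct.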
To estimate multivariate taxon-environment relationships, or to format any taxon-environment relationship correctly for maximum likelihood inferences, you will need to use the scripts provided in the R library bio.infer. The library also contains the script that computes maximum likelihood inferences and other tools.
The library can be installed by typing at the R prompt:
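R packages are normally installed with `install.packages()` and loaded with `library()`. The sketch below assumes that bio.infer is available from the repository configured in your R session, which may not be the case for every R version. The install call is left commented out because it needs network access.

```r
pkg <- "bio.infer"

# Check whether the package is already installed, without loading it.
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (is_installed) {
  # Load the package so its functions are available.
  library(pkg, character.only = TRUE)
} else {
  message("Package '", pkg, "' is not installed. To install it, run:")
  message('  install.packages("bio.infer")')
  # install.packages("bio.infer")   # assumes the package is in your repository
}
```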
Two sample data sets are provided here to illustrate the analysis methods described in this module. The first data set was collected by the U.S. Environmental Protection Agency's Environmental Monitoring and Assessment Program-Western Pilot Project (EMAP-West) from 2000 to 2002, and the second was collected in western Oregon by the Oregon Department of Environmental Quality (DEQ) from 1999 to 2000 (Figures 19 and 20). Both organizations used a similar sampling protocol. A reach 40 times the wetted width of the stream was delineated for sampling. Stream temperature was measured at the time of sampling. Substrate composition was estimated by summarizing the size distribution of particles at five locations on each of 21 transects. For EMAP-West, macroinvertebrate samples were collected at eight randomized locations in riffles using a modified D-frame kicknet (500 μm mesh) by disturbing a 1 ft² area for 30 seconds at each location. In Oregon, samples were collected by disturbing 2 ft² areas at four randomized locations. Samples from both studies were composited, spread on a gridded pan, and picked from randomly selected grid squares until at least 500 organisms were collected. Each organism was then identified to the lowest possible taxonomic level (usually genus or species).
- Site-species data: EMAP-West
- Environmental data: EMAP-West
- Site-species data: Western Oregon
- Environmental data: Western Oregon
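The sampling effort described above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the protocol description, and the object names are hypothetical. It shows that the two programs composited the same total streambed area per site, and it converts that area to square metres.

```r
# Number of sampling locations and area disturbed at each (ft^2),
# as stated in the protocol description.
effort <- data.frame(
  program   = c("EMAP-West", "Western Oregon"),
  locations = c(8, 4),
  area_ft2  = c(1, 2)
)

# Total composited area per site, in square feet.
effort$total_ft2 <- effort$locations * effort$area_ft2

# Convert to square metres (1 ft = 0.3048 m exactly).
effort$total_m2 <- effort$total_ft2 * 0.3048^2

print(effort)

# Reach length is 40 times the wetted width; the width here is a made-up example.
wetted_width_m <- 5
reach_length_m <- 40 * wetted_width_m
print(reach_length_m)
```

Both programs therefore composite 8 ft² of streambed per site, which supports treating the two data sets as comparable in sampling effort.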