CADDIS Volume 4: Data Analysis

Predicting Environmental Conditions from Biological Observations (PECBO) Appendix

Loading Data

Data in various formats can be loaded into R. The sample data supplied in this module is formatted as tab-delimited text and can be loaded using the following command:

data.set <- read.delim(filename)  # Loads a tab-delimited data file
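
As a minimal sketch of what this call does, the example below writes a small made-up table to a temporary tab-delimited file and reads it back. The site identifiers, the taxon column names (TAXON.A, TAXON.B), and the counts are hypothetical and are not part of the sample data. By default, read.delim treats the first line of the file as column names and uses the tab character as the field separator.

```r
# Hypothetical toy table; names and counts are illustrative only
toy <- data.frame(SITE.ID = c("S1", "S2"),
                  TAXON.A = c(3, 0),
                  TAXON.B = c(0, 12))

# Write the table as tab-delimited text to a temporary file
toy.file <- tempfile(fileext = ".txt")
write.table(toy, toy.file, sep = "\t", quote = FALSE, row.names = FALSE)

# read.delim assumes a header row and tab-separated fields by default
toy.in <- read.delim(toy.file)
print(toy.in)
```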

Download the sample data and place the files into your working directory. Then, load these data files into R as follows:

site.species <- read.delim("site.species.txt")
site.species.or <- read.delim("site.species.or.txt")
env.data <- read.delim("env.data.txt")
env.data.or <- read.delim("env.data.or.txt")
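
If any of these commands fails with a "cannot open file" error, the file is probably not in the current working directory. The following sketch, which is not part of the original module, checks for the four file names used above before loading them. The variable names needed and missing.files are assumptions introduced for illustration.

```r
# File names used in the loading commands above
needed <- c("site.species.txt", "site.species.or.txt",
            "env.data.txt", "env.data.or.txt")

# file.exists() resolves relative paths against the working directory
missing.files <- needed[!file.exists(needed)]

if (length(missing.files) > 0) {
  message("Not found in ", getwd(), ": ",
          paste(missing.files, collapse = ", "))
} else {
  message("All four sample files are present.")
}
```

If files are reported as missing, either move them into the directory shown by getwd() or change the working directory with setwd().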

Examine the data files you loaded and review how they are formatted.
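
One possible way to examine a loaded data frame is with base R inspection functions, sketched below. So that the example runs on its own, it uses a hypothetical stand-in table (toy.species, with made-up taxa and counts); in practice, substitute site.species, env.data, or the other loaded objects.

```r
# Hypothetical stand-in for site.species; substitute the loaded data frame
toy.species <- data.frame(SITE.ID = c("S1", "S2", "S3"),
                          TAXON.A = c(3, 0, 7),
                          TAXON.B = c(0, 12, 1))

dim(toy.species)     # number of rows (sites) and number of columns
names(toy.species)   # column names: site identifier, then taxa
head(toy.species)    # first few rows
str(toy.species)     # storage type of each column
```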


The site-species data has one row for each site. The first column holds the site identifier ("SITE.ID"), and each remaining column corresponds to one taxon. Each numerical entry is the number of individuals of a given taxon observed at a given site.

The environmental data has a column with a site identifier and a column for each environmental variable (stream temperature, "temp"; and log-transformed percent sand and fines, "sed").

Next, merge the data files on the site identifier so that the environmental data for each site is matched with the biological data for the same site.

dfmerge <- merge(site.species, env.data, by = "SITE.ID")
dfmerge.or <- merge(site.species.or, env.data.or, by = "SITE.ID")
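
With its default settings, merge keeps only the rows whose SITE.ID appears in both tables, so sites missing from either file are dropped without a warning. The sketch below illustrates this with hypothetical toy tables (the site identifiers and all values are made up) and shows one way to list unmatched sites. The same checks could be applied to site.species and env.data.

```r
# Hypothetical toy tables: S3 has no environmental record,
# and S4 has no biological record
toy.species <- data.frame(SITE.ID = c("S1", "S2", "S3"),
                          TAXON.A = c(3, 0, 7))
toy.env <- data.frame(SITE.ID = c("S1", "S2", "S4"),
                      temp = c(14.2, 17.5, 12.0),
                      sed = c(1.1, 0.4, 1.6))

# Default merge keeps only sites present in both tables
toy.merge <- merge(toy.species, toy.env, by = "SITE.ID")
nrow(toy.merge)

# Sites dropped from each side of the merge
setdiff(toy.species$SITE.ID, toy.env$SITE.ID)
setdiff(toy.env$SITE.ID, toy.species$SITE.ID)

# A nonzero result would indicate repeated site identifiers,
# which would multiply rows in the merged table
anyDuplicated(toy.species$SITE.ID)
```

In this toy case, the merged table has two rows (S1 and S2), and S3 and S4 are reported as unmatched. Comparing nrow(dfmerge) with nrow(site.species) and nrow(env.data) is a quick way to see whether any sites were lost in the real data.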

