CADDIS Volume 4: Data Analysis
Exploratory Data Analysis
Author: D. Farrar
Multivariate Approaches for Exploring Associations Among Stressor Variables
In biological monitoring data, sites are almost always affected by multiple stressors. Exploring correlations among stressors can help avoid pitfalls in data analysis, especially when it is done before attempting to relate stressor variables to biological response variables. For example, a type of bias known as confounding occurs when the effects of one stressor are evaluated while other, correlated stressors are ignored. In regression modeling, studying these associations can also help in choosing a set of predictor variables that minimizes collinearity.
Scatterplots and bivariate correlations provide useful information on associations between variables, but examining variables only in pairs limits the insight they can offer. Basic methods of multivariate visualization can often give a better understanding of associations among stressor variables. For example, variable clustering uses a matrix of pairwise variable correlations to identify blocks of variables that tend to be mutually correlated (Figure 1). Principal components analysis provides similar information.
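As a rough illustration of variable clustering, the sketch below groups variables by hierarchical clustering on a dissimilarity of 1 - |r|, where r is the pairwise Pearson correlation. The dissimilarity measure, the average-linkage method, the function name `cluster_variables`, and the synthetic stressor columns are all assumptions made for this example; they are not taken from the analysis behind Figure 1.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_variables(data, n_clusters):
    """Group the columns of `data` (sites x variables) into blocks of
    mutually correlated variables. Returns one cluster label per column."""
    corr = np.corrcoef(data, rowvar=False)      # pairwise correlation matrix
    dist = 1.0 - np.abs(corr)                   # strong correlation -> small distance
    dist = (dist + dist.T) / 2.0                # remove floating-point asymmetry
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)  # condensed form expected by linkage
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic data (hypothetical): two underlying gradients, two variables each.
rng = np.random.default_rng(0)
n_sites = 200
nutrient = rng.normal(size=n_sites)
sediment = rng.normal(size=n_sites)
data = np.column_stack([
    nutrient + 0.2 * rng.normal(size=n_sites),  # stand-in for total nitrogen
    nutrient + 0.2 * rng.normal(size=n_sites),  # stand-in for total phosphorus
    sediment + 0.2 * rng.normal(size=n_sites),  # stand-in for turbidity
    sediment + 0.2 * rng.normal(size=n_sites),  # stand-in for percent fines
])

labels = cluster_variables(data, n_clusters=2)
print(labels)
```

With these synthetic data, the two nutrient-like columns should share one label and the two sediment-like columns another. With real monitoring data the number of blocks is not known in advance, so inspecting the full dendrogram is usually more informative than fixing `n_clusters`.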
Some graphical methods show not only relationships among variables but also the stressor profiles of individual sampling locations. This information may help the analyst define regions or other groupings of sampling locations with distinctive stressor profiles.
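One way to obtain both kinds of information is a principal components analysis, in which variable loadings describe relationships among stressors and site scores describe each location's stressor profile. The following is a minimal sketch using a singular value decomposition of standardized data. The function name `pca_scores_loadings` and the synthetic two-group data set are hypothetical and serve only to illustrate the idea.

```python
import numpy as np


def pca_scores_loadings(data, n_components=2):
    """PCA on standardized columns of `data` (sites x variables).

    Returns site scores, variable loadings, and the fraction of total
    variance explained by each retained component."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # one row per site
    loadings = vt[:n_components].T                    # one row per variable
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Synthetic data (hypothetical): 30 nutrient-dominated sites followed by
# 30 sediment-dominated sites, with four stressor variables.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=[2.0, 2.0, 0.0, 0.0], scale=0.5, size=(30, 4))
group_b = rng.normal(loc=[0.0, 0.0, 2.0, 2.0], scale=0.5, size=(30, 4))
data = np.vstack([group_a, group_b])

scores, loadings, explained = pca_scores_loadings(data)

# Sites with distinct stressor profiles separate along the first component.
print("mean PC1 score, group A:", scores[:30, 0].mean())
print("mean PC1 score, group B:", scores[30:, 0].mean())
print("variance explained:", explained)
```

Plotting the first two columns of `scores` as points and the rows of `loadings` as arrows gives a biplot, in which clusters of points suggest groups of sites with similar stressor profiles. The sign of each component is arbitrary, so only the relative positions of sites and variables are meaningful.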