The Analysis Engine allows users to examine data sets visually. It was initially developed so that environmental modelers could better evaluate their output data, and one of the primary reasons for its creation was to understand the error propagation produced by parameter uncertainties. However, it can be used to investigate many other types of data sets with different formatting specifications. Typical uses include sensitivity and uncertainty analyses, model evaluations, emissions modeling quality assurance, and general data analyses. Several statistical tools are incorporated into the Analysis Engine to assist these evaluations, and users can examine tabular model results to determine the influence of particular variables.

The Analysis Engine imports ASCII data files and displays them initially as tables. Click the mouse to view other tables that have been imported or analyses that have been performed.

After importing the data sets, you may examine the statistics associated with them by:

Sorting the data by any single column or group of columns

Filtering the top and bottom data from any column

Filtering the data for multiple criteria

Building scatter plots, regression plots, bar plots, box plots, time series plots, discrete category plots, histogram plots, cumulative distribution function plots, line plots, rank order plots, and tornado plots that may be exported in a number of formats (based on the R statistical package)

Calculating correlation and regression coefficients

Determining percentile cutoffs

Finding the minima, maxima, means, medians, sums, standard deviations, skews, and kurtoses for any data set or subset

Establishing histogram bins

Conducting advanced data analyses using the Weka Explorer (based on the Weka analysis package)
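The table operations and statistics listed above can be illustrated with a minimal sketch in Python, using only the standard library. This is not the Analysis Engine's own code: the sample rows, the function names, the linear-interpolation percentile rule, the moment-based skew and excess kurtosis formulas, and the equal-width binning are all assumptions chosen for illustration, and the tool may use different conventions.

```python
import math
import statistics

# Hypothetical sample table: each row is (site, emission, temperature).
rows = [
    ("A", 12.0, 290.0),
    ("B", 7.5, 301.0),
    ("C", 20.1, 285.0),
    ("D", 3.2, 310.0),
    ("E", 15.4, 295.0),
]

def sort_rows(table, *columns):
    """Sort by one or more column indexes, in priority order."""
    return sorted(table, key=lambda r: tuple(r[c] for c in columns))

def top_and_bottom(table, column, n):
    """Return the n smallest and the n largest rows for one column."""
    ordered = sort_rows(table, column)
    return ordered[:n], ordered[-n:]

def percentile(values, p):
    """Percentile cutoff (0 <= p <= 100) by linear interpolation between ranks."""
    ordered = sorted(values)
    k = (len(ordered) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (k - lo)

def summary(values):
    """Descriptive statistics; skew and kurtosis use population moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "median": statistics.median(values),
        "sum": math.fsum(values),
        "stdev": statistics.stdev(values),   # sample standard deviation
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,      # excess kurtosis
    }

def histogram(values, bins):
    """Count values in equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:                             # all values identical
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length columns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

emissions = [r[1] for r in rows]
temperatures = [r[2] for r in rows]
```

For example, `top_and_bottom(rows, 1, 1)` returns the rows with the lowest and highest emission, and `pearson(emissions, temperatures)` gives the correlation between the two numeric columns of this made-up table.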

Key Features

Runs on any operating system that supports Java-based applications (e.g., MS Windows, Mac OS X, UNIX, and Linux)

Easily imports files from environmental and other models (e.g., SMOKE, COSU Monte Carlo, TRIM.FaTE, DAVE, and ARRF)

Operates within the MIMS framework

Allows users to specify the number of significant digits that should be displayed

Lets you specify font styles for important data columns, as well as choose which columns to hide

Is tightly integrated with statistical and data mining tools (e.g., Weka)
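As a rough illustration of displaying values with a chosen number of significant digits, the following Python sketch uses the standard `g` format specifier. The helper name `format_sig` is hypothetical, and the Analysis Engine's own display rules (for example, how it treats trailing zeros or very large values) may differ.

```python
def format_sig(value, digits):
    """Format a number using the given count of significant digits.

    The 'g' specifier drops trailing zeros and switches to scientific
    notation when the exponent is below -4 or not less than `digits`.
    """
    return f"{value:.{digits}g}"

# Hypothetical column of model output, formatted to three significant digits.
column = [3.14159, 0.000123456, 123456.0]
formatted = [format_sig(v, 3) for v in column]
```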

Advantages over Commercial Off-the-Shelf Products

Is geared toward model analysis, so common modeling tasks are facilitated

Displays file paths, file names, headers, and footers to help you track different data displays

Provides a way to export files to be read by other analysis tools

Has no limitations on the number of columns or rows in data sets

Includes computations for sensitivity analysis-related statistics (e.g., elasticity and sensitivity score) through the Colt package

Is available in an open source environment (http://sourceforge.net/projects/mimsfw)

Connects directly to EPA's Multimedia Integrated Modeling System (MIMS)

Allows you to save plot and analysis configurations to avoid repetition when similar data sets are created

Allows data entries to hold "NaN" (not a number) to represent missing data without failure of the analysis

Includes a command line interface that can be used to automate plot production
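Two of the items above, NaN entries for missing data and sensitivity statistics such as elasticity, can be sketched in Python as follows. Both parts rest on assumptions: skipping NaN entries is only one possible way to handle missing data, and elasticity is computed here with a common textbook definition (relative change in output divided by relative change in input). Neither is confirmed to match what the Analysis Engine or the Colt package actually computes, and the function names are hypothetical.

```python
import math

def drop_missing(values):
    """Return the values with NaN (missing) entries removed."""
    return [v for v in values if not math.isnan(v)]

def nan_safe_mean(values):
    """Mean over the non-missing entries; NaN if every entry is missing."""
    present = drop_missing(values)
    return math.fsum(present) / len(present) if present else math.nan

def elasticity(x0, y0, x1, y1):
    """Assumed definition: (relative change in y) / (relative change in x).

    x0 and y0 are the baseline input and output; x1 and y1 are the
    perturbed input and output. x0, y0, and (x1 - x0) must be nonzero.
    """
    return ((y1 - y0) / y0) / ((x1 - x0) / x0)

# Hypothetical column with one missing entry.
column = [1.0, math.nan, 3.0]
mean_without_missing = nan_safe_mean(column)
```

With this definition, an elasticity above 1 in magnitude means the output changes proportionally more than the input that was varied.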

The Analysis Engine can currently import twelve different file types. Feature requests may be submitted at the SourceForge website.

This manual is organized into five sections:

Introduction: includes installation instructions, a description of the data table window, and the command line arguments

Menu Functions: describes how to import and export data files and configurations

Toolbar Functions: describes how the tables can be sorted, filtered, and formatted

Statistics Functions: presents the statistical data analysis tools

Plotting Functions: presents the instructions for creating effective graphs