Aggregator Specification File Format

The Aggregator

The Aggregator takes TRIM results files and aggregates the data in their columns into a smaller number of columns. The aggregation can involve summing, averaging, or other operations. For example, it can be used to calculate the mass results for all biotic and abiotic compartments. The Aggregator can also accept input from multiple files, which allows the user to summarize the results from several files and combine them into one file.

The Aggregator is run from a DOS command line using the following options:

Option	Description
-i	Specifies the input file(s) the user wants to aggregate. Multiple input files can be specified by using this option repeatedly (this is useful when comparing or merging data from multiple files).
-o	Specifies the output file(s) to be produced. The Aggregator can produce output data in three formats: text file with the delimiter used by the input file (indicated by an output file name ending with .txt), comma-delimited text (indicated by an output file name ending with .csv), and HTML (indicated by an output file name ending with .html). The user can select multiple output formats by using this option multiple times.
-f	Specifies the format file that will be used. The format of this file is described below.
-d	Indicates the data delimiter used in the input file(s). The default is semicolon delimited (";").
-h	Provides a header/title that will be used at the top of each of the output files.
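The way the -o option ties the output format to the file name extension can be sketched as follows. This is only an illustration of the rule described above, not the Aggregator's own code: the function name output_format, the returned tuples, the case-insensitive comparison, and the error raised for other extensions are all assumptions.

```python
import os


def output_format(filename, input_delimiter=";"):
    """Map an output file name to (kind, delimiter), following the -o description."""
    # Assumption: extensions are compared case-insensitively.
    extension = os.path.splitext(filename)[1].lower()
    if extension == ".txt":
        # Text output reuses the delimiter of the input file(s).
        return ("delimited", input_delimiter)
    if extension == ".csv":
        return ("delimited", ",")
    if extension == ".html":
        return ("html", None)
    # Assumption: the description does not say how other extensions are handled.
    raise ValueError("unrecognized output extension: " + repr(extension))


for name in ("output.txt", "output.csv", "output.html"):
    print(name, output_format(name))
```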

For example, to aggregate three files into one comma-delimited file and one HTML file, the command line might look like:

C:\> java gov.epa.trim.util.TRIMResultsAggregator -i input_1.txt -i input_2.txt -i input_3.txt -o output.csv -o output.html -f format.txt -h "This is an example"

In this example, the Aggregator generates two output files, each with the header "This is an example". Because TRIM output files are semicolon-delimited, it is not necessary to indicate the delimiter using the -d option.
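When the command line is assembled by a script rather than typed by hand, the repeated -i and -o options are easy to generate from lists. The following is a minimal sketch: the class name and option letters are taken from the example above, while the function build_aggregator_command and its parameter names are hypothetical.

```python
def build_aggregator_command(inputs, outputs, format_file, header=None, delimiter=None):
    """Return the Aggregator command line as a list of arguments."""
    command = ["java", "gov.epa.trim.util.TRIMResultsAggregator"]
    for name in inputs:
        command += ["-i", name]      # one -i per input file
    for name in outputs:
        command += ["-o", name]      # one -o per output file
    command += ["-f", format_file]
    if delimiter is not None:
        command += ["-d", delimiter]  # omitted when the default (";") applies
    if header is not None:
        command += ["-h", header]
    return command


command = build_aggregator_command(
    inputs=["input_1.txt", "input_2.txt", "input_3.txt"],
    outputs=["output.csv", "output.html"],
    format_file="format.txt",
    header="This is an example",
)
print(command)
```

The argument list could be handed to Python's subprocess.run to start the Aggregator, provided Java and the TRIM classes are available on the machine; the sketch itself only builds the list.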

In addition to being run from a DOS command line, as shown in the example above, the Aggregator can also be run from a DOS batch file. To do this, create a text file with the extension .bat that contains one or more DOS command line statements. A single batch file can run the Aggregator multiple times, and an existing batch file can be edited, instead of creating a new one, for different applications of the Aggregator.
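A batch file for several runs is simply one command per line, so its text can be generated. The sketch below is illustrative only: the function make_batch_text, the dictionary keys, and the file names run_a.txt and run_b.txt are hypothetical, and file names containing spaces would need quoting that is not handled here.

```python
def make_batch_text(runs):
    """Return the text of a .bat file with one Aggregator command per line."""
    lines = []
    for run in runs:
        parts = ["java", "gov.epa.trim.util.TRIMResultsAggregator"]
        for name in run["inputs"]:
            parts += ["-i", name]
        for name in run["outputs"]:
            parts += ["-o", name]
        parts += ["-f", run["format"]]
        lines.append(" ".join(parts))
    # DOS text files conventionally end lines with CR LF.
    return "\r\n".join(lines) + "\r\n"


runs = [
    {"inputs": ["run_a.txt"], "outputs": ["run_a.csv"], "format": "format.txt"},
    {"inputs": ["run_b.txt"], "outputs": ["run_b.csv"], "format": "format.txt"},
]
batch_text = make_batch_text(runs)
print(batch_text)
```

Writing batch_text to a file whose name ends in .bat (opened with newline="" so the line endings are kept as written) gives a batch file that runs the Aggregator once per entry.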

Format Files

The format file consists of groups of commands. Each group provides the formatting instructions for a single column in the output file, and there is no limit on the number of groups. Each group must contain, in order: a column name, an operation, and a list of input columns.

The first line of the format file must contain "version; 1".

To specify the column name, use the "output" keyword followed by the name of the output column.

To specify the type of operation that you want to perform on the columns, use the "operation" keyword followed by one of the following operations:

Operation	Description	Function	Maximum Number of Input Files
sum	Sums all of the inputs	in1 + in2 + ... + in[n]	unlimited
average	Averages all of the inputs	(in1 + in2 + ... + in[n]) / n	unlimited
diff	Calculates the difference between 2 inputs	in1 - in2	2
mult	Multiplies 2 inputs	in1 * in2	2
ratio	Calculates the ratio of 2 inputs	in1 / in2	2
percent	Calculates in1 as a percentage of the sum of the 2 inputs	[in1 / (in1 + in2)] x 100%	2
percentdiff	Calculates the percent difference between 2 inputs	[(in1 - in2) / (in1 + in2)] x 100%	2
copy	Copies an input column into the output column	in1	1
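The Function column of the table translates directly into arithmetic. The sketch below encodes each formula as written; it is not the Aggregator's implementation. The names OPERATIONS and apply_operation are hypothetical, and requiring exactly two values (or exactly one for copy) is an assumption based on the formulas, since the table only states a maximum. Division by zero is not handled.

```python
# Operation name -> (required number of values, or None for unlimited; function).
OPERATIONS = {
    "sum": (None, lambda v: sum(v)),
    "average": (None, lambda v: sum(v) / len(v)),
    "diff": (2, lambda v: v[0] - v[1]),
    "mult": (2, lambda v: v[0] * v[1]),
    "ratio": (2, lambda v: v[0] / v[1]),
    "percent": (2, lambda v: v[0] / (v[0] + v[1]) * 100),
    "percentdiff": (2, lambda v: (v[0] - v[1]) / (v[0] + v[1]) * 100),
    "copy": (1, lambda v: v[0]),
}


def apply_operation(name, values):
    """Apply one of the table's operations to a list of numbers."""
    if name not in OPERATIONS:
        raise ValueError("unknown operation: " + name)
    if not values:
        raise ValueError("at least one value is needed")
    count, function = OPERATIONS[name]
    # Assumption: two-input operations need exactly two values, copy exactly one.
    if count is not None and len(values) != count:
        raise ValueError(name + " expects " + str(count) + " value(s)")
    return function(values)


print(apply_operation("sum", [1, 2, 3]))
print(apply_operation("percent", [1, 3]))
```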

The input columns are listed on a per-file basis. For each input, the user must specify a file number (the first input file on the command line is in1, the second is in2, and so forth) and then a "column selector", separated by a semicolon. A column selector can be the complete name of a compartment, or a combination of a column keyword and a string. The column keywords are described in the table below.

Keyword	Description
{all}	This keyword will select all of the columns in a file and will use them to compute the output.
{include}	This keyword will select all of the columns in a file containing the text following the keyword and will use them to compute the output.
{exclude}	This keyword will exclude columns containing the text following the keyword and will use the remaining columns to compute the output. This keyword requires that some columns have already been selected using the {all} and/or {include} keywords, or using the exact column name.
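The interplay of the three keywords can be sketched as a function that turns a list of selectors for one file into a list of column names. This is a sketch under stated assumptions rather than the Aggregator's actual logic: the function name select_columns is hypothetical, matching is assumed to be a case-sensitive substring test, selectors are assumed to be applied in the order listed, and an unknown exact name is assumed to be an error.

```python
def select_columns(columns, selectors):
    """Resolve column selectors against the column names of one input file."""
    selected = []

    def add(names):
        # Keep the first occurrence only, so a column is never counted twice.
        for name in names:
            if name not in selected:
                selected.append(name)

    for selector in selectors:
        if selector == "{all}":
            add(columns)
        elif selector.startswith("{include}"):
            text = selector[len("{include}"):]
            add(name for name in columns if text in name)
        elif selector.startswith("{exclude}"):
            # Only removes from what has already been selected.
            text = selector[len("{exclude}"):]
            selected[:] = [name for name in selected if text not in name]
        else:
            if selector not in columns:
                raise ValueError("no such column: " + selector)
            add([selector])
    return selected


# Hypothetical column names, for illustration only.
columns = ["Air - Mass", "Soil - Mass", "Water - Mass", "Air - Conc"]
print(select_columns(columns, ["{all}", "{exclude}Air"]))
```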

Here are some examples of input column specifications:

output; Trees
operation; sum
input; in1; Oak
input; in1; Elm
input; in1; Cherry
input; in1; Maple
input; in2; Dogwood

This would sum the Oak, Elm, Cherry, and Maple columns of in1 and the Dogwood column of in2 into an output column called "Trees".

output; Trees
operation; sum
input; in1; {all}
input; in2; {all}

This would sum all of the columns of in1 and in2 into an output column called "Trees".

output; Air
operation; sum
input; in3; {include}Air

This would sum all of the columns of in3 containing the string "Air" into an output column called "Air".

output; No Air
operation; sum
input; in2; {all}
input; in2; {exclude}Air

This would sum all of the columns of in2, except those containing the string "Air", into an output column called "No Air".
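The line structure used in these examples (a keyword, then semicolon-separated fields) is simple enough to read with a few lines of code. The following sketch splits a format file into groups. It assumes one command per line and that blank lines may be ignored; the function name parse_format and the dictionary layout are hypothetical.

```python
def parse_format(text):
    """Split format-file text into a list of output-column groups."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or [part.strip() for part in lines[0].split(";")] != ["version", "1"]:
        raise ValueError('the first line must contain "version; 1"')
    groups = []
    for line in lines[1:]:
        keyword, _, rest = line.partition(";")
        keyword, rest = keyword.strip(), rest.strip()
        if keyword == "output":
            # An "output" line starts a new group.
            groups.append({"output": rest, "operation": None, "inputs": []})
        elif not groups:
            raise ValueError("expected an output line before: " + line)
        elif keyword == "operation":
            groups[-1]["operation"] = rest
        elif keyword == "input":
            source, _, selector = rest.partition(";")
            groups[-1]["inputs"].append((source.strip(), selector.strip()))
        else:
            raise ValueError("unknown keyword: " + keyword)
    return groups


example = """version; 1
output; No Air
operation; sum
input; in2; {all}
input; in2; {exclude}Air
"""
groups = parse_format(example)
print(groups)
```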

The user can also aggregate previous output columns along with input columns. To do this, the user specifies "out" instead of the file number, followed by a semicolon and the name of the output column being referenced. For example:

output; Trees
operation; sum
input; in1; Oak
input; in1; Elm
input; in1; Cherry
input; in1; Maple
output; Grasses
operation; sum
input; in2; Oats
input; in2; Wheat
input; in2; Rye
input; in2; Bermuda
output; Vegetables
operation; sum
input; in3; Spinach
input; in3; Lettuce
input; in3; Cabbage
output; Plants
operation; sum
input; out; Trees
input; out; Grasses
input; out; Vegetables
output; % Trees
operation; percent
input; out; Trees
input; out; Plants
output; % Grasses
operation; percent
input; out; Grasses
input; out; Plants
output; % Vegetables
operation; percent
input; out; Vegetables
input; out; Plants

This first creates the Trees, Grasses, and Vegetables columns from in1, in2, and in3, respectively. Next, it sums the three plant types into a column called Plants. Finally, it produces three columns that apply the percent operation to each plant type together with the Plants column.
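The key point of this example is that groups are evaluated in order, so a later group can refer to an earlier output through "out". A minimal sketch of that evaluation for a single row of data follows. The function evaluate, the group dictionaries, and all numeric values are hypothetical; only exact column names and the sum and percent operations are supported. The percent formula is taken literally from the operations table, and how the Aggregator itself treats a total column such as Plants is not stated in this document.

```python
def evaluate(groups, files):
    """Evaluate groups in order for one row; 'out' refers to earlier outputs."""
    out = {}
    for group in groups:
        values = []
        for source, name in group["inputs"]:
            # "out" looks up a previously computed output column.
            values.append(out[name] if source == "out" else files[source][name])
        if group["operation"] == "sum":
            out[group["output"]] = sum(values)
        elif group["operation"] == "percent":
            first, second = values
            # Formula as given in the table: [in1 / (in1 + in2)] x 100
            out[group["output"]] = first / (first + second) * 100
        else:
            raise ValueError("operation not covered by this sketch")
    return out


# Hypothetical values for one row of each input file.
files = {
    "in1": {"Oak": 4.0, "Elm": 3.0, "Cherry": 2.0, "Maple": 1.0},
    "in2": {"Oats": 5.0, "Wheat": 5.0, "Rye": 5.0, "Bermuda": 5.0},
    "in3": {"Spinach": 4.0, "Lettuce": 3.0, "Cabbage": 3.0},
}
groups = [
    {"output": "Trees", "operation": "sum",
     "inputs": [("in1", n) for n in ("Oak", "Elm", "Cherry", "Maple")]},
    {"output": "Grasses", "operation": "sum",
     "inputs": [("in2", n) for n in ("Oats", "Wheat", "Rye", "Bermuda")]},
    {"output": "Vegetables", "operation": "sum",
     "inputs": [("in3", n) for n in ("Spinach", "Lettuce", "Cabbage")]},
    {"output": "Plants", "operation": "sum",
     "inputs": [("out", "Trees"), ("out", "Grasses"), ("out", "Vegetables")]},
    {"output": "% Trees", "operation": "percent",
     "inputs": [("out", "Trees"), ("out", "Plants")]},
]
result = evaluate(groups, files)
print(result)
```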