Data Exploration
Often an early step in any investigation is an exploration of the raw data values. SADA provides many tools that allow
you to visualize your data, perform data screens,
spatially aggregate data, calculate basic statistics, perform hypothesis tests, query
by date, and so forth.
The following sections catalog some standard tools.
Data Query
You have a great deal of control over how your raw data is processed.
SADA has means to query by date, to deal with nondetected
values, and to deal with duplicate data values. Spatially linked tabular queries are also possible.
Data Screens
A data screen essentially compares sampled data values against a threshold or screening limit.
Screening limits are typically
contaminant specific. SADA allows you to access threshold limits in three different ways:
import your own screening limits, access a database of
ecological benchmarks (shipped with SADA),
or utilize the comprehensive
human health model (includes toxicological and scenario parameter databases shipped with SADA). SADA also permits you to specify depth-specific screening values. This can be useful because acceptable limits may
vary as a function of depth below the surface.
Data screens can be carried out spatially or with a more traditional tabular output. In the tabular output, each contaminant
receives a "Yes" if any of its values exceeds the screening limit. Multiple contaminants can be screened at once. In a spatial screen, individual sample points are highlighted
with exceedance boxes to show where the exceedances occur. This is a quick and effective way to
visualize the location and pattern of elevated values.
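The screening logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SADA's implementation: the sample records, the screening limits, and the function names tabular_screen and spatial_screen are all hypothetical.

```python
# Hypothetical sample records: (contaminant, easting, northing, concentration).
samples = [
    ("Arsenic", 100.0, 200.0, 12.0),
    ("Arsenic", 150.0, 220.0, 3.5),
    ("Lead", 100.0, 200.0, 250.0),
    ("Lead", 150.0, 220.0, 410.0),
    ("Cadmium", 100.0, 200.0, 1.0),
]

# Hypothetical screening limits, one per contaminant (same units as the data).
screening_limits = {"Arsenic": 10.0, "Lead": 400.0, "Cadmium": 5.0}


def tabular_screen(samples, limits):
    """Return {contaminant: "Yes" or "No"}; "Yes" if any value exceeds its limit."""
    result = {name: "No" for name in limits}
    for name, _x, _y, value in samples:
        if name in limits and value > limits[name]:
            result[name] = "Yes"
    return result


def spatial_screen(samples, limits):
    """Return the sample points that exceed their limit (the points to highlight)."""
    return [
        (name, x, y, value)
        for name, x, y, value in samples
        if name in limits and value > limits[name]
    ]


if __name__ == "__main__":
    print(tabular_screen(samples, screening_limits))
    for point in spatial_screen(samples, screening_limits):
        print("exceedance at", point)
```

In this made-up data set, arsenic and lead each have one value above their limit and cadmium has none, so the tabular screen flags the first two and the spatial screen returns the two exceeding points.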
Data Ratios/Sum of Fractions
Ratio values serve a similar role to data screens and can be conducted tabularly or spatially. In a ratio calculation, the representative concentration is divided
by the screening or decision criteria. Values greater than one indicate an exceedance. Ratio values provide more
information than a simple screen as the value of the ratio can indicate the severity of the exceedance.
A sum of fractions adds up the ratio values across contaminants, either in the tabular format or at each sample location in
the spatial map format. Sums of fractions are often used in radiological assessments, where summing ratios
has a real physical meaning. Sums greater than one indicate that the contaminants, taken together, exceed
a collective criterion.
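As a rough sketch of the arithmetic, the following Python computes per-contaminant ratios and their sum at each sample location. The radionuclide names, criteria, and measured values are invented for illustration, and the helper names ratios and sum_of_fractions are hypothetical rather than anything SADA exposes.

```python
# Hypothetical decision criteria (same units as the concentrations).
criteria = {"Cs-137": 10.0, "Co-60": 4.0}

# Hypothetical measurements at two sample locations.
locations = {
    "A": {"Cs-137": 5.0, "Co-60": 1.0},
    "B": {"Cs-137": 8.0, "Co-60": 2.0},
}


def ratios(concentrations, criteria):
    """Divide each concentration by its criterion; a ratio above one is an exceedance."""
    return {name: value / criteria[name] for name, value in concentrations.items()}


def sum_of_fractions(concentrations, criteria):
    """Add the ratios across contaminants for one sample location."""
    return sum(ratios(concentrations, criteria).values())


if __name__ == "__main__":
    for label, concentrations in locations.items():
        print(label, ratios(concentrations, criteria),
              sum_of_fractions(concentrations, criteria))
```

Location B shows why the sum can matter: its individual ratios (0.8 and 0.5) are both below one, yet their sum is about 1.3, so the contaminants taken together exceed the collective criterion.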
Histograms/Cumulative Distribution Plots
Histograms and CDF plots are good ways to see the distribution of your data set. These plots aid in model selection,
the identification of outliers, and so forth.
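For readers who want to see what these plots summarize, here is a minimal Python sketch of the underlying calculations. It assumes equal-width histogram bins and the simple i/n plotting position for the cumulative distribution. Both are illustrative choices, not necessarily what SADA uses, and the function names are hypothetical.

```python
def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning the data range."""
    lo, hi = min(values), max(values)
    counts = [0] * n_bins
    if hi == lo:
        # All values identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / n_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it to the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


def empirical_cdf(values):
    """Pair the i-th smallest value with the cumulative fraction i/n."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


if __name__ == "__main__":
    data = [1.0, 2.0, 2.5, 4.0, 9.0]  # hypothetical concentrations
    print(histogram_counts(data, 4))
    print(empirical_cdf(data))
```

With this made-up data, most values fall in the lowest bin and a single value sits alone in the highest one, which is the kind of pattern that suggests a skewed distribution or a possible outlier.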
Summary Statistics
Summary statistics for the entire data set or a subset of the data can be extracted.
Here we show summary statistics for the data points defined using the polygon feature in the
previous figure.
Summary statistics can be exported as delimited files for import into spreadsheet or
word processing programs. Other statistics that are available include the range, detection frequency,
maximum and minimum detected and nondetected values, and back-transformed mean, variance, and UCL95.
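Several of these statistics are simple to reproduce by hand, which can be handy when checking an exported file. The Python sketch below is an illustration under stated assumptions: the data are invented, each nondetect is assumed to carry its reported detection limit as its value, and the function name summarize is hypothetical. How nondetects should enter the mean and variance is a separate choice that this sketch does not address, and the back-transformed statistics and UCL95 are not covered.

```python
import statistics

# Hypothetical results: (value, detected). For a nondetect, the value is
# assumed to be the reported detection limit.
results = [(0.5, False), (1.2, True), (3.4, True), (0.5, False), (2.0, True)]


def summarize(results):
    """Compute a few basic summary statistics for (value, detected) pairs."""
    values = [v for v, _ in results]
    detects = [v for v, d in results if d]
    nondetects = [v for v, d in results if not d]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "range": max(values) - min(values),
        "detection_frequency": len(detects) / len(values),
        "max_detected": max(detects) if detects else None,
        "min_detected": min(detects) if detects else None,
        "max_nondetected": max(nondetects) if nondetects else None,
        "min_nondetected": min(nondetects) if nondetects else None,
    }


if __name__ == "__main__":
    for key, value in summarize(results).items():
        print(key, value)
```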
Hypothesis Testing
A statistical test is a procedure for deciding whether the data provide enough evidence to reject a hypothesis about a quantitative feature of a population. These tests are used to determine the statistical significance of a result, that is, to separate real effects from random chance.
SADA currently implements two nonparametric tests used by the DQO and MARSSIM processes: the sign test against a decision criterion and the Wilcoxon rank sum comparison test.
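To give a feel for the first of these, here is a minimal Python sketch of a one-sided sign test against a decision criterion, using the exact binomial distribution. It is a textbook version written under assumptions, not SADA's or MARSSIM's exact procedure: the null hypothesis is taken to be that the median is at or above the criterion, values equal to the criterion are dropped, and the function name sign_test_below and the data are hypothetical.

```python
from math import comb


def sign_test_below(values, criterion):
    """One-sided sign test of H0: median >= criterion vs. H1: median < criterion.

    Returns (s_plus, n, p_value), where s_plus is the number of values below
    the criterion, n is the number of values not equal to the criterion, and
    p_value is P(X >= s_plus) for X ~ Binomial(n, 0.5).
    """
    diffs = [criterion - v for v in values if v != criterion]  # drop ties
    n = len(diffs)
    s_plus = sum(1 for d in diffs if d > 0)
    p_value = sum(comb(n, k) for k in range(s_plus, n + 1)) / 2 ** n
    return s_plus, n, p_value


if __name__ == "__main__":
    measurements = [2, 3, 4, 5, 6, 7, 8, 12]  # hypothetical concentrations
    s_plus, n, p = sign_test_below(measurements, criterion=10)
    print(f"{s_plus} of {n} below the criterion, p = {p:.4f}")
```

In this example 7 of the 8 measurements fall below the criterion, giving p = 9/256, or about 0.035, so at a 0.05 significance level the hypothesis that the median is at or above the criterion would be rejected.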
