Package 'seawaveQ' reference manual

Title:	SEAWAVE-Q Model
Description:	A model and utilities for analyzing trends in chemical concentrations in streams with a seasonal wave (seawave) and adjustment for streamflow (Q) and other ancillary variables. See Ryberg and York, 2020, <doi:10.3133/ofr20201082>.
Authors:	Karen R. Ryberg [aut, cre], Aldo V. Vecchia [aut], Benjamin York [ctb]
Maintainer:	Karen R. Ryberg <[email protected]>
License:	Unlimited \| file LICENSE
Version:	2.0.2
Built:	2025-03-15 04:27:34 UTC
Source:	https://github.com/cran/seawaveQ

A model and utilities for analyzing trends in chemical concentrations in streams with a seasonal wave (seawave) and adjustment for streamflow (Q) and other ancillary variables (Q) and other ancillary variables

Description

An R package for the U.S. Geological Survey seawaveQ model, a parametric regression model specifically designed for analyzing seasonal- and flow-related variability and trends in pesticide concentrations. See Vecchia and others (2008) for the original description of the model and see Sullivan and others (2009), Vecchia and others (2009), Ryberg and others (2010), Ryberg and others (2014), Ryberg and Gilliom (2015), and Oelsner and others (2017) and for applications of the model.

Details

Package:	seawaveQ
Type:	Package
Version:	2.0.2
Date:	2020--12--14
License:	Unlimited \| file LICENSE

Author(s)

Karen R. Ryberg [email protected] and Aldo V. Vecchia [email protected]

References

Oelsner, G.P., Sprague, L.A., Murphy, J.C., Zuellig, R.E., Johnson, H.M., Ryberg, K.R., Falcone, J.A., Stets, E.G., Vecchia, A.V., Riskin, M.L., De Cicco, L.A., Mills, T.J., and Farmer, W.H., 2017, Water-quality trends in the nation's rivers and streams, 1972–2012—Data preparation, statistical methods, and trend results: U. S. Geological Survey Scientific Investigations Report 2017–5006, 136 p., https://doi.org/10.3133/sir20175006.

Ryberg, K.R. and Gilliom, R.J., 2015, Trends in pesticide concentrations and use for major rivers of the United States: Science of The Total Environment, v. 538, p. 431–444, https://doi.org/10.1016/j.scitotenv.2015.06.095.

Ryberg, K.R. and Vecchia, A.V., 2013, seawaveQ—An R package providing a model and utilities for analyzing trends in chemical concentrations in streams with a seasonal wave (seawave) and adjustment for streamflow (Q) and other ancillary variables: U.S. Geological Survey Open-File Report 2013–1255, 13 p., with 3 appendixes, https://dx.doi.org/10.3133/ofr20131255.

Ryberg, K.R. and York, B.C., 2020, seawaveQ—An R package providing a model and utilities for analyzing trends in chemical concentrations in streams with a seasonal wave (seawave) and adjustment for streamflow (Q) and other ancillary variables, Version 2.0.0: U.S. Geological Survey Open-File Report 2020–1082, 25 p., with 4 appendixes.

Ryberg, K.R., Vecchia, A.V., Gilliom, R.J., and Martin, J.D., 2014, Pesticide trends in major rivers of the United States, 1992–2010: U.S. Geological Survey Scientific Investigations Report 2014–5135, 74 p., https://pubs.er.usgs.gov/publication/sir20145135.

Ryberg, K.R., Vecchia, A.V., Martin, J.D., Gilliom, R.J., 2010, Trends in pesticide concentrations in urban streams in the United States, 1992–2008: U.S. Geological Survey Scientific Investigations Report 2010–5139, 101 p., https://pubs.usgs.gov/sir/2010/5139/.

Sullivan, D.J., Vecchia, A.V., Lorenz, D.L., Gilliom, R.J., Martin, J.D., 2009, Trends in pesticide concentrations in corn-belt streams, 1996–2006: U.S. Geological Survey Scientific Investigations Report 2009–5132, 75 p., https://pubs.usgs.gov/sir/2009/5132/.

Vecchia, A.V., Gilliom, R.J., Sullivan, D.J., Lorenz, D.L., and Martin, J.D., 2009, Trends in concentrations and use of agricultural herbicides for Corn Belt rivers, 1996–2006: Environmental Science and Technology, v. 43, no. 24, p. 9096–9102.

Vecchia, A.V., Martin, J.D., and Gilliom, R.J., 2008, Modeling variability and trends in pesticide concentrations in streams: Journal of the American Water Resources Association, v. 44, no. 5, p. 1308–1324, https://dx.doi.org/10.1111/j.1752-1688.2008.00225.x.

Scatter plot of water-quality data

Description

Function to generate a scatter plot that indicates censored and estimated water-quality concentrations.

Usage

cenScatPlot(
  data,
  datescol = "dates",
  pname,
  qwcols = c("R", "P"),
  site = "",
  xlabel = "",
  ylabel = "Concentration",
  legpos = "topright",
  legcex = 1,
  ...
)
cenScatPlot(
  data,
  datescol = "dates",
  pname,
  qwcols = c("R", "P"),
  site = "",
  xlabel = "",
  ylabel = "Concentration",
  legpos = "topright",
  legcex = 1,
  ...
)

Arguments

`data`	is the dataset with columns that begin with P followed by alphanumeric characters indicating concentration data and columns that begin with R followed by alphanumeric characters that match those of the concentration data indicating qualification codes. See example datasets for more information about the data format, see `IllRivValleyCty` and `qwMoRivOmaha`.
`datescol`	is the column label for the dates column.
`pname`	is the column heading (parameter name) for the particular water-quality constituent to be plotted (omit the starting character, for example for sulfate data indicated by P00945, enter "00945").
`qwcols`	is a character vector with the beginning of the column headers for remarks code (default is R), and beginning of column headers for concentration data (default is P for parameter).
`site`	is a label for the plot title indicating the site where the water-quality samples were collected.
`xlabel`	is the label for the x-axis, defaults to no label.
`ylabel`	is the label for the y-axis.
`legpos`	is the position of the legend, see legend.
`legcex`	is a numerical value giving the amount by which the legend text and symbols should be magnified relative to the default, 1.
`...`	arguments to be passed to plot method.

Details

This function uses the qualification, or remark, column associated with water-quality concentration values to indicate which samples are unqualified, which are estimated, and which are censored. A blank remark field or an "_" indicates that the concentration value is not qualified; an "E" indicates the value has been estimated; and a less than symbol, "<", indicates the value has been censored as less than a minimum reporting level. See Oblinger Childress and others (1999) for information on the minimum reporting level and the definition of "E" for U.S. Geological Survey data. Other users may have a different definition of the minimum reporting level, but censored values need to be qualified with a "<". Using the "E" code is optional.

Value

A scatter plot

Author(s)

Karen R. Ryberg

References

Oblinger Childress, C.J., Foreman, W.T., Connor, B.F., and Maloney, T.J., 1999, New reporting procedures based on long-term method detection levels and some considerations for interpretations of water-quality data provided by the U.S. Geological Survey: U.S. Geological Survey Open-File Report 99–193, 19 p., https://water.usgs.gov/owq/OFR_99-193/index.html.

Examples

data(swData)
# scatter plot of Simazine concentrations
cenScatPlot(IllRivValleyCty, pname = "04035")
# scatter plot with many additional plotting arguments
par(las = 1, tcl = 0.5)
cenScatPlot(IllRivValleyCty, pname = "04035", 
            site = "05586100 Illinois River at Valley City, IL",
            ylabel = "Simazine concentration, in micrograms per liter", 
            legcex = 0.7, ylim = c(0, 0.4), yaxs = "i", cex.lab = 0.9, 
            cex.axis = 0.9, xlim = c(as.Date("1996-01-01", "%Y-%m-%d"), 
            as.Date("2012-01-01", "%Y-%m-%d")), xaxs = "i", xaxt = "n")
axdates <- c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01",
           "2012-01-01")
axis(1, as.Date(axdates, "%Y-%m-%d"), 
     labels = c("1996", "2000", "2004", "2008", "2012"), cex.axis = 0.9)
data(swData)
# scatter plot of Simazine concentrations
cenScatPlot(IllRivValleyCty, pname = "04035")
# scatter plot with many additional plotting arguments
par(las = 1, tcl = 0.5)
cenScatPlot(IllRivValleyCty, pname = "04035", 
            site = "05586100 Illinois River at Valley City, IL",
            ylabel = "Simazine concentration, in micrograms per liter", 
            legcex = 0.7, ylim = c(0, 0.4), yaxs = "i", cex.lab = 0.9, 
            cex.axis = 0.9, xlim = c(as.Date("1996-01-01", "%Y-%m-%d"), 
            as.Date("2012-01-01", "%Y-%m-%d")), xaxs = "i", xaxt = "n")
axdates <- c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01",
           "2012-01-01")
axis(1, as.Date(axdates, "%Y-%m-%d"), 
     labels = c("1996", "2000", "2004", "2008", "2012"), cex.axis = 0.9)

Combine water-quality sample data and continuous ancillary variables

Description

Function to combine water-quality sample data and continuous (daily) ancillary variables and drop unnecessary columns.

Usage

combineData(qwdat, cqwdat, qwcols = c("staid", "dates", "R", "P"))
combineData(qwdat, cqwdat, qwcols = c("staid", "dates", "R", "P"))

Arguments

`qwdat`	is the dataset containing water-quality sample data with columns that begin with a P (or other user-defined indicator) followed by alphanumeric characters. These columns are concentration data. In addition there need to be columns that begin with an R (or other user- defined indicator) followed by alphanumeric characters that match those of the associated concentration data. The R columns contain data qualification codes. See example datasets for more information about the data format, IllRivValleyCty and qwMoRivOmaha. This package assumes that the data are in micrograms per liter.
`cqwdat`	is the dataset containing variables that can be used as explanatory variables for the seawaveQ model. See example dataset for more information about the data format cqwMoRivOmaha. These are daily values with no missing values allowed between the first and the last date in the dataset.
`qwcols`	is a character vector with column headings for a station (location) identifier, a dates column identifier, beginning of column headers for remarks code (default is R), and beginning of column headers for concentration data (default is P for parameter).

Format

A data frame with the number of rows equal to the number of rows in the data frame indicated by qwdat. The number of columns depend on the two input data frames. Minimally there will be a station identification column, a dates column, a column of qualification codes, and a column of water-quality data.

Value

A data frame

Note

The columns indicated by qwcols[1:2] are used to combine the datasets. The first column is the station identifier and the second column is the date column. These two column headings must be the same in the two datasets being combined and the dates in the datasets being combined must be of class Date and use the same format.

Author(s)

Karen R. Ryberg and Aldo V. Vecchia

Examples

data(swData)
MoRivOmaha <- combineData(qwdat = qwMoRivOmaha, cqwdat = cqwMoRivOmaha, 
qwcols = c("staid", "dates", "R", "P"))
data(swData)
MoRivOmaha <- combineData(qwdat = qwMoRivOmaha, cqwdat = cqwMoRivOmaha, 
qwcols = c("staid", "dates", "R", "P"))

Seasonal Wave Computation

Description

Function to compute seasonal wave.

Usage

compwaveconv(cmaxt, jmod, hlife)
compwaveconv(cmaxt, jmod, hlife)

Arguments

`cmaxt`	is the time of the maximum chemical concentration, decimal time in years.
`jmod`	is the choice of model or pulse input function, an integer 1 through 14.
`hlife`	is the model half-life in months, 1 to 4 months

Value

A numeric vector of size 361 with discrete values of the seasonal wave for decimal season seq(0,1,1/360).

Note

The seasonal wave is a dimensionless, periodic function of time with an annual cycle, similar to a mixture of sine and cosine functions often used to model seasonality in concentration data. However, the seasonal wave is better suited for modeling seasonal behavior of pesticide data than a mixture of sines and cosines. The pulse input function, represented by jmod, has either one or two distinct application seasons (when pesticides may be transported to the stream) of lengths from 1 to 6 months. Therefore, 56 (14x4) choices for the wave function are available. The numeric vector is a discrete approximation of the continuous wave function defined on the interval 0 to 1.

Author(s)

Aldo V. Vecchia

References

Vecchia, A.V., Martin, J.D., and Gilliom, R.J., 2008, Modeling variability and trends in pesticide concentrations in streams: JAWRA Journal of the American Water Resources Association, v. 44, p. 1308–1324, https://doi.org/10.1111/j.1752-1688.2008.00225.x.

Examples

# evaluate seasonal wave for specified decimal seasons
# these example decimal dates represent days at points 0.25, 0.5, and 
# 0.75 percent of the way through the year and the end of the year
dseas <- c(0.25, 0.5, 0.75, 1)
swave <- compwaveconv(cmaxt = 0.483, jmod = 2, hlife = 4)
swave[floor(360 * dseas)]
plot(seq(0, 1, 1/360), swave, typ = "l")
# evaluate seasonal wave for specified decimal seasons
# these example decimal dates represent days at points 0.25, 0.5, and 
# 0.75 percent of the way through the year and the end of the year
dseas <- c(0.25, 0.5, 0.75, 1)
swave <- compwaveconv(cmaxt = 0.483, jmod = 2, hlife = 4)
swave[floor(360 * dseas)]
plot(seq(0, 1, 1/360), swave, typ = "l")

Continuously monitored (daily) data for 06610000 Missouri River at Omaha, Neb.

Description

Continuously monitored (daily) streamflow and sediment data for U.S. Geological Survey streamgage 06610000 Missouri River at Omaha, Neb., and streamflow and sediment anomalies.

Usage

cqwMoRivOmahacqwMoRivOmaha

Format

A data frame containing 2,922 daily observations of two hydrologic variables, streamflow and sediment, and streamflow and sediment anomalies. There are eight variables.

staid	character	USGS Station identification number
dates	date	date water-quality sample collected
dflow	numeric	daily mean streamflow, cubic feet per second
flowa30	numeric	30-day streamflow anomaly
flowa1	numeric	1-day streamflow anomaly
dsed	numeric	daily mean sediment concentration, milligrams per liter
seda30	numeric	30-day sediment anomaly
seda1	numeric	1-day sediment anomaly

Details

The streamflow and sediment anomalies were generated using the R package waterData (Ryberg and Vecchia, 2012).

Note

See Ryberg and Vecchia (2012) for more information on calculating the anomalies and for additional references documenting the use of streamflow anomalies in water-quality trend analysis studies.

Source

Data provided by Patrick Phillips, U.S. Geological Survey, New York Water Science Center.

References

Ryberg, K.R., and Vecchia, A.V., 2012, waterData—An R package for retrieval, analysis, and anomaly calculation of daily hydrologic time series data, version 1.0.0: U.S. Geological Survey Open-File Report 2012–1168, 8 p. [(]Also available at http://pubs.usgs.gov/of/2012/1168/.]

Examples

  data(swData)
  
  # summary of water-quality concentrations
  apply(cqwMoRivOmaha[,3:8], 2, summary)
data(swData)
  
  # summary of water-quality concentrations
  apply(cqwMoRivOmaha[,3:8], 2, summary)

Example continuous ancillary variable data.

Description

This is an example of the continuous ancillary data that is passed internally to subfunctions of fitswavecav. It is provided here for use with examples of internal functions.

Usage

examplecavdatexamplecavdat

Format

A data frame containing 2,893 data variables and 30-day and 1-day streamflow anomalies (Ryberg and Vecchia, 2012).

yrx	numeric	Year
mox	numeric	Month
dax	numeric	Day
jdayx	numeric	Julian day from first day water year for start year in `fitswavecav`
flowa30	numeric	30-day streamflow anomaly
flowa1	numeric	1-day streamflow anomaly

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

References

Ryberg, K.R., and Vecchia, A.V., 2012, waterData–An R package for retrieval, analysis, and anomaly calculation of daily hydrologic time series data, version 1.0: U.S. Geological Survey Open-File Report 2012–1168, 8 p. [Also available at http://pubs.usgs.gov/of/2012/1168/.]

Examples

data(swData)
head(examplecavdat)
data(swData)
head(examplecavdat)

Example continuous ancillary variable matrix.

Description

This is an example of the continuous ancillary matrix that is passed internally to subfunctions of fitswavecav. It is provided here for use with examples of internal functions.

Usage

examplecavmatexamplecavmat

Format

A matrix containing 115 30-day and 1-day streamflow anomalies (Ryberg and Vecchia, 2012).

flowa30	numeric	30-day streamflow anomaly
flowa1	numeric	1-day streamflow anomaly

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
head(examplecavmat)
data(swData)
head(examplecavmat)

Example water-quality data.

Description

This is an example of the water-quality data that is passed internally to subfunctions of fitswavecav. It is provided here for use with examples of internal functions.

Usage

examplecdatsubexamplecdatsub

Format

A data frame containing 115 observations of 10 variables. The date variables were internally calculated. The columns R04041 and P04041 are a subset of qwMoRivOmaha and the 30-day and 1-day streamflow and sediment anomalies are a subset of cqwMoRivOmaha.

yrc	numeric	Year
moc	numeric	Month
dac	numeric	Day
jdayc	numeric	Julian day from first day of start year in `fitswavecav`
flowa30	numeric	30-day streamflow anomaly
flowa1	numeric	1-day streamflow anomaly
seda30	numeric	30-day sediment anomaly
seda1	numeric	1-day sediment anomaly

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

  data(swData)
  head(examplecdatsub)
data(swData)
  head(examplecdatsub)

Example logical vector.

Description

This is an example of data that is passed internally to subfunctions of fitswavecav. This logical vector indicates which water-quality values are censored. It is provided here for use with examples of the internal functions.

Usage

examplecentmpexamplecentmp

Format

A logical vector of 115 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
examplecentmp
data(swData)
examplecentmp

Example of logarithmically transformed concentration data.

Description

This is an example of data that is used internally by fitMod and passed to its subfunction seawaveQPlots. This numeric vector represents the base-10 logarithm of the water-quality concentrations. It is provided here for use with examples of the internal functions.

Usage

exampleclogexampleclog

Format

A numeric vector of 115 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
exampleclog
data(swData)
exampleclog

Example data indicators.

Description

This is an example of the character vector used to indicate which columns represent qualification codes and which represent water-quality concentration data. It is provided here for use with examples of the internal functions.

Usage

exampleqwcolsexampleqwcols

Format

A numeric vector of 115 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

  data(swData)
  exampleqwcols
data(swData)
  exampleqwcols

Example matrix for internal use.

Description

This is an example of data that is passed internally to subfunctions of fitswavecav. It is provided here for use with examples of the internal functions.

Usage

examplestparsexamplestpars

Format

A numeric matrix of two rows and 15 columns.
column description
1 mclass, model class, 1 for linear models, 2 for those with restricted cubic splines
2 model chosen (a number 1-56), this number represents both the pulse input function and the half-life
3 is the scale factor from the survreg.object
4 is the likelihood for the model chosen
5 is the coefficient for the model intercept
6 is the coefficient for the seasonal wave component of the model
7 is the coefficient for the trend component of the model
8 is the coefficient for the 30-day flow anomaly
9 is the coefficient for the 1-day flow anomaly
10 is the standard error for the intercept term
11 is the standard error for the seasonal wave term
12 is the standard error for the trend term
13 is the standard error for the 30-day flow anomaly term
14 is the standard error for the 1-day flow anomaly term
15 is cmaxt, the decimal season of maximum concentration

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

  data(swData)
  examplestpars
data(swData)
  examplestpars

Example numeric vector used internally.

Description

This is an example of data that is passed internally to seawaveQPlots. This numeric vector contains trend coefficients for the water-quality samples. It is provided here for use with examples of the internal functions.

Usage

exampletndlinexampletndlin

Format

A numeric vector of 115 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha,
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
head(exampletndlin)
data(swData)
head(exampletndlin)

Example numeric vector used internally.

Description

This is an example of data that is passed internally to seawaveQPlots. This numeric vector contains trend coefficients for a continuous water-quality prediction based on the continuous ancillary variables. It is provided here for use with examples of the internal functions.

Usage

exampletndlinprexampletndlinpr

Format

A numeric vector of 2,893 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
head(exampletndlinpr)
data(swData)
head(exampletndlinpr)

Example numeric vector used internally.

Description

This is an example of data that is passed internally to seawaveQPlots. This numeric vector contains decimal seasonal (0-1) values for the water-quality samples. It is provided here for use with examples of the internal functions.

Usage

exampletseasexampletseas

Format

A numeric vector of 115 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
head(exampletseas)
data(swData)
head(exampletseas)

Example numeric vector used internally.

Description

This is an example of data that is passed internally to seawaveQPlots. This numeric vector contains decimal seasonal (0-1) values for the continuous ancillary data. It is provided here for use with examples of the internal functions.

Usage

exampletseasprexampletseaspr

Format

A numeric vector of 2,893 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
head(exampletseaspr)
data(swData)
head(exampletseaspr)

Example numeric vector used internally.

Description

This is an example of data that is passed internally to seawaveQPlots. This numeric vector contains decimal dates for the water-quality samples. It is provided here for use with examples of the internal functions.

Usage

exampletyrexampletyr

Format

A numeric vector of 115 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
head(exampletyr)
data(swData)
head(exampletyr)

Example numeric vector used internally.

Description

This is an example of data that is passed internally to seawaveQPlots. This numeric vector contains decimal dates for continuous ancillary variables. It is provided here for use with examples of the internal functions.

Usage

exampletyrprexampletyrpr

Format

A numeric vector of 2,893 observations.

Source

Internal data captured from the following function call:

fitswavecav(cdat=modMoRivOmaha, cavdat=cqwMoRivOmaha, 
                       tanm="myexample", pnames=c("04041"), yrstart=1995, 
                       yrend=2003, tndbeg=1995, tndend=2003, 
                       iwcav=c("flowa30","flowa1"), dcol="dates", 
                       qwcols=c("R","P"))

Examples

data(swData)
head(exampletyrpr)
data(swData)
head(exampletyrpr)

Internal function that fits the seawaveQ model.

Description

fitMod is called from within fitswavecav but can be invoked directly. It fits the seawaveQ model and returns the results.

Usage

fitMod(
  cdatsub,
  cavdat,
  yrstart,
  yrend,
  tndbeg,
  tndend,
  tanm,
  pnames,
  qwcols,
  mclass = 1,
  numk = 4,
  plotfile = FALSE,
  textfile = FALSE
)
fitMod(
  cdatsub,
  cavdat,
  yrstart,
  yrend,
  tndbeg,
  tndend,
  tanm,
  pnames,
  qwcols,
  mclass = 1,
  numk = 4,
  plotfile = FALSE,
  textfile = FALSE
)

Arguments

`cdatsub`	is the concentration data.
`cavdat`	is the continuous (daily) ancillary data.
`yrstart`	is the starting year of the analysis (treated as January 1 of that year).
`yrend`	is the ending year of the analysis (treated as December 31 of that year).
`tndbeg`	is the beginning (in whole or decimal years) of the trend period.
`tndend`	is the end (in whole or decimal years) of the trend period.
`tanm`	is a character identifier that names the trend analysis run. It is used to label output files.
`pnames`	is the parameter (water-quality constituents) to analyze (if using USGS parameters, omit the starting 'P', such as "00945" for sulfate).
`qwcols`	is a character vector with the beginning of the column headers for remarks code (default is R), and beginning of column headers for concentration data (default is P for parameter).
`mclass`	indicates the class of model to use. A class 1 model is the the traditional SEAWAVE-Q model that has a linear time trend. A class 2 model is a newer option for longer trend periods that uses a set of restricted cubic splines on the time variable to provide a more flexible model. The default is 1. (Harrell, 2010, 2018).
`numk`	is the number of knots in the restricted cubic spline model (mclass = 2). The default is 4, and the recommended number is 3–7.
`plotfile`	is by default FALSE. True will write pdf files of plots to the user's file system.
`textfile`	is by default FALSE. True will write text output files to the user's file system. These files are useful for detailed model comparisons, documenting session information, and for model archives.

Value

A PDF file (if plotfile is TRUE) containing plots (see seawaveQPlots), a text file showing the best model survival regression call and results, and a list. The first element of the list contains information about the data and the model(s) selected (see examplestpars). The second element of the list contains the summary of the survival regression call. The third element of the list is itself a list containing the observed concentrations (censored and uncensored) and the predicted concentrations used by seawaveQPlots or seawaveQPlots2 to generate the plots.

Author(s)

Aldo V. Vecchia and Karen R. Ryberg

References

Allison, P.D., 1995, Survival analysis using the SAS system—A practical guide: Cary, N.C., SAS Institute, Inc., 304 p.

Harrell, F.E., Jr., 2010, Regression modeling strategies—With applications to linear models, logistic regression, and survival analysis: New York, Springer-Verlag, 568 p.

Harrell, F.E., Jr., 2018, rms—Regression modeling strategies: R package version 5.1-2, https://CRAN.R-project.org/package=rms.

Examples

data(swData)
myRes <- fitMod(cdatsub = examplecdatsub, cavdat = examplecavdat, 
yrstart = 1995, yrend = 2003, tndbeg = 1995, tndend = 2003, tanm = "myfit3", 
pnames = c("04041"), qwcols = c("R", "P"), plotfile = FALSE,
textfile = FALSE)
data(swData)
myRes <- fitMod(cdatsub = examplecdatsub, cavdat = examplecavdat, 
yrstart = 1995, yrend = 2003, tndbeg = 1995, tndend = 2003, tanm = "myfit3", 
pnames = c("04041"), qwcols = c("R", "P"), plotfile = FALSE,
textfile = FALSE)

Fit seasonal wave and continuous ancillary data for trend analysis

Description

Function to prepare data and fit the seawaveQ model.

Usage

fitswavecav(
  cdat,
  cavdat,
  tanm = "trend1",
  pnames,
  yrstart = 0,
  yrend = 0,
  tndbeg = 0,
  tndend = 0,
  iwcav = c("none"),
  dcol = "dates",
  qwcols = c("R", "P"),
  mclass = 1,
  numk = 4,
  alpha = 0.1,
  bootRCS = FALSE,
  nboot = 1000,
  plotfile = FALSE,
  textfile = FALSE
)
fitswavecav(
  cdat,
  cavdat,
  tanm = "trend1",
  pnames,
  yrstart = 0,
  yrend = 0,
  tndbeg = 0,
  tndend = 0,
  iwcav = c("none"),
  dcol = "dates",
  qwcols = c("R", "P"),
  mclass = 1,
  numk = 4,
  alpha = 0.1,
  bootRCS = FALSE,
  nboot = 1000,
  plotfile = FALSE,
  textfile = FALSE
)

Arguments

`cdat`	is the concentration data
`cavdat`	is the continuous (daily) ancillary data
`tanm`	is a character identifier that names the trend analysis run. It is used to label output files.
`pnames`	are the parameters (water-quality constituents) to analyze (omit the starting character, for example for sulfate data indicated by P00945, enter "00945").
`yrstart`	is the starting year of the analysis (treated as January 1 of that year). Zero means the start date will be determined by the start date of cavdat, the continuous ancillary data.
`yrend`	is the ending year of the analysis (treated as December 31 of that year). Zero means the end date will be determined by the end date of cavdat, the continuous ancillary data.
`tndbeg`	is the beginning (in whole or decimal years) of the trend period. Zero means the begin date will be the beginning of the concentration data, cdat.
`tndend`	is the end of the trend (treated as December 31 of that year). Zero means the end date will be the end of the concentration data, cdat.
`iwcav`	is a character vector indicating which continuous ancillary variables to include, if none are used for analysis, use iwcav=c("none").
`dcol`	is the column name for the dates, should be the same for both cdat and cavdat
`qwcols`	is a character vector with the beginning of the column headers for remarks code (default is R), and beginning of column headers for concentration data (default is P for parameter).
`mclass`	indicates the class of model to use. A class 1 model is the traditional SEAWAVE-Q model that has a linear time trend. A class 2 model is a newer option for longer trend periods that uses a set of restricted cubic splines on the time variable to provide a more flexible model.
`numk`	is the number of knots in the restricted cubic spline model. The default is 4, and the recommended number is 3–7.
`alpha`	is the significance level or alpha values for statistical significance and confidence intervals.
`bootRCS`	is a logical value indicating whether or not to perform block bootstrapping for an attained significance level for the trend with restricted cubic splines. No bootstrapping is performed for the linear trend model.
`nboot`	is the number of bootstrap replicates. A large number, 10,000, is recommended, but this takes a long time.
`plotfile`	is by default FALSE. True will write pdf files of plots to the user's file system.
`textfile`	is by default FALSE. True will write text output files to the user's file system. These files are useful for detailed model comparisons, documenting session information, and for model archives.

Format

The data frame returned as the first element of the output list has one row for each chemical analyzed and the number of columns depends on the number of continuous ancillary variables used. The general format is as follows:

pname	character	parameter analyzed
mclass	numeric	a value of 1 or 2
jmod	numeric	the choice of pulse input function, an integer 1--14
hlife	numeric	the model half-life in months, an integer, 1 to 4 months
cmaxt	numeric	the decimal season of maximum concentration
scl	numeric	the scale factor from the `survreg.object`
loglik	numeric	the log-likelihood for the model
cint	numeric	coefficient for model intercept
cwave	numeric	coefficient for the seasonal wave
ctnd[alpahnumeric]	numeric	coefficient(s) for the trend component(s) of model
c[alphanumeric]	numeric	0 or more coefficients for the continuous ancillary variables
seint	numeric	standard error for the intercept
sewave	numeric	standard error for the seasonal wave
setnd[alphanumeric]	numeric	standard error for the trend component(s)
se[alphanumeric]	numeric	0 or more standard errors for the continuous ancillary variables
pvaltnd[alphanumeric]	numeric	the p-value for the trend component(s)

The data frame returned as the sixth element of the output list has one row for each chemical analyzed. The general format for linear models is described in pesticideTrendCalcs. The format for restricted cubic spline models is as described as follows:

pname	character	parameter analyzed
mclass	numeric	A value of 1 or 2
baseConc	numeric	the concentration at the beginning of the trend period
endCon	numeric	the concentration at the end of the trend period
rcsctndPpor	numeric	the concentration trend in percent over the trend period
rcsctndOrigPORPercentBase	numeric	the conc. trend in original units over period of record (calc. based on percent per year and base conc.)
pvalrcstnd	numeric	p-value, attained significance level, based on bootstrapping
ctndlklhd	numeric	trend likelihood

Details

Fits the seawaveQ model (Vecchia and others, 2008) using a seasonal wave and continuous ancillary variables (streamflow anomalies and other continuous variables such as conductivity or sediment) to model water quality. The version in the 2.0.0 update to the R package has an option to use restricted cubic splines as a more flexible definition of the temporal trend.

Value

A PDF file containing plots of the data and modeled concentration, a text file containing a summary of the survival regression call for each model selected, and a list. The first element of the list is a data frame described under format. The second element of the list is the summary of the survival regression call. The third element is the observed concentration data (censored and uncensored). The fourth element is the concentration data predicted by the model. The fifth element provides summary statistics for the predicted concentrations. The sixth element is a data frame that provides a summary of the trends with the columns pname (parameter name), mclass (a value of 1, indicating a linear trend model, a value of 2 for models using restricted cubic splines), and the columns describing the trends. For linear models see pesticideTrendCalcs. For models with restricted cubic splines, see the format section. See Ryberg and York (2020) for additional details.

Note

The assumed data format is one with columns for water-quality concentration values and a related column for qualification of those values, such as in the case of left-censored values less than a particular value. For example, a water-quality sample was collected and the laboratory analysis indicated that the concentration was less than 0.01 micrograms per liter. The USGS parameter code for simazine is 04035 (U.S. Geological Survey, 2018b). When the data are retrieved through the National Water Information System: Web Interface (https://waterdata.usgs.gov/nwis; U.S. Geological Survey, 2018a), the concentration values are in a column labeled P04035 and the qualification information, or remark codes, are in a column labeled R04035. To use this function, the argument pnames would be the unique identifier for simazine values and qualifications, 04035, and the qwcols argument would be c("R", "P") to indicate that the qualification column starts with an R and the values column starts with a P.

Other users may have data in different formats that can be modified to use with this function. For example, a user may have concentration values and qualification codes in one column, such as a column labeled simazine with the values 0.05, 0.10, <0.01, <0.01, and 0.90. In this case, the less thans and any other qualification codes should be placed in a separate column. The column names for the qualification codes and the concentration values should be the same with the exception of different beginning letters to indicate which column is which. The columns could be named Rsimazine and Psimazine. Then the argument pnames = "simazine" and the argument qwcols = c("R", "P").

Users should exercise caution when their water-quality data have multiple censoring limits and may want to recensor the data to a single censoring level. Censoring and recensoring issues are discussed in the text and Appendix 1 of Ryberg and others (2010).

None of the variations of SEAWAVE is a simple model. The model complexity increases with flow anomalies and with the addition of restricted cubic splines. As the number of parameters in the model increases or the degree of censoring increases, the sample size must also increase. See Ryberg and others (2020) for more details on sample size.

Author(s)

Aldo V. Vecchia and Karen R. Ryberg

References

Ryberg, K.R. and York, B.C., 2020, seawaveQ—An R package providing a model and utilities for analyzing trends in chemical concentrations in streams with a seasonal wave (seawave) and adjustment for streamflow (Q) and other ancillary variables: U.S. Geological Survey Open-File Report 2020–1082, 25 p., with 4 appendixes.

Ryberg, K.R., Vecchia, A.V., Martin, J.D., and Gilliom, R.J., 2010, Trends in pesticide concentrations in urban streams in the United States, 1992–2008: U.S. Geological Survey Scientific Investigations Report 2010–5139, 101 p., https://pubs.usgs.gov/sir/2010/5139/.

U.S. Geological Survey, 2018a, National Water Information System: Web Interface, accessed July 7, 2018, at https://waterdata.usgs.gov/nwis.

U.S. Geological Survey, 2018b, Parameter code definition: National Water Information System: Web Interface, accessed July 18, 2018, at https://nwis.waterdata.usgs.gov/usa/nwis/pmcodes.

Examples

data(swData)
modMoRivOmaha <- combineData(qwdat = qwMoRivOmaha, cqwdat = cqwMoRivOmaha)
myfitLinearTrend <- fitswavecav(cdat = modMoRivOmaha, cavdat = cqwMoRivOmaha, 
tanm = "myfitLinearTrend", pnames = c("04035", "04037", "04041"), yrstart = 1995, 
yrend = 2003, tndbeg = 1995, tndend = 2003, iwcav = c("flowa30", "flowa1"), 
dcol = "dates", qwcols = c("R", "P"))
# trend model results
myfitLinearTrend[[1]]
# example regression call
myfitLinearTrend[[2]][[1]]
# first few lines of observed concentrations
head(myfitLinearTrend[[3]])
# first few lines of predicted concentrations
head(myfitLinearTrend[[4]])
# summary statistics for predicted concentrations
myfitLinearTrend[[5]]
# summary of trends
myfitLinearTrend[[6]]
myfitRCSTrend <- fitswavecav(cdat = modMoRivOmaha, cavdat = cqwMoRivOmaha, 
tanm = "myfitRCSTrend", pnames = c("04035", "04037", "04041"), yrstart = 1995, 
yrend = 2003, tndbeg = 1995, tndend = 2003, iwcav = c("flowa30", "flowa1"), 
dcol = "dates", qwcols = c("R", "P"), mclass = 2, numk = 4, bootRCS = FALSE,
plotfile = FALSE, textfile = FALSE)
data(swData)
modMoRivOmaha <- combineData(qwdat = qwMoRivOmaha, cqwdat = cqwMoRivOmaha)
myfitLinearTrend <- fitswavecav(cdat = modMoRivOmaha, cavdat = cqwMoRivOmaha, 
tanm = "myfitLinearTrend", pnames = c("04035", "04037", "04041"), yrstart = 1995, 
yrend = 2003, tndbeg = 1995, tndend = 2003, iwcav = c("flowa30", "flowa1"), 
dcol = "dates", qwcols = c("R", "P"))
# trend model results
myfitLinearTrend[[1]]
# example regression call
myfitLinearTrend[[2]][[1]]
# first few lines of observed concentrations
head(myfitLinearTrend[[3]])
# first few lines of predicted concentrations
head(myfitLinearTrend[[4]])
# summary statistics for predicted concentrations
myfitLinearTrend[[5]]
# summary of trends
myfitLinearTrend[[6]]
myfitRCSTrend <- fitswavecav(cdat = modMoRivOmaha, cavdat = cqwMoRivOmaha, 
tanm = "myfitRCSTrend", pnames = c("04035", "04037", "04041"), yrstart = 1995, 
yrend = 2003, tndbeg = 1995, tndend = 2003, iwcav = c("flowa30", "flowa1"), 
dcol = "dates", qwcols = c("R", "P"), mclass = 2, numk = 4, bootRCS = FALSE,
plotfile = FALSE, textfile = FALSE)

Water-quality data for 05586100 Illinois River at Valley City, Ill.

Description

Scatterplots of water-quality data for 05586100 Illinois River at Valley City, Ill. Simazine concentrations in the
Illinois River at Valley City, Ill.
Prometon concentrations in the
Illinois River at Valley City, Ill.
Metribuzin concentrations in the
Illinois River at Valley City, Ill.
EPTC concentrations in the
Illinois River at Valley City, Ill.

Usage

IllRivValleyCtyIllRivValleyCty

Format

A data frame containing 168 water-quality samples for 4 constituents. There are 20 variables.

site	character	Site abbreviation for study
staid	character	USGS Station identification number
dates	date	Date water-quality sample collected
yrc	numeric	Year
moc	numeric	Month
dac	numeric	Day
jdayc	numeric	Julian day from first day of associated streamflow data used
R04035	character	Remark code (blank, _, <, or E)
P04035	numeric	Simazine, water, filtered, recoverable, micrograms per liter
R04037	character	Remark code (blank, _, <, or E)
P04037	numeric	Prometon, water, filtered, recoverable, micrograms per liter
R82630	character	Remark code (blank, _, <, or E)
P82630	numeric	Metribuzin, water, filtered, recoverable, micrograms per liter
R82668	character	Remark code (blank, _, <, or E)
P82668	numeric	EPTC, water, filtered (0.7 micron glass fiber filter), recoverable, micrograms per liter
dflow	numeric	Streamflow, cubic feet per second
flowa30	numeric	30-day streamflow anomaly
flowa1	numeric	1-day streamflow anomaly
dsed	numeric	Sediment concentration, milligrams per liter
seda30	numeric	30-day sediment anomaly
seda1	numeric	1-day sediment anomaly

Details

Chemical concentration data are in the columns that start with a P and are followed by a number. Qualification codes for the concentration data are in the columns that start with an R followed by the same numbers as the associated concentration data. For example, column P04035 indicates simazine data, 04035, being the U.S. Geological Survey parameter code for simazine. The qualification codes for the simazine concentrations are found in the column R04035, indicating a U.S. Geological Survey remark code. Remark codes include _ or nothing, indicating no qualification of the value in the associated concentration field; <, indicating a censored value that is less than the number reported in the associated concentration field; and E, indicating that the value has been estimated. See Oblinger Childress and others (1999) for information on the remark codes used by the U.S. Geological Survey. The streamflow and sediment anomalies were generated using the R package waterData (Ryberg and Vecchia, 2012).

Source

Data provided by Patrick Phillips, U.S. Geological Survey, New York Water Science Center.

References

Oblinger Childress, C.J., Foreman, W.T., Connor, B.F., and Maloney, T.J., 1999, New reporting procedures based on long-term method detection levels and some considerations for interpretations of water-quality data provided by the U.S. Geological Survey Open-File Report 99–193, 19 p. [Also available at https://water.usgs.gov/owq/OFR_99-193/index.html.]

Examples

data(swData)

# summary of water-quality concentrations
apply(IllRivValleyCty[, grep("P[[:digit:]]", 
dimnames(IllRivValleyCty)[[2]])], 2, summary)

# scatter plot of simazine concentrations
cenScatPlot(IllRivValleyCty, pname = "04035")

# Simazine scatter plot with many additional plotting arguments

par(las = 1, tcl = 0.5)
cenScatPlot(IllRivValleyCty, pname = "04035", 
            site = "05586100 Illinois River at Valley City, Ill.",
            ylabel = "Simazine concentration, in micrograms per liter", 
            legcex = 0.7, 
            ylim = c(0, 0.4), yaxs = "i", 
            xlim = c(as.Date("1996-01-01"), as.Date("2012-01-01")), 
            xaxs = "i", xaxt = "n")
axdates <- c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01",
           "2012-01-01")
axis(1, as.Date(axdates), labels = c("1996", "2000", "2004", "2008", "2012"))

# Prometon scatter plot
cenScatPlot(IllRivValleyCty, pname = "04037", 
            site = "05586100 Illinois River at Valley City, Ill.",
            ylabel = "Prometon concentration, in micrograms per liter", 
            legcex = 0.7, 
            ylim = c(0, 0.06), yaxs = "i", 
            xlim=c(as.Date("1996-01-01"), as.Date("2012-01-01")), xaxs = "i", 
            xaxt = "n")
axdates<-c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01", "2012-01-01")
axis(1, as.Date(axdates), labels = c("1996", "2000", "2004", "2008","2012"))

# Metribuzin scatter plot
cenScatPlot(IllRivValleyCty, pname="82630", 
            site="05586100 Illinois River at Valley City, Ill.",
            ylabel="Metribuzin concentration, in micrograms per liter", 
            legcex=0.7, 
            ylim=c(0,0.3), yaxs="i", cex.lab=0.9, cex.axis=0.9,
            xlim=c(as.Date("1996-01-01"), 
                   as.Date("2012-01-01")), xaxs="i", 
            xaxt="n")
axdates<-c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01", 
           "2012-01-01")
axis(1, as.Date(axdates), labels=c("1996", "2000", "2004", "2008",
                                               "2012"), cex.axis=0.9)
# EPTC scatter plot
cenScatPlot(IllRivValleyCty, pname = "82668", 
            site = "05586100 Illinois River at Valley City, Ill.",
            ylabel = "EPTC concentration, in micrograms per liter", 
            legcex = 0.7, ylim = c(0, 0.08), yaxs = "i", 
            xlim=c(as.Date("1996-01-01"), as.Date("2012-01-01")), 
            xaxs = "i", xaxt = "n")
axdates <- c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01", 
           "2012-01-01")
axis(1, as.Date(axdates), labels = c("1996", "2000", "2004", "2008","2012"))
data(swData)

# summary of water-quality concentrations
apply(IllRivValleyCty[, grep("P[[:digit:]]", 
dimnames(IllRivValleyCty)[[2]])], 2, summary)

# scatter plot of simazine concentrations
cenScatPlot(IllRivValleyCty, pname = "04035")

# Simazine scatter plot with many additional plotting arguments

par(las = 1, tcl = 0.5)
cenScatPlot(IllRivValleyCty, pname = "04035", 
            site = "05586100 Illinois River at Valley City, Ill.",
            ylabel = "Simazine concentration, in micrograms per liter", 
            legcex = 0.7, 
            ylim = c(0, 0.4), yaxs = "i", 
            xlim = c(as.Date("1996-01-01"), as.Date("2012-01-01")), 
            xaxs = "i", xaxt = "n")
axdates <- c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01",
           "2012-01-01")
axis(1, as.Date(axdates), labels = c("1996", "2000", "2004", "2008", "2012"))

# Prometon scatter plot
cenScatPlot(IllRivValleyCty, pname = "04037", 
            site = "05586100 Illinois River at Valley City, Ill.",
            ylabel = "Prometon concentration, in micrograms per liter", 
            legcex = 0.7, 
            ylim = c(0, 0.06), yaxs = "i", 
            xlim=c(as.Date("1996-01-01"), as.Date("2012-01-01")), xaxs = "i", 
            xaxt = "n")
axdates<-c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01", "2012-01-01")
axis(1, as.Date(axdates), labels = c("1996", "2000", "2004", "2008","2012"))

# Metribuzin scatter plot
cenScatPlot(IllRivValleyCty, pname="82630", 
            site="05586100 Illinois River at Valley City, Ill.",
            ylabel="Metribuzin concentration, in micrograms per liter", 
            legcex=0.7, 
            ylim=c(0,0.3), yaxs="i", cex.lab=0.9, cex.axis=0.9,
            xlim=c(as.Date("1996-01-01"), 
                   as.Date("2012-01-01")), xaxs="i", 
            xaxt="n")
axdates<-c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01", 
           "2012-01-01")
axis(1, as.Date(axdates), labels=c("1996", "2000", "2004", "2008",
                                               "2012"), cex.axis=0.9)
# EPTC scatter plot
cenScatPlot(IllRivValleyCty, pname = "82668", 
            site = "05586100 Illinois River at Valley City, Ill.",
            ylabel = "EPTC concentration, in micrograms per liter", 
            legcex = 0.7, ylim = c(0, 0.08), yaxs = "i", 
            xlim=c(as.Date("1996-01-01"), as.Date("2012-01-01")), 
            xaxs = "i", xaxt = "n")
axdates <- c("1996-01-01", "2000-01-01", "2004-01-01", "2008-01-01", 
           "2012-01-01")
axis(1, as.Date(axdates), labels = c("1996", "2000", "2004", "2008","2012"))

Calculate annual loads and summarize trends

Description

Function to calculate pesticide loads in kilograms per year and summarize trends.

Usage

loadCalculations(
  dailyDat,
  pestPredict,
  modRes,
  concTrends,
  yrtype = 1,
  alpha = 0.1
)
loadCalculations(
  dailyDat,
  pestPredict,
  modRes,
  concTrends,
  yrtype = 1,
  alpha = 0.1
)

Arguments

`dailyDat`	is the daily streamflow data in the form of a data frame with three columns representing a station ID, date, and streamflow.
`pestPredict`	is the continuous (daily) estimation of pesticide concentrations for one or more pesticides at a single site. This should be in the form of the fourth element of the list returned by `fitswavecav`.
`modRes`	is the first element of the list returned by `fitswavecav` and includes the scale parameter for one or more pesticide trend models at a single site. The scale parameter is used in the bias correction.
`concTrends`	the SEAWAVE-Q trend in flow-normalized annual load. Cannot be different (computationally) from the trend in flow-normalized annual concentration when there is no trend in flow (Oelsner and others, 2017).
`yrtype`	allows one to calculate annual loads based on a calendar year or a water year, where a water year is the 12-month period October 1 through September 30 designated by the calendar year in which it ends. A yrtype of 1 represents a calendar year and is the default because that is the way the original model was developed. A yrtype of 2 represents a water year.
`alpha`	is the significance level or alpha value for statistical significance and confidence intervals.

Format

The first data frame returned has one row for each pesticide-year at a particular site and four columns. The general format is as follows:

pstaid	character	the station identification number
pcode	character	the parameter code for which a load was calculated
year or wyear	numeric	the year or water year for which a load was calculated
load	numeric	the load in kilograms per year

The second data frame returned has one row for each pesticide at a particular site and 11 columns. The general format is as follows:

pcode	character	the parameter code for which a load trend was calculated
mclass	numeric	a value of 1 or 2
mclass	numeric	a value of 1 or 2
alpha	numeric	a significance level
ltndPpor	numeric	the load trend in percent over the period of record
luciPpor	numeric	the load upper confidence interval for the trend in
		percent over the period of record
llciPpor	numeric	the load lower confidence interval for the trend in
		percent over the period of record
baseLoad	numeric	the base load, the load for the first year of trend period
ltndOrigPORPercentBase	numeric	the load trend in original units over
		the period of record
		(calculation based on percent per year and base load)
luciOrigPORPercentBase	numeric	the load trend upper confidence interval
		for the trend in original units over the period of record
		(calculation based on percent per year and base load)
llciOrigPORPercentBase	numeric	the load trend lower confidence interval
		for the trend in original units over the period of record
		(calculation based on percent per year and base load)
ltndlklhd	numeric	is the load trend likelihood

Details

Parameter load (mass) is the product of water-quality concentration (a mass per volume) and an associated streamflow rate (volume per time). This function generates an annual time series of pesticide loads on either a calendar year basis or a water year basis and summarizes load trends.

Value

Two data frames, the first contains the annual loads, the second contains the trend summary.

Note

In this load calculation function, daily pesticide concentration estimates provided by the fitswavecav function are corrected for retransformation bias (the concentration model is built on the base-10 logarithm of concentration; therefore, a bias correction is required when transforming back to the original units) and then used to calculate daily loads. The bias correction is based on the quasi-maximum likelihood estimator (Cohn and others, 1989) that was developed for natural logarithms, with an adjustment for the base-10 logarithm of the concentration. To calculate loads, the bias-corrected concentration estimates (assumed to be in micrograms per liter) are multiplied by daily streamflow and a constant, 0.892998605, which converts the load units (micrograms per liter * cubic feet per second) to kilograms per year. Daily loads are summed to annual values. See page 70 and equation 26 of Oelsner and others (2017) for further details regarding the load calculation and bias correction. Users may modify this function to convert to units other than kilograms per year.

Author(s)

Karen R. Ryberg

References

Cohn, T.A., DeLong, L.L., Gilroy, E.J., Hirsch, R.M., and Wells, D.K., 1989, Estimating constituent loads: Water Resources Research, v. 25, no. 5, p. 937–942.

Oelsner, G.P., Sprague, L.A., Murphy, J.C., Zuellig, R.E., Johnson, H.M., Ryberg, K.R., Falcone, J.A., Stets, E.G., Vecchia, A.V., Riskin, M.L., De Cicco, L.A., Mills, T.J., and Farmer, W.H., 2017, Water-quality trends in the Nation's rivers and streams, 1972–2012—Data preparation, statistical methods, and trend results (ver. 2.0, October 2017): U.S. Geological Survey Scientific Investigations Report 2017–5006, 136 p., https://doi.org/10.3133/sir20175006.

Examples

data(swData)
modMoRivOmaha <- combineData(qwdat = qwMoRivOmaha, cqwdat = cqwMoRivOmaha)
myfit1 <- fitswavecav(cdat = modMoRivOmaha, cavdat = cqwMoRivOmaha, 
tanm = "myfit1", pnames = c("04035", "04037", "04041"), yrstart = 1995, 
yrend = 2003, tndbeg = 1995, tndend = 2003, iwcav = c("flowa30", "flowa1"), 
dcol = "dates", qwcols = c("R", "P"))
MoRivOmahaLoadsYr <- loadCalculations(cqwMoRivOmaha[, 1:3], myfit1[[4]], 
myfit1[[1]], myfit1[[6]])
MoRivOmahaLoadsYr
data(swData)
modMoRivOmaha <- combineData(qwdat = qwMoRivOmaha, cqwdat = cqwMoRivOmaha)
myfit1 <- fitswavecav(cdat = modMoRivOmaha, cavdat = cqwMoRivOmaha, 
tanm = "myfit1", pnames = c("04035", "04037", "04041"), yrstart = 1995, 
yrend = 2003, tndbeg = 1995, tndend = 2003, iwcav = c("flowa30", "flowa1"), 
dcol = "dates", qwcols = c("R", "P"))
MoRivOmahaLoadsYr <- loadCalculations(cqwMoRivOmaha[, 1:3], myfit1[[4]], 
myfit1[[1]], myfit1[[6]])
MoRivOmahaLoadsYr

Summarize linear trends

Description

Internal function to summarize the trend results.

Usage

pesticideTrendCalcs(
  tndbeg,
  tndend,
  ctnd,
  pval,
  alpha,
  setnd,
  scl,
  baseConc,
  mclass
)
pesticideTrendCalcs(
  tndbeg,
  tndend,
  ctnd,
  pval,
  alpha,
  setnd,
  scl,
  baseConc,
  mclass
)

Arguments

`tndbeg`	is the beginning (in whole or decimal years) of the trend period. Zero means the begin date will be the beginning of the concentration data, cdat.
`tndend`	is the end of the trend (treated as December 31 of that year). Zero means the end date will be the end of the concentration data, cdat.
`ctnd`	is the concentration trend, the coefficient on the time variable.
`pval`	is the p-value for the linear trend component.
`alpha`	is the significance level or alpha value for statistical significance and confidence intervals.
`setnd`	is the standard error for the linear trend component.
`scl`	is the scale factor from the `survreg.object`.
`baseConc`	is the base concentration, the median concentration (midpoint of the trend line) for the first year of the trend analysis .
`mclass`	indicates the class of model to use. A class 1 model is the the traditional SEAWAVE-Q model that has a linear time trend. A class 2 model is a newer option for longer trend periods that uses a set of restricted cubic splines on the time variable to provide a more flexible model.

Format

The data frame returned has one row for each chemical analyzed and the number of columns are defined as follows:

pname	character	parameter analyzed
mclass	numeric	a value of 1 or 2
alpha	numeric	a significance level
ctndPpor	numeric	the concentration trend in percent over the period of record
cuciPpor	numeric	the concentration upper confidence interval for the trend in
		percent over the period of record
clciPpor	numeric	the concentration lower confidence interval for the trend in
		percent over the period of record
baseConc	numeric	the base concentration, median concentration or midpoint of
		trend line, for first year of trend period
ctndOrigPORPercentBase	numeric	the concentration trend in original units over
		the period of record
		(calculation based on percent per year and base concentration)
cuciOrigPORPercentBase	numeric	the concentration trend upper confidence interval
		for the trend in original units over the period of record
		(calculation based on percent per year and base concentration)
clciOrigPORPercentBase	numeric	the concentration trend lower confidence interval
		for the trend in original units over the period of record
		(calculation based on percent per year and base concentration)
ctndlklhd	numeric	is the concentration trend likelihood

Details

pesticideTrendCalcs is called from within fitswavecav

Value

The data frame returned has one row for each chemical analyzed and summaries of the trend.

Note

Based on trend calculations used to display and summarize pesticide trends here https://nawqatrends.wim.usgs.gov/swtrends/. A likelihood value that is the functional equivalent of the two-sided p-value associated with the significance level of the trend was determined as follows: Likelihood = (1 - (p-value / 2)), where p-value is the p-value for the trend coefficient (Oelsner and others, 2017).

Author(s)

Karen R. Ryberg

References

Oelsner, G.P., Sprague, L.A., Murphy, J.C., Zuellig, R.E., Johnson, H.M., Ryberg, K.R., Falcone, J.A., Stets, E.G., Vecchia, A.V., Riskin, M.L., De Cicco, L.A., Mills, T.J., and Farmer, W.H., 2017, Water-quality trends in the Nation's rivers and streams, 1972–2012—Data preparation, statistical methods, and trend results (ver. 2.0, October 2017): U.S. Geological Survey Scientific Investigations Report 2017–5006, 136 p., https://doi.org/10.3133/sir20175006.

Prepares concentration data and continuous ancillary data

Description

prepData is usually called from within fitswavecav but can be invoked directly. It performs some date calculations, removes rows with missing values for concentration or continuous variables, and returns the concentration and continuous ancillary data to be used by fitswavecav and its other internal functions.

Usage

prepData(cdat, cavdat, yrstart, yrend, dcol, pnames, iwcav, qwcols)
prepData(cdat, cavdat, yrstart, yrend, dcol, pnames, iwcav, qwcols)

Arguments

`cdat`	is the concentration data.
`cavdat`	is the continuous (daily) ancillary data.
`yrstart`	is the starting year of the analysis (treated as January 1 of that year). Zero means the start date will be determined by the start date of cavdat, the continuous ancillary data.
`yrend`	is the ending year of the analysis (treated as December 31 of that year). Zero means the end date will be determined by the end date of cavdat, the continuous ancillary data.
`dcol`	is the column name for the dates, should be the same for both cdat and cavdat.
`pnames`	are the parameters (water-quality constituents) to analyze (if using USGS parameters, omit the starting 'P', such as "00945" for sulfate).
`iwcav`	is a character variable indicating which continuous ancillary variables to include, if none use iwcav=c("none").
`qwcols`	is a character vector with the beginning of the column headers for remarks code (default is R), and beginning of column headers for concentration data (default is P for parameter).

Value

A list. The first element is the concentration data with additional date information, missing values removed, and extra columns removed. The second element is the continuous ancillary data with additional date information, missing values removed, and extra columns removed.

Author(s)

Aldo V. Vecchia and Karen R. Ryberg

Examples

data(swData)
modMoRivOmaha<-combineData(qwdat=qwMoRivOmaha, cqwdat=cqwMoRivOmaha)
preppedDat <- prepData(modMoRivOmaha, cqwMoRivOmaha, yrstart=1995, 
yrend=2003, dcol="dates", pnames=c("04035", "04037", "04041"),  
iwcav=c("flowa30","flowa1"), qwcols=c("R","P"))
data(swData)
modMoRivOmaha<-combineData(qwdat=qwMoRivOmaha, cqwdat=cqwMoRivOmaha)
preppedDat <- prepData(modMoRivOmaha, cqwMoRivOmaha, yrstart=1995, 
yrend=2003, dcol="dates", pnames=c("04035", "04037", "04041"),  
iwcav=c("flowa30","flowa1"), qwcols=c("R","P"))

Water-quality data for 06610000 Missouri River at Omaha, Nebr.

Description

Scatterplots of water-quality data for 06610000 Missouri River at Omaha, Nebr.

Simazine concentrations in the
Missouri River at Omaha, Nebr.
Prometon concentrations in the
Missouri River at Omaha, Nebr.
Cyanazine concentrations in the
Missouri River at Omaha, Nebr.
Metolachlor concentrations in the
Missouri River at Omaha, Nebr.
Alachlor concentrations in the
Missouri River at Omaha, Nebr.
Metribuzin concentrations in the
Missouri River at Omaha, Nebr.
Trifluralin concentrations in the
Missouri River at Omaha, Nebr.
EPTC concentrations in the
Missouri River at Omaha, Nebr.

Usage

qwMoRivOmahaqwMoRivOmaha

Format

A data frame containing 115 water-quality samples for eight chemical constituents. There are 20 variables.

staid	character	USGS Station identification number
dates	date	Date water-quality sample collected
times	numeric	Time sample was collected
R04035	character	Remark code (blank, _, <, or E)
P04035	numeric	Simazine, water, filtered, recoverable, micrograms per liter
R04037	character	Remark code (blank, _, <, or E)
P04037	numeric	Prometon, water, filtered, recoverable, micrograms per liter
R04041	character	Remark code (blank, _, <, or E)
P04041	numeric	Cyanazine, water, filtered, recoverable, micrograms per liter
R39415	character	Remark code (blank, _, <, or E)
P39415	numeric	Metolachlor, water, filtered, recoverable, micrograms per liter
R46342	character	Remark code (blank, _, <, or E)
P46342	numeric	Alachlor, water, filtered, recoverable, micrograms per liter
R82630	character	Remark code (blank, _, <, or E)
P82630	numeric	Metribuzin, water, filtered, recoverable, micrograms per liter
R82661	character	Remark code (blank, _, <, or E)
P82661	numeric	Trifluralin, water, filtered (0.7 micron glass fiber filter), recoverable, micrograms per liter
R82668	character	Remark code (blank, _, <, or E)
P82668	numeric	EPTC, water, filtered (0.7 micron glass fiber filter), recoverable, micrograms per liter

Details

Source

Data provided by Patrick Phillips, U.S. Geological Survey, New York Water Science Center.

References

Oblinger Childress, C.J., Foreman, W.T., Connor, B.F., and Maloney, T.J., 1999, New reporting procedures based on long-term method detection levels and some considerations for interpretations of water-quality data provided by the U.S. Geological Survey Open-File Report 99–193, 19 p. [Also available at http://water.usgs.gov/owq/OFR_99-193/index.html.]

Examples

  data(swData)
  
  # summary of water-quality concentrations
  apply(qwMoRivOmaha[, grep("P[[:digit:]]", dimnames(qwMoRivOmaha)[[2]])], 
  2, summary)

  # scatter plot of Simazine concentrations
  cenScatPlot(qwMoRivOmaha, pname = "04035")
  
  # scatter plot with many additional plotting arguments
  par(las = 1, tcl = 0.5)
  cenScatPlot(qwMoRivOmaha, pname = "04035", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Simazine concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
            "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))
  
  # Prometon scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "04037", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Prometon concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), 
                     as.Date("2004-01-01")), xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Cyanazine scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "04041", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Cyanazine concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"), 
              ylim = c(0.001, 5), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n", log = "y")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Metolachlor scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "39415", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Metolachlor concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0.001,5), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n", log = "y", legpos = "bottomleft")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Alachlor scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "46342", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Alachlor concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))
  
  # Metribuzin scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "82630", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Metribuzin concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Trifluralin scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "82661", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Trifluralin concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.03), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates<-c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # EPTC scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "82668", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "EPTC concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0.001, 1), yaxs = "i",
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n", log = "y")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels =c ("1996", "1998", "2000", "2002", "2004"))
data(swData)
  
  # summary of water-quality concentrations
  apply(qwMoRivOmaha[, grep("P[[:digit:]]", dimnames(qwMoRivOmaha)[[2]])], 
  2, summary)

  # scatter plot of Simazine concentrations
  cenScatPlot(qwMoRivOmaha, pname = "04035")
  
  # scatter plot with many additional plotting arguments
  par(las = 1, tcl = 0.5)
  cenScatPlot(qwMoRivOmaha, pname = "04035", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Simazine concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
            "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))
  
  # Prometon scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "04037", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Prometon concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), 
                     as.Date("2004-01-01")), xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Cyanazine scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "04041", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Cyanazine concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"), 
              ylim = c(0.001, 5), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n", log = "y")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Metolachlor scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "39415", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Metolachlor concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0.001,5), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n", log = "y", legpos = "bottomleft")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Alachlor scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "46342", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Alachlor concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))
  
  # Metribuzin scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "82630", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Metribuzin concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.1), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # Trifluralin scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "82661", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "Trifluralin concentration, in micrograms per liter",
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0, 0.03), yaxs = "i", 
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n")
  axdates<-c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels = c("1996", "1998", "2000", "2002", "2004"))

  # EPTC scatter plot
  cenScatPlot(qwMoRivOmaha, pname = "82668", 
              site = "06610000 Missouri River at Omaha, Nebr.",
              ylabel = "EPTC concentration, in micrograms per liter", 
              legcex = 0.7, qwcols = c("R", "P"),
              ylim = c(0.001, 1), yaxs = "i",
              xlim = c(as.Date("1996-01-01"), as.Date("2004-01-01")), 
              xaxs = "i", xaxt = "n", log = "y")
  axdates <- c("1996-01-01", "1998-01-01", "2000-01-01", "2002-01-01", 
             "2004-01-01")
  axis(1, as.Date(axdates), labels =c ("1996", "1998", "2000", "2002", "2004"))

Internal function that generates plots of data and model results.

Description

seawaveQPlots is usually called from within fitMod but can be invoked directly. It generates plots of data and model results, as well as diagnostic plots, and returns the observed and predicted concentrations so that users may plot the concentrations using their own functions.

Usage

seawaveQPlots(
  stpars,
  cmaxt,
  tseas,
  tseaspr,
  tndlin,
  tndlinpr,
  cdatsub,
  cavdat,
  cavmat,
  clog,
  centmp,
  yrstart,
  yrend,
  tyr,
  tyrpr,
  pnames,
  tanm,
  mclass = 1,
  plotfile = FALSE
)
seawaveQPlots(
  stpars,
  cmaxt,
  tseas,
  tseaspr,
  tndlin,
  tndlinpr,
  cdatsub,
  cavdat,
  cavmat,
  clog,
  centmp,
  yrstart,
  yrend,
  tyr,
  tyrpr,
  pnames,
  tanm,
  mclass = 1,
  plotfile = FALSE
)

Arguments

`stpars`	is a matrix of information about the best seawaveQ model for the concentration data, see `examplestpars`.
`cmaxt`	is the decimal season of maximum chemical concentration.
`tseas`	is the decimal season of each concentration value in cdatsub.
`tseaspr`	is the decimal season date used to model concentration using the continuous data set cavdat.
`tndlin`	is the decimal time centered on the midpoint of the trend for the sample data, cdatasub.
`tndlinpr`	is the decimal time centered on the midpoint of the trend for the continuous data, cavdat.
`cdatsub`	is the concentration data.
`cavdat`	is the continuous (daily) ancillary data.
`cavmat`	is a matrix containing the continuous ancillary variables.
`clog`	is a vector of the base-10 logarithms of the concentration data.
`centmp`	is a logical vector indicating which concentration values are censored.
`yrstart`	is the starting year of the analysis (treated as January 1 of that year).
`yrend`	is the ending year of the analysis (treated as December 31 of that year).
`tyr`	is a vector of decimal dates for the concentration data.
`tyrpr`	is a vector of decimal dates for the continuous ancillary variables.
`pnames`	is the parameter (water-quality constituents) to analyze (if using USGS parameters, omit the starting 'P', such as "00945" for sulfate).
`tanm`	is a character identifier that names the trend analysis run. It is used to label output files.
`mclass`	has not been implemented yet, but will provide additional model options.
`plotfile`	is by default FALSE. True will write pdf files of plots to the user's file system.

Value

A PDF file containing plots of the data and modeled concentrations and regression diagnostic plots and a list containing the observed concentrations (censored and uncensored) and the predicted concentrations used for the plot.

Note

The plotting position used for representing censored values in the plots produced by seawaveQPlots is an important consideration for interpreting model fit. Plotting values obtained by using the censoring limit, or something smaller such as one-half of the censoring limit, produce plots that are difficult to interpret if there are a large number of censored values. Therefore, to make the plots more representative of diagnostic plots used for standard (non-censored) regression, a method for substituting randomized residuals in place of censored residuals was used. If a log-transformed concentration is censored at a particular limit, logC < L, then the residual for that concentration is censored as well, logC - fitted(logC) < L - fitted(logC) = rescen. In that case, a randomized residual was generated from a conditional normal distribution

resran <- scl * qnorm(runif(1) * pnorm(rescen / scl)),

where scl is the scale parameter from the survival regression model, pnorm is the R function for computing cumulative normal probabilities, runif is the R function for generating a random variable from the uniform distribution, and qnorm is the R function for computing quantiles of the normal distribution. Under the assumption that the model residuals are uncorrelated, normally distributed random variables with mean zero and standard deviation scl, the randomized residuals generated in this manner are an unbiased sample of the true (but unknown) residuals for the censored data. This is an application of the probability integral transform (Mood and others, 1974) to generate random variables from continuous distributions. The plotting position used a censored concentration is fitted(logC) + resran. Note that each time a new model fit is performed, a new set of randomized residuals is generated and thus the plotting positions for censored values can change.

Author(s)

Aldo V. Vecchia and Karen R. Ryberg

References

Mood, A.M., Graybill, F.A., and Boes, D.C., 1974, Introduction to the theory of statistics (3rd ed.): New York, McGraw-Hill, Inc., 564 p.

Examples

data(swData)
myPlots <- seawaveQPlots(stpars=examplestpars, cmaxt=0.4808743, 
tseas=exampletseas, tseaspr=exampletseaspr, tndlin=exampletndlin,
tndlinpr=exampletndlinpr, cdatsub=examplecdatsub, cavdat=examplecavdat, 
cavmat=examplecavmat, clog=exampleclog, centmp=examplecentmp, 
yrstart=1995, yrend=2003, tyr=exampletyr, tyrpr=exampletyrpr, 
pnames=c("04041"), tanm="examplePlots04041", plotfile = FALSE)
data(swData)
myPlots <- seawaveQPlots(stpars=examplestpars, cmaxt=0.4808743, 
tseas=exampletseas, tseaspr=exampletseaspr, tndlin=exampletndlin,
tndlinpr=exampletndlinpr, cdatsub=examplecdatsub, cavdat=examplecavdat, 
cavmat=examplecavmat, clog=exampleclog, centmp=examplecentmp, 
yrstart=1995, yrend=2003, tyr=exampletyr, tyrpr=exampletyrpr, 
pnames=c("04041"), tanm="examplePlots04041", plotfile = FALSE)

Internal function that generates plots of data and model results.

Description

seawaveQPlots2 is usually called from within fitMod but can be invoked directly. It generates plots of data and model results, as well as diagnostic plots, and returns the observed and predicted concentrations so that users may plot the concentrations using their own functions. This is the version for models that use restricted cubic splines.

Usage

seawaveQPlots2(
  stpars,
  cmaxt,
  tseas,
  tseaspr,
  tndrcs,
  tndrcspr,
  cdatsub,
  cavdat,
  cavmat,
  clog,
  centmp,
  yrstart,
  yrend,
  tyr,
  tyrpr,
  pnames,
  tanm,
  mclass = 2,
  numk,
  plotfile = FALSE
)
seawaveQPlots2(
  stpars,
  cmaxt,
  tseas,
  tseaspr,
  tndrcs,
  tndrcspr,
  cdatsub,
  cavdat,
  cavmat,
  clog,
  centmp,
  yrstart,
  yrend,
  tyr,
  tyrpr,
  pnames,
  tanm,
  mclass = 2,
  numk,
  plotfile = FALSE
)

Arguments

`stpars`	is a matrix of information about the best seawaveQ model for the concentration data, see `examplestpars`.
`cmaxt`	is the decimal season of maximum chemical concentration.
`tseas`	is the decimal season of each concentration value in cdatsub.
`tseaspr`	is the decimal season date used to model concentration using the continuous data set cavdat.
`tndrcs`	is the decimal time centered on the midpoint of the trend for the sample data, cdatasub, then converted to a linear tail-restricted cubic spline with a particular number of knots (Harrell, 2010, 2018).
`tndrcspr`	is the decimal time centered on the midpoint of the trend for the continuous data, cavdat, then converted to a linear tail-restricted cubic spline using the knots from tndrcs.
`cdatsub`	is the concentration data.
`cavdat`	is the continuous (daily) ancillary data.
`cavmat`	is a matrix containing the continuous ancillary variables.
`clog`	is a vector of the base-10 logarithms of the concentration data.
`centmp`	is a logical vector indicating which concentration values are censored.
`yrstart`	is the starting year of the analysis (treated as January 1 of that year).
`yrend`	is the ending year of the analysis (treated as December 31 of that year).
`tyr`	is a vector of decimal dates for the concentration data.
`tyrpr`	is a vector of decimal dates for the continuous ancillary variables.
`pnames`	is the parameter (water-quality constituents) to analyze (if using USGS parameters, omit the starting 'P', such as "00945" for sulfate).
`tanm`	is a character identifier that names the trend analysis run. It is used to label output files.
`mclass`	indicates the class of model to use. A class 1 model is the traditional SEAWAVE-Q model that has a linear time trend. A class 2 model is a newer option for longer trend periods that uses a set of restricted cubic splines on the time variable to provide a more flexible model.
`numk`	is the number of knots in the restricted cubic spline model. The default is 4, and the recommended number is 3–7.
`plotfile`	is by default FALSE. True will write pdf files of plots to the user's file system.

Value

A PDF file (if plotfile is TRUE) containing plots of the data and modeled concentrations and regression diagnostic plots and a list containing the observed concentrations (censored and uncensored) and the predicted concentrations used for the plot.

Note

The plotting position used for representing censored values in the plots produced by seawaveQPlots2 is an important consideration for interpreting model fit. Plotting values obtained by using the censoring limit, or something smaller such as one-half of the censoring limit, produce plots that are difficult to interpret if there are a large number of censored values. Therefore, to make the plots more representative of diagnostic plots used for standard (non-censored) regression, a method for substituting randomized residuals in place of censored residuals was used. If a log-transformed concentration is censored at a particular limit, logC < L, then the residual for that concentration is censored as well, logC - fitted(logC) < L - fitted(logC) = rescen. In that case, a randomized residual was generated from a conditional normal distribution

resran <- scl * qnorm(runif(1) * pnorm(rescen / scl)),

where scl is the scale parameter from the survival regression model, pnorm is the R function for computing cumulative normal probabilities, runif is the R function for generating a random variable from the uniform distribution, and qnorm is the R function for computing quantiles of the normal distribution. Under the assumption that the model residuals are uncorrelated, normally distributed random variables with mean zero and standard deviation scl, the randomized residuals generated in this manner are an unbiased sample of the true (but unknown) residuals for the censored data. This is an application of the probability integral transform (Mood and others, 1974) to generate random variables from continuous distributions. The plotting position used for a censored concentration is fitted(logC) + resran. Note that each time a new model fit is performed, a new set of randomized residuals is generated and thus the plotting positions for censored values can change.

Author(s)

Aldo V. Vecchia and Karen R. Ryberg

References

Harrell, F.E., Jr., 2010, Regression modeling strategies—With applications to linear models, logistic regression, and survival analysis: New York, Springer-Verlag, 568 p.

Harrell, F.E., Jr., 2018, rms—Regression modeling strategies: R package version 5.1-2, https://CRAN.R-project.org/package=rms.

Mood, A.M., Graybill, F.A., and Boes, D.C., 1974, Introduction to the theory of statistics (3d ed.): New York, McGraw-Hill, Inc., 564 p.

Package 'seawaveQ'

Help Index

A model and utilities for analyzing trends in chemical concentrations in streams with a seasonal wave (seawave) and adjustment for streamflow (Q) and other ancillary variables (Q) and other ancillary variables

Description

Details

Author(s)

References

Scatter plot of water-quality data

Description

Usage

Arguments

Details

Value

Author(s)

References

Examples

Combine water-quality sample data and continuous ancillary variables

Description

Usage

Arguments

Format

Value

Note

Author(s)

Examples

Seasonal Wave Computation

Description

Usage

Arguments

Value

Note

Author(s)

References

Examples

Continuously monitored (daily) data for 06610000 Missouri River at Omaha, Neb.

Description

Usage

Format

Details

Note

Source

References

Examples

Example continuous ancillary variable data.

Description

Usage

Format

Source

References

Examples

Example continuous ancillary variable matrix.

Description

Usage

Format

Source

Examples

Example water-quality data.

Description

Usage

Format

Source

See Also

Examples

Example logical vector.

Description

Usage

Format

Source

Examples

Example of logarithmically transformed concentration data.

Description

Usage

Format

Source

Examples

Example data indicators.

Description

Usage

Format

Source