Calculating summary statistics describing historical flow conditions

Uses modelled flow data to calculate percentile statistics under a chosen scenario (e.g. historical or recent actual) as a ratio of a reference scenario (usually naturalised flows), for one or more sites.

Usage

calc_rfrstats(data = NULL,
              site_col = NULL,
              date_col = NULL,
              flow_col = NULL,
              ref_col = NULL,
              q = NULL,
              save = FALSE,
              save_dir = getwd()
              )

Arguments

data: Name of dataframe or tibble containing the flow data to be processed. Must be in long format (i.e. separate columns for site_id, date, and each flow scenario).
site_col: Name of column in data containing unique flow site id.
date_col: Name of column in data containing the date of each flow record, in yyyy/mm/dd format. See examples for a simple way to reformat the date if needed.
flow_col: Name of column in data containing flow values for the scenario of interest (e.g. historical, recent actual).
ref_col: Name of column in data containing flows under the reference (e.g. naturalised) scenario.
q: Required percentile statistic (between 1 and 99).
save: Specifies whether results should be saved as an rds file (for future use); Default = FALSE.
save_dir: Path to folder where results are to be saved; Default = Current working directory.
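
If dates are held in another layout, they can be converted before calling the function. The following is a minimal sketch using base R only; the input values and the dd/mm/yyyy starting layout are hypothetical, and whether the function expects a Date object or a character string is not stated here, so both forms are shown.

```r
# Hypothetical dates supplied as dd/mm/yyyy text (not package data)
raw_dates <- c("01/10/2000", "15/03/2001")

# Parse the text into Date class, telling R the layout it is currently in
parsed <- as.Date(raw_dates, format = "%d/%m/%Y")

# If a character column is needed, write the dates out as yyyy/mm/dd
formatted <- format(parsed, "%Y/%m/%d")
```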

Value

A tibble containing the processed flow statistics for every combination of site and water year.

Details

For each water year (e.g. 1 October one year to 30 September the next) at each site, the chosen flow percentile (q) is calculated under the scenario of interest (qx) and the reference scenario (qx_ref), where qx takes its name depending on q (e.g. Q95). The residual flow ratio (RFR) is then calculated as: rfrx = qx / qx_ref.

A q value of 95 represents the flow that is exceeded 95% of the time (i.e. 5th percentile).
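
The calculation for a single site and water year can be sketched as follows. This is an illustration rather than the function's internal code: the flow values are hypothetical, and the use of base R's quantile() with its default interpolation method is an assumption.

```r
# Hypothetical flows for one site in one water year (not package data)
flow     <- c(2.1, 3.4, 1.8, 5.0, 4.2, NA, 2.7, 3.9, 1.5, 6.3)
flow_ref <- c(2.5, 3.9, 2.4, 5.6, 4.9, 3.1, 3.0, 4.4, 2.0, 6.8)

q <- 95

# The flow exceeded q% of the time is the (100 - q)th percentile,
# so Q95 corresponds to probs = 0.05; NAs are dropped
qx     <- quantile(flow,     probs = 1 - q / 100, na.rm = TRUE, names = FALSE)
qx_ref <- quantile(flow_ref, probs = 1 - q / 100, na.rm = TRUE, names = FALSE)

# Residual flow ratio: scenario of interest relative to the reference
rfrx <- qx / qx_ref
```

A ratio below 1 indicates that the low flow under the scenario of interest is smaller than under the reference scenario.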

In addition, the number of data points in each water year is calculated for both the scenario of interest (n) and the reference scenario (n_ref).

Modelled flows can be on any time step (e.g. 1, 10, 30 days, or monthly) but should be at approximately regular time intervals, and the time step should be the same for both scenarios. The function does not require a minimum number of records in each water year, but more extreme percentiles calculated using sparse data may not be meaningful. Any missing records (NAs) are ignored when calculating summary statistics. The user should check n and n_ref to identify any water years with incomplete data coverage.
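
To see how record counts can reveal incomplete water years, the sketch below assigns hypothetical monthly records to water years and counts the non-missing values in each. The labelling convention (a water year named after the calendar year in which it starts) is an assumption made for this illustration and may differ from the labels the function returns.

```r
# Hypothetical monthly records for one site (not package data)
dates <- seq(as.Date("2000-08-01"), as.Date("2001-11-01"), by = "month")
flow  <- c(1.2, NA, 2.5, 3.1, 4.0, 3.8, 3.5, 2.9,
           2.2, 1.9, 1.6, 1.4, 1.3, 1.5, 2.8, 3.3)

# Assumption: 1 Oct 2000 to 30 Sep 2001 is labelled water year 2000
yr <- as.integer(format(dates, "%Y"))
mn <- as.integer(format(dates, "%m"))
water_year <- ifelse(mn >= 10, yr, yr - 1L)

# Count non-missing records per water year (analogous to n and n_ref)
n <- tapply(!is.na(flow), water_year, sum)
```

Here only the middle water year has a full set of twelve monthly records; the first and last are partial, so percentiles for those years would rest on very few values.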

Examples


# Example 1
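# load site model flow data
# (data() loads the object into the workspace; assigning its return value
# would overwrite the dataset with its name as a character string)
data(site.model.flow, package = "hetoolkit")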


# calc_rfrstats(data = site.model.flow,
#               site_col = "SITE_ID",
#               date_col = "Date_end",
#               flow_col = "Flow_HIST",
#               ref_col = "Flow_NAT",
#               q = 75,
#               save = FALSE,
#               save_dir = getwd())