Linking biology samples with time-varying flow statistics for paired biology and flow sites.
This function joins biology sample data with time-varying flow statistics for one or more antecedent (lagged) time periods (as calculated by the calc_flowstats function) to create a combined dataset for hydro-ecological modelling.
Usage
join_he(biol_data, flow_stats, mapping = NULL, method = "A", lags = 0, join_type = "add_flows")
Arguments
- biol_data
Data frame or tibble containing the processed biology data. Must contain the following columns: biol_site_id and date (in date format).
- flow_stats
Data frame or tibble containing the calculated time-varying flow statistics, by site, time period and win_no (as produced by the calc_flowstats function). Must contain the following columns: flow_site_id, start_date and end_date. The function joins all the variables in flow_stats, so it is advisable to manually drop any flow statistics that are not of interest before applying the function.
- mapping
Data frame or tibble containing paired biology site IDs and flow site IDs. Must contain columns named biol_site_id and flow_site_id. These columns must not contain any NAs. Default = NULL, which assumes that paired biology and flow sites have identical IDs, so mapping is not required.
- method
Choice of method for linking biology samples to flow statistics for antecedent time periods. Using method = "A" (default), lag 0 is defined for each biology sample as the most recently finished flow time period; using method = "B", lag 0 is defined as the most recently started flow time period. See below for details.
- lags
Vector of lagged flow time periods of interest. Values must be zero or positive, with larger values representing longer time lags (i.e. an increasing time gap between the flow time period and the biology sample date). Default = 0. See below for details.
- join_type
To add flow statistics to each biology sample, choose "add_flows" (default); this produces a dataset of biology metrics (response variables) and flow statistics (predictor variables) for hydro-ecological modelling. To add biology sample data to flow statistics for each time period, choose "add_biol"; this produces a time series of flow statistics with associated biological metrics which can be used, for example, to assess the coverage of historical flow conditions using the plot_rngflows function.
Details
biol_data and flow_stats may contain more sites than listed in mapping, but any sites not listed in mapping will be filtered out. If mapping = NULL, then biology and flow sites with matching IDs will be paired automatically.
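As a minimal sketch of what a valid mapping table might look like, the following builds one in base R and checks the documented requirements (both columns present, no NAs). The site IDs are hypothetical, and the final filtering step only mimics the behaviour described above; it is not the package's own code.

```r
# Hypothetical site IDs, for illustration only
mapping <- data.frame(
  biol_site_id = c("B001", "B002", "B003"),
  flow_site_id = c("F100", "F200", "F300"),
  stringsAsFactors = FALSE
)

# Documented requirements: both columns present and free of NAs
stopifnot(all(c("biol_site_id", "flow_site_id") %in% names(mapping)))
stopifnot(!anyNA(mapping$biol_site_id), !anyNA(mapping$flow_site_id))

# Sites absent from the mapping are expected to be dropped;
# this line imitates that filtering for a vector of biology site IDs
biol_sites <- c("B001", "B002", "B003", "B999")
kept <- biol_sites[biol_sites %in% mapping$biol_site_id]
kept
```

Here "B999" is not listed in mapping, so it does not appear in kept.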
The calc_flowstats function uses a moving window approach to calculate time-varying flow statistics for a sequence of time periods which can be either: (i) contiguous (i.e. each time period is followed immediately by the next one), (ii) non-contiguous (i.e. there is a gap between one time period and the next), or (iii) overlapping (i.e. the next time period starts before the previous one has finished).
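The three cases can be illustrated with a short base R sketch. The helpers make_windows and classify_windows are hypothetical names introduced here for illustration and are not part of the package; they assume windows defined in whole months that start on the first of a month, with the window width and step playing the roles of win_width and win_step.

```r
# Hypothetical helper: build window start/end dates from a start date,
# a window width and a step between window starts (both in whole months)
make_windows <- function(start, width_months, step_months, n) {
  starts <- seq(as.Date(start), by = paste(step_months, "months"), length.out = n)
  # each window ends the day before the date width_months after its start
  ends_num <- vapply(seq_len(n), function(i) {
    as.numeric(seq(starts[i], by = paste(width_months, "months"),
                   length.out = 2)[2]) - 1
  }, numeric(1))
  data.frame(win_no = seq_len(n),
             start_date = starts,
             end_date = as.Date(ends_num, origin = "1970-01-01"))
}

# Hypothetical helper: compare each window's start with the previous window's end
classify_windows <- function(w) {
  n <- nrow(w)
  gap <- as.numeric(w$start_date[-1] - w$end_date[-n])  # in days
  if (all(gap == 1)) {
    "contiguous"
  } else if (all(gap > 1)) {
    "non-contiguous"
  } else {
    "overlapping"
  }
}

classify_windows(make_windows("2020-01-01", 1, 1, 4))  # width = step
classify_windows(make_windows("2020-01-01", 1, 2, 4))  # width < step
classify_windows(make_windows("2020-01-01", 6, 1, 4))  # width > step
```

In this sketch a width equal to the step gives contiguous windows, a width shorter than the step leaves gaps, and a width longer than the step gives overlapping windows.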
To describe the antecedent flow conditions prior to each biology sample, the time periods are labelled relative to the date of the biology sample, with lag 0 representing either the most recently finished (method = "A") or most recently started (method = "B") flow time period. The time period immediately prior to the Lag 0 time period is the Lag 1 period, and the period immediately prior to that is the Lag 2 period, and so on.
As an example, suppose we have a biology sample dated 15 September 2020 and that flow statistics are available for a sequence of contiguous 1 month periods (each one a calendar month). Using method = "A", the Lag 0 period for that biology sample would be August 2020 (the most recently finished time period), the Lag 1 period would be July 2020, the Lag 2 period would be June 2020, and so on. Similarly, using method = "B", the Lag 0 period for that biology sample would be September 2020 (the most recently started time period), the Lag 1 period would be August 2020, the Lag 2 period would be July 2020, and so on.
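The monthly example above can be reproduced with a few lines of base R. This is only a sketch of the lag definitions, not the internal logic of join_he; in particular, how a sample falling exactly on a window's start or end date is treated is an assumption here.

```r
# Contiguous calendar-month windows covering 2020
starts <- seq(as.Date("2020-01-01"), by = "1 month", length.out = 12)
ends   <- seq(as.Date("2020-02-01"), by = "1 month", length.out = 12) - 1
sample_date <- as.Date("2020-09-15")

# Method A: lag 0 is the latest window that finished before the sample date
# (assumption: strictly before)
lag0_A <- max(which(ends < sample_date))

# Method B: lag 0 is the latest window that started on or before the sample date
# (assumption: on or before)
lag0_B <- max(which(starts <= sample_date))

# Lags 0, 1 and 2 step back one window at a time
lags <- 0:2
months_A <- format(starts[lag0_A - lags], "%Y-%m")
months_B <- format(starts[lag0_B - lags], "%Y-%m")
months_A  # "2020-08" "2020-07" "2020-06"
months_B  # "2020-09" "2020-08" "2020-07"
```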
As a second example, suppose we again have a biology sample dated 15 September 2020 and that flow statistics are available for a sequence of overlapping 6 month periods (i.e. February to July 2020, March to August 2020, April to September 2020, and so on). Using method = "A", the Lag 0 period for that biology sample would be March to August 2020 (the most recently finished time period), the Lag 1 period would be February to July 2020, the Lag 2 period would be January to June 2020, and so on. Similarly, using method = "B", the Lag 0 period for that biology sample would be September 2020 to February 2021 (the most recently started time period), the Lag 1 period would be August 2020 to January 2021, the Lag 2 period would be July to December 2020, and so on.
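The overlapping case can be checked in the same way. The sketch below builds 6 month windows that advance by 1 month and labels lags 0 to 2 for a sample dated 15 September 2020; the variable names are illustrative and the boundary handling (strictly before the sample date for method A, on or before it for method B) is an assumption rather than the documented behaviour of join_he.

```r
# Overlapping 6 month windows stepping forward 1 month at a time,
# starting October 2019 so that enough earlier windows exist
win_start <- seq(as.Date("2019-10-01"), by = "1 month", length.out = 14)
win_end   <- seq(as.Date("2020-04-01"), by = "1 month", length.out = 14) - 1
sample_date <- as.Date("2020-09-15")

# Label a set of windows as "start month to end month"
label <- function(i) {
  paste(format(win_start[i], "%Y-%m"), "to", format(win_end[i], "%Y-%m"))
}

lags <- 0:2
# Method A: most recently finished window (assumed strictly before the sample)
periods_A <- label(max(which(win_end < sample_date)) - lags)
# Method B: most recently started window (assumed on or before the sample)
periods_B <- label(max(which(win_start <= sample_date)) - lags)

periods_A  # "2020-03 to 2020-08" "2020-02 to 2020-07" "2020-01 to 2020-06"
periods_B  # "2020-09 to 2021-02" "2020-08 to 2021-01" "2020-07 to 2020-12"
```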
Note that if using join_type = "add_biol", a flow period becomes replicated if it has 2+ biology samples within it. To avoid this happening, summarise (e.g. average) the replicate biology samples within each time window before applying join_he. See below for an example.
Examples
# create flow stats from synthetic flow data
set.seed(123)
flow_data <- data.frame(flow_site_id = rep("A0001", 365),
                        date = seq(as.Date("2021-01-01"), as.Date("2021-12-31"), by = "1 day"),
                        flow = rnorm(365, 10, 2))
flow_stats <- calc_flowstats(data = flow_data,
                             site_col = "flow_site_id",
                             date_col = "date",
                             flow_col = "flow",
                             win_start = "2021-01-01",
                             win_width = "1 month",
                             win_step = "1 month")[[1]] %>%
  dplyr::select(flow_site_id, win_no, start_date, end_date, Q95z)
#> Joining with `by = join_by(site, win_no, n_data)`
# create synthetic biology data
biol_data <- data.frame(biol_site_id = rep("A0001", 2),
                        date = as.Date(c("2021-04-15", "2021-09-15")),
                        metric = c(0.8, 0.7))
# view data
flow_stats; biol_data
#> flow_site_id win_no start_date end_date Q95z
#> 1 A0001 1 2021-01-01 2021-01-31 -0.31871603
#> 2 A0001 2 2021-02-01 2021-02-28 0.70614692
#> 3 A0001 3 2021-03-01 2021-03-31 0.97917882
#> 4 A0001 4 2021-04-01 2021-04-30 0.17013587
#> 5 A0001 5 2021-05-01 2021-05-31 -0.75563412
#> 6 A0001 6 2021-06-01 2021-06-30 1.23495888
#> 7 A0001 7 2021-07-01 2021-07-31 0.69520028
#> 8 A0001 8 2021-08-01 2021-08-31 0.52935726
#> 9 A0001 9 2021-09-01 2021-09-30 -0.23660759
#> 10 A0001 10 2021-10-01 2021-10-31 0.08601627
#> 11 A0001 11 2021-11-01 2021-11-30 -0.61904164
#> 12 A0001 12 2021-12-01 2021-12-31 -2.47099491
#> 13 A0001 13 2022-01-01 2022-01-31 NA
#> 14 A0001 14 2022-02-01 2022-02-28 NA
#> 15 A0001 15 2022-03-01 2022-03-31 NA
#> 16 A0001 16 2022-04-01 2022-04-30 NA
#> 17 A0001 17 2022-05-01 2022-05-31 NA
#> 18 A0001 18 2022-06-01 2022-06-30 NA
#> 19 A0001 19 2022-07-01 2022-07-31 NA
#> 20 A0001 20 2022-08-01 2022-08-31 NA
#> 21 A0001 21 2022-09-01 2022-09-30 NA
#> 22 A0001 22 2022-10-01 2022-10-31 NA
#> 23 A0001 23 2022-11-01 2022-11-30 NA
#> 24 A0001 24 2022-12-01 2022-12-31 NA
#> 25 A0001 25 2023-01-01 2023-01-31 NA
#> 26 A0001 26 2023-02-01 2023-02-28 NA
#> 27 A0001 27 2023-03-01 2023-03-31 NA
#> 28 A0001 28 2023-04-01 2023-04-30 NA
#> 29 A0001 29 2023-05-01 2023-05-31 NA
#> 30 A0001 30 2023-06-01 2023-06-30 NA
#> 31 A0001 31 2023-07-01 2023-07-31 NA
#> 32 A0001 32 2023-08-01 2023-08-31 NA
#> 33 A0001 33 2023-09-01 2023-09-30 NA
#> 34 A0001 34 2023-10-01 2023-10-31 NA
#> 35 A0001 35 2023-11-01 2023-11-30 NA
#> 36 A0001 36 2023-12-01 2023-12-31 NA
#> 37 A0001 37 2024-01-01 2024-01-31 NA
#> 38 A0001 38 2024-02-01 2024-02-29 NA
#> 39 A0001 39 2024-03-01 2024-03-31 NA
#> 40 A0001 40 2024-04-01 2024-04-30 NA
#> 41 A0001 41 2024-05-01 2024-05-31 NA
#> 42 A0001 42 2024-06-01 2024-06-30 NA
#> 43 A0001 43 2024-07-01 2024-07-31 NA
#> 44 A0001 44 2024-08-01 2024-08-31 NA
#> 45 A0001 45 2024-09-01 2024-09-30 NA
#> 46 A0001 46 2024-10-01 2024-10-31 NA
#> 47 A0001 47 2024-11-01 2024-11-30 NA
#> 48 A0001 48 2024-12-01 2024-12-31 NA
#> 49 A0001 49 2025-01-01 2025-01-31 NA
#> biol_site_id date metric
#> 1 A0001 2021-04-15 0.8
#> 2 A0001 2021-09-15 0.7
# add flow statistics to each biology sample using method A
# mapping = NULL because biology and flow sites have identical ids
join_he(biol_data = biol_data,
        flow_stats = flow_stats,
        mapping = NULL,
        method = "A",
        lags = c(0, 1),
        join_type = "add_flows")
#> biol_site_id date metric flow_site_id win_no start_date end_date
#> 1 A0001 2021-04-15 0.8 A0001 3 2021-03-01 2021-03-31
#> 2 A0001 2021-09-15 0.7 A0001 8 2021-08-01 2021-08-31
#> win_no_lag0 Q95z_lag0 win_no_lag1 Q95z_lag1
#> 1 3 0.9791788 2 0.7061469
#> 2 8 0.5293573 7 0.6952003
# add flow statistics to each biology sample using method B
# mapping = NULL because biology and flow sites have identical ids
join_he(biol_data = biol_data,
        flow_stats = flow_stats,
        mapping = NULL,
        method = "B",
        lags = c(0, 1),
        join_type = "add_flows")
#> biol_site_id date metric flow_site_id win_no start_date end_date
#> 1 A0001 2021-04-15 0.8 A0001 4 2021-04-01 2021-04-30
#> 2 A0001 2021-09-15 0.7 A0001 9 2021-09-01 2021-09-30
#> win_no_lag0 Q95z_lag0 win_no_lag1 Q95z_lag1
#> 1 4 0.1701359 3 0.9791788
#> 2 9 -0.2366076 8 0.5293573
# add biology sample data to flow statistics for each time period using method A
join_he(biol_data = biol_data,
        flow_stats = flow_stats,
        mapping = NULL,
        method = "A",
        lags = c(0, 1),
        join_type = "add_biol")
#> flow_site_id win_no start_date end_date win_no_lag0 Q95z_lag0
#> 1 A0001 1 2021-01-01 2021-01-31 1 -0.31871603
#> 2 A0001 2 2021-02-01 2021-02-28 2 0.70614692
#> 3 A0001 3 2021-03-01 2021-03-31 3 0.97917882
#> 4 A0001 4 2021-04-01 2021-04-30 4 0.17013587
#> 5 A0001 5 2021-05-01 2021-05-31 5 -0.75563412
#> 6 A0001 6 2021-06-01 2021-06-30 6 1.23495888
#> 7 A0001 7 2021-07-01 2021-07-31 7 0.69520028
#> 8 A0001 8 2021-08-01 2021-08-31 8 0.52935726
#> 9 A0001 9 2021-09-01 2021-09-30 9 -0.23660759
#> 10 A0001 10 2021-10-01 2021-10-31 10 0.08601627
#> 11 A0001 11 2021-11-01 2021-11-30 11 -0.61904164
#> 12 A0001 12 2021-12-01 2021-12-31 12 -2.47099491
#> 13 A0001 13 2022-01-01 2022-01-31 13 NA
#> 14 A0001 14 2022-02-01 2022-02-28 14 NA
#> 15 A0001 15 2022-03-01 2022-03-31 15 NA
#> 16 A0001 16 2022-04-01 2022-04-30 16 NA
#> 17 A0001 17 2022-05-01 2022-05-31 17 NA
#> 18 A0001 18 2022-06-01 2022-06-30 18 NA
#> 19 A0001 19 2022-07-01 2022-07-31 19 NA
#> 20 A0001 20 2022-08-01 2022-08-31 20 NA
#> 21 A0001 21 2022-09-01 2022-09-30 21 NA
#> 22 A0001 22 2022-10-01 2022-10-31 22 NA
#> 23 A0001 23 2022-11-01 2022-11-30 23 NA
#> 24 A0001 24 2022-12-01 2022-12-31 24 NA
#> 25 A0001 25 2023-01-01 2023-01-31 25 NA
#> 26 A0001 26 2023-02-01 2023-02-28 26 NA
#> 27 A0001 27 2023-03-01 2023-03-31 27 NA
#> 28 A0001 28 2023-04-01 2023-04-30 28 NA
#> 29 A0001 29 2023-05-01 2023-05-31 29 NA
#> 30 A0001 30 2023-06-01 2023-06-30 30 NA
#> 31 A0001 31 2023-07-01 2023-07-31 31 NA
#> 32 A0001 32 2023-08-01 2023-08-31 32 NA
#> 33 A0001 33 2023-09-01 2023-09-30 33 NA
#> 34 A0001 34 2023-10-01 2023-10-31 34 NA
#> 35 A0001 35 2023-11-01 2023-11-30 35 NA
#> 36 A0001 36 2023-12-01 2023-12-31 36 NA
#> 37 A0001 37 2024-01-01 2024-01-31 37 NA
#> 38 A0001 38 2024-02-01 2024-02-29 38 NA
#> 39 A0001 39 2024-03-01 2024-03-31 39 NA
#> 40 A0001 40 2024-04-01 2024-04-30 40 NA
#> 41 A0001 41 2024-05-01 2024-05-31 41 NA
#> 42 A0001 42 2024-06-01 2024-06-30 42 NA
#> 43 A0001 43 2024-07-01 2024-07-31 43 NA
#> 44 A0001 44 2024-08-01 2024-08-31 44 NA
#> 45 A0001 45 2024-09-01 2024-09-30 45 NA
#> 46 A0001 46 2024-10-01 2024-10-31 46 NA
#> 47 A0001 47 2024-11-01 2024-11-30 47 NA
#> 48 A0001 48 2024-12-01 2024-12-31 48 NA
#> 49 A0001 49 2025-01-01 2025-01-31 49 NA
#> win_no_lag1 Q95z_lag1 biol_site_id date metric
#> 1 0 NA A0001 <NA> NA
#> 2 1 -0.31871603 A0001 <NA> NA
#> 3 2 0.70614692 A0001 2021-04-15 0.8
#> 4 3 0.97917882 A0001 <NA> NA
#> 5 4 0.17013587 A0001 <NA> NA
#> 6 5 -0.75563412 A0001 <NA> NA
#> 7 6 1.23495888 A0001 <NA> NA
#> 8 7 0.69520028 A0001 2021-09-15 0.7
#> 9 8 0.52935726 A0001 <NA> NA
#> 10 9 -0.23660759 A0001 <NA> NA
#> 11 10 0.08601627 A0001 <NA> NA
#> 12 11 -0.61904164 A0001 <NA> NA
#> 13 12 -2.47099491 A0001 <NA> NA
#> 14 13 NA A0001 <NA> NA
#> 15 14 NA A0001 <NA> NA
#> 16 15 NA A0001 <NA> NA
#> 17 16 NA A0001 <NA> NA
#> 18 17 NA A0001 <NA> NA
#> 19 18 NA A0001 <NA> NA
#> 20 19 NA A0001 <NA> NA
#> 21 20 NA A0001 <NA> NA
#> 22 21 NA A0001 <NA> NA
#> 23 22 NA A0001 <NA> NA
#> 24 23 NA A0001 <NA> NA
#> 25 24 NA A0001 <NA> NA
#> 26 25 NA A0001 <NA> NA
#> 27 26 NA A0001 <NA> NA
#> 28 27 NA A0001 <NA> NA
#> 29 28 NA A0001 <NA> NA
#> 30 29 NA A0001 <NA> NA
#> 31 30 NA A0001 <NA> NA
#> 32 31 NA A0001 <NA> NA
#> 33 32 NA A0001 <NA> NA
#> 34 33 NA A0001 <NA> NA
#> 35 34 NA A0001 <NA> NA
#> 36 35 NA A0001 <NA> NA
#> 37 36 NA A0001 <NA> NA
#> 38 37 NA A0001 <NA> NA
#> 39 38 NA A0001 <NA> NA
#> 40 39 NA A0001 <NA> NA
#> 41 40 NA A0001 <NA> NA
#> 42 41 NA A0001 <NA> NA
#> 43 42 NA A0001 <NA> NA
#> 44 43 NA A0001 <NA> NA
#> 45 44 NA A0001 <NA> NA
#> 46 45 NA A0001 <NA> NA
#> 47 46 NA A0001 <NA> NA
#> 48 47 NA A0001 <NA> NA
#> 49 48 NA A0001 <NA> NA
# add biology sample data to flow statistics for each time period using method B
join_he(biol_data = biol_data,
        flow_stats = flow_stats,
        mapping = NULL,
        method = "B",
        lags = c(0, 1),
        join_type = "add_biol")
#> flow_site_id win_no start_date end_date win_no_lag0 Q95z_lag0
#> 1 A0001 1 2021-01-01 2021-01-31 1 -0.31871603
#> 2 A0001 2 2021-02-01 2021-02-28 2 0.70614692
#> 3 A0001 3 2021-03-01 2021-03-31 3 0.97917882
#> 4 A0001 4 2021-04-01 2021-04-30 4 0.17013587
#> 5 A0001 5 2021-05-01 2021-05-31 5 -0.75563412
#> 6 A0001 6 2021-06-01 2021-06-30 6 1.23495888
#> 7 A0001 7 2021-07-01 2021-07-31 7 0.69520028
#> 8 A0001 8 2021-08-01 2021-08-31 8 0.52935726
#> 9 A0001 9 2021-09-01 2021-09-30 9 -0.23660759
#> 10 A0001 10 2021-10-01 2021-10-31 10 0.08601627
#> 11 A0001 11 2021-11-01 2021-11-30 11 -0.61904164
#> 12 A0001 12 2021-12-01 2021-12-31 12 -2.47099491
#> 13 A0001 13 2022-01-01 2022-01-31 13 NA
#> 14 A0001 14 2022-02-01 2022-02-28 14 NA
#> 15 A0001 15 2022-03-01 2022-03-31 15 NA
#> 16 A0001 16 2022-04-01 2022-04-30 16 NA
#> 17 A0001 17 2022-05-01 2022-05-31 17 NA
#> 18 A0001 18 2022-06-01 2022-06-30 18 NA
#> 19 A0001 19 2022-07-01 2022-07-31 19 NA
#> 20 A0001 20 2022-08-01 2022-08-31 20 NA
#> 21 A0001 21 2022-09-01 2022-09-30 21 NA
#> 22 A0001 22 2022-10-01 2022-10-31 22 NA
#> 23 A0001 23 2022-11-01 2022-11-30 23 NA
#> 24 A0001 24 2022-12-01 2022-12-31 24 NA
#> 25 A0001 25 2023-01-01 2023-01-31 25 NA
#> 26 A0001 26 2023-02-01 2023-02-28 26 NA
#> 27 A0001 27 2023-03-01 2023-03-31 27 NA
#> 28 A0001 28 2023-04-01 2023-04-30 28 NA
#> 29 A0001 29 2023-05-01 2023-05-31 29 NA
#> 30 A0001 30 2023-06-01 2023-06-30 30 NA
#> 31 A0001 31 2023-07-01 2023-07-31 31 NA
#> 32 A0001 32 2023-08-01 2023-08-31 32 NA
#> 33 A0001 33 2023-09-01 2023-09-30 33 NA
#> 34 A0001 34 2023-10-01 2023-10-31 34 NA
#> 35 A0001 35 2023-11-01 2023-11-30 35 NA
#> 36 A0001 36 2023-12-01 2023-12-31 36 NA
#> 37 A0001 37 2024-01-01 2024-01-31 37 NA
#> 38 A0001 38 2024-02-01 2024-02-29 38 NA
#> 39 A0001 39 2024-03-01 2024-03-31 39 NA
#> 40 A0001 40 2024-04-01 2024-04-30 40 NA
#> 41 A0001 41 2024-05-01 2024-05-31 41 NA
#> 42 A0001 42 2024-06-01 2024-06-30 42 NA
#> 43 A0001 43 2024-07-01 2024-07-31 43 NA
#> 44 A0001 44 2024-08-01 2024-08-31 44 NA
#> 45 A0001 45 2024-09-01 2024-09-30 45 NA
#> 46 A0001 46 2024-10-01 2024-10-31 46 NA
#> 47 A0001 47 2024-11-01 2024-11-30 47 NA
#> 48 A0001 48 2024-12-01 2024-12-31 48 NA
#> 49 A0001 49 2025-01-01 2025-01-31 49 NA
#> win_no_lag1 Q95z_lag1 biol_site_id date metric
#> 1 0 NA A0001 <NA> NA
#> 2 1 -0.31871603 A0001 <NA> NA
#> 3 2 0.70614692 A0001 <NA> NA
#> 4 3 0.97917882 A0001 2021-04-15 0.8
#> 5 4 0.17013587 A0001 <NA> NA
#> 6 5 -0.75563412 A0001 <NA> NA
#> 7 6 1.23495888 A0001 <NA> NA
#> 8 7 0.69520028 A0001 <NA> NA
#> 9 8 0.52935726 A0001 2021-09-15 0.7
#> 10 9 -0.23660759 A0001 <NA> NA
#> 11 10 0.08601627 A0001 <NA> NA
#> 12 11 -0.61904164 A0001 <NA> NA
#> 13 12 -2.47099491 A0001 <NA> NA
#> 14 13 NA A0001 <NA> NA
#> 15 14 NA A0001 <NA> NA
#> 16 15 NA A0001 <NA> NA
#> 17 16 NA A0001 <NA> NA
#> 18 17 NA A0001 <NA> NA
#> 19 18 NA A0001 <NA> NA
#> 20 19 NA A0001 <NA> NA
#> 21 20 NA A0001 <NA> NA
#> 22 21 NA A0001 <NA> NA
#> 23 22 NA A0001 <NA> NA
#> 24 23 NA A0001 <NA> NA
#> 25 24 NA A0001 <NA> NA
#> 26 25 NA A0001 <NA> NA
#> 27 26 NA A0001 <NA> NA
#> 28 27 NA A0001 <NA> NA
#> 29 28 NA A0001 <NA> NA
#> 30 29 NA A0001 <NA> NA
#> 31 30 NA A0001 <NA> NA
#> 32 31 NA A0001 <NA> NA
#> 33 32 NA A0001 <NA> NA
#> 34 33 NA A0001 <NA> NA
#> 35 34 NA A0001 <NA> NA
#> 36 35 NA A0001 <NA> NA
#> 37 36 NA A0001 <NA> NA
#> 38 37 NA A0001 <NA> NA
#> 39 38 NA A0001 <NA> NA
#> 40 39 NA A0001 <NA> NA
#> 41 40 NA A0001 <NA> NA
#> 42 41 NA A0001 <NA> NA
#> 43 42 NA A0001 <NA> NA
#> 44 43 NA A0001 <NA> NA
#> 45 44 NA A0001 <NA> NA
#> 46 45 NA A0001 <NA> NA
#> 47 46 NA A0001 <NA> NA
#> 48 47 NA A0001 <NA> NA
#> 49 48 NA A0001 <NA> NA
# using join_type = "add_biol", a flow period becomes replicated if it has 2+ biology samples
biol_data2 <- data.frame(biol_site_id = rep("A0001", 3),
                         date = as.Date(c("2021-04-15", "2021-09-15", "2021-09-17")),
                         metric = c(0.8, 0.7, 0.6))
join_he(biol_data = biol_data2,
        flow_stats = flow_stats,
        mapping = NULL,
        method = "A",
        lags = c(0, 1),
        join_type = "add_biol")
#> flow_site_id win_no start_date end_date win_no_lag0 Q95z_lag0
#> 1 A0001 1 2021-01-01 2021-01-31 1 -0.31871603
#> 2 A0001 2 2021-02-01 2021-02-28 2 0.70614692
#> 3 A0001 3 2021-03-01 2021-03-31 3 0.97917882
#> 4 A0001 4 2021-04-01 2021-04-30 4 0.17013587
#> 5 A0001 5 2021-05-01 2021-05-31 5 -0.75563412
#> 6 A0001 6 2021-06-01 2021-06-30 6 1.23495888
#> 7 A0001 7 2021-07-01 2021-07-31 7 0.69520028
#> 8 A0001 8 2021-08-01 2021-08-31 8 0.52935726
#> 9 A0001 8 2021-08-01 2021-08-31 8 0.52935726
#> 10 A0001 9 2021-09-01 2021-09-30 9 -0.23660759
#> 11 A0001 10 2021-10-01 2021-10-31 10 0.08601627
#> 12 A0001 11 2021-11-01 2021-11-30 11 -0.61904164
#> 13 A0001 12 2021-12-01 2021-12-31 12 -2.47099491
#> 14 A0001 13 2022-01-01 2022-01-31 13 NA
#> 15 A0001 14 2022-02-01 2022-02-28 14 NA
#> 16 A0001 15 2022-03-01 2022-03-31 15 NA
#> 17 A0001 16 2022-04-01 2022-04-30 16 NA
#> 18 A0001 17 2022-05-01 2022-05-31 17 NA
#> 19 A0001 18 2022-06-01 2022-06-30 18 NA
#> 20 A0001 19 2022-07-01 2022-07-31 19 NA
#> 21 A0001 20 2022-08-01 2022-08-31 20 NA
#> 22 A0001 21 2022-09-01 2022-09-30 21 NA
#> 23 A0001 22 2022-10-01 2022-10-31 22 NA
#> 24 A0001 23 2022-11-01 2022-11-30 23 NA
#> 25 A0001 24 2022-12-01 2022-12-31 24 NA
#> 26 A0001 25 2023-01-01 2023-01-31 25 NA
#> 27 A0001 26 2023-02-01 2023-02-28 26 NA
#> 28 A0001 27 2023-03-01 2023-03-31 27 NA
#> 29 A0001 28 2023-04-01 2023-04-30 28 NA
#> 30 A0001 29 2023-05-01 2023-05-31 29 NA
#> 31 A0001 30 2023-06-01 2023-06-30 30 NA
#> 32 A0001 31 2023-07-01 2023-07-31 31 NA
#> 33 A0001 32 2023-08-01 2023-08-31 32 NA
#> 34 A0001 33 2023-09-01 2023-09-30 33 NA
#> 35 A0001 34 2023-10-01 2023-10-31 34 NA
#> 36 A0001 35 2023-11-01 2023-11-30 35 NA
#> 37 A0001 36 2023-12-01 2023-12-31 36 NA
#> 38 A0001 37 2024-01-01 2024-01-31 37 NA
#> 39 A0001 38 2024-02-01 2024-02-29 38 NA
#> 40 A0001 39 2024-03-01 2024-03-31 39 NA
#> 41 A0001 40 2024-04-01 2024-04-30 40 NA
#> 42 A0001 41 2024-05-01 2024-05-31 41 NA
#> 43 A0001 42 2024-06-01 2024-06-30 42 NA
#> 44 A0001 43 2024-07-01 2024-07-31 43 NA
#> 45 A0001 44 2024-08-01 2024-08-31 44 NA
#> 46 A0001 45 2024-09-01 2024-09-30 45 NA
#> 47 A0001 46 2024-10-01 2024-10-31 46 NA
#> 48 A0001 47 2024-11-01 2024-11-30 47 NA
#> 49 A0001 48 2024-12-01 2024-12-31 48 NA
#> 50 A0001 49 2025-01-01 2025-01-31 49 NA
#> win_no_lag1 Q95z_lag1 biol_site_id date metric
#> 1 0 NA A0001 <NA> NA
#> 2 1 -0.31871603 A0001 <NA> NA
#> 3 2 0.70614692 A0001 2021-04-15 0.8
#> 4 3 0.97917882 A0001 <NA> NA
#> 5 4 0.17013587 A0001 <NA> NA
#> 6 5 -0.75563412 A0001 <NA> NA
#> 7 6 1.23495888 A0001 <NA> NA
#> 8 7 0.69520028 A0001 2021-09-15 0.7
#> 9 7 0.69520028 A0001 2021-09-17 0.6
#> 10 8 0.52935726 A0001 <NA> NA
#> 11 9 -0.23660759 A0001 <NA> NA
#> 12 10 0.08601627 A0001 <NA> NA
#> 13 11 -0.61904164 A0001 <NA> NA
#> 14 12 -2.47099491 A0001 <NA> NA
#> 15 13 NA A0001 <NA> NA
#> 16 14 NA A0001 <NA> NA
#> 17 15 NA A0001 <NA> NA
#> 18 16 NA A0001 <NA> NA
#> 19 17 NA A0001 <NA> NA
#> 20 18 NA A0001 <NA> NA
#> 21 19 NA A0001 <NA> NA
#> 22 20 NA A0001 <NA> NA
#> 23 21 NA A0001 <NA> NA
#> 24 22 NA A0001 <NA> NA
#> 25 23 NA A0001 <NA> NA
#> 26 24 NA A0001 <NA> NA
#> 27 25 NA A0001 <NA> NA
#> 28 26 NA A0001 <NA> NA
#> 29 27 NA A0001 <NA> NA
#> 30 28 NA A0001 <NA> NA
#> 31 29 NA A0001 <NA> NA
#> 32 30 NA A0001 <NA> NA
#> 33 31 NA A0001 <NA> NA
#> 34 32 NA A0001 <NA> NA
#> 35 33 NA A0001 <NA> NA
#> 36 34 NA A0001 <NA> NA
#> 37 35 NA A0001 <NA> NA
#> 38 36 NA A0001 <NA> NA
#> 39 37 NA A0001 <NA> NA
#> 40 38 NA A0001 <NA> NA
#> 41 39 NA A0001 <NA> NA
#> 42 40 NA A0001 <NA> NA
#> 43 41 NA A0001 <NA> NA
#> 44 42 NA A0001 <NA> NA
#> 45 43 NA A0001 <NA> NA
#> 46 44 NA A0001 <NA> NA
#> 47 45 NA A0001 <NA> NA
#> 48 46 NA A0001 <NA> NA
#> 49 47 NA A0001 <NA> NA
#> 50 48 NA A0001 <NA> NA
# average replicate biology samples within each time window before using join_type = "add_biol"
biol_data3 <- biol_data2 %>%
  dplyr::mutate(month = lubridate::month(date)) %>%
  dplyr::group_by(biol_site_id, month) %>%
  dplyr::summarise_all(mean)
join_he(biol_data = biol_data3,
        flow_stats = flow_stats,
        mapping = NULL,
        method = "A",
        lags = c(0, 1),
        join_type = "add_biol")
#> flow_site_id win_no start_date end_date win_no_lag0 Q95z_lag0
#> 1 A0001 1 2021-01-01 2021-01-31 1 -0.31871603
#> 2 A0001 2 2021-02-01 2021-02-28 2 0.70614692
#> 3 A0001 3 2021-03-01 2021-03-31 3 0.97917882
#> 4 A0001 4 2021-04-01 2021-04-30 4 0.17013587
#> 5 A0001 5 2021-05-01 2021-05-31 5 -0.75563412
#> 6 A0001 6 2021-06-01 2021-06-30 6 1.23495888
#> 7 A0001 7 2021-07-01 2021-07-31 7 0.69520028
#> 8 A0001 8 2021-08-01 2021-08-31 8 0.52935726
#> 9 A0001 9 2021-09-01 2021-09-30 9 -0.23660759
#> 10 A0001 10 2021-10-01 2021-10-31 10 0.08601627
#> 11 A0001 11 2021-11-01 2021-11-30 11 -0.61904164
#> 12 A0001 12 2021-12-01 2021-12-31 12 -2.47099491
#> 13 A0001 13 2022-01-01 2022-01-31 13 NA
#> 14 A0001 14 2022-02-01 2022-02-28 14 NA
#> 15 A0001 15 2022-03-01 2022-03-31 15 NA
#> 16 A0001 16 2022-04-01 2022-04-30 16 NA
#> 17 A0001 17 2022-05-01 2022-05-31 17 NA
#> 18 A0001 18 2022-06-01 2022-06-30 18 NA
#> 19 A0001 19 2022-07-01 2022-07-31 19 NA
#> 20 A0001 20 2022-08-01 2022-08-31 20 NA
#> 21 A0001 21 2022-09-01 2022-09-30 21 NA
#> 22 A0001 22 2022-10-01 2022-10-31 22 NA
#> 23 A0001 23 2022-11-01 2022-11-30 23 NA
#> 24 A0001 24 2022-12-01 2022-12-31 24 NA
#> 25 A0001 25 2023-01-01 2023-01-31 25 NA
#> 26 A0001 26 2023-02-01 2023-02-28 26 NA
#> 27 A0001 27 2023-03-01 2023-03-31 27 NA
#> 28 A0001 28 2023-04-01 2023-04-30 28 NA
#> 29 A0001 29 2023-05-01 2023-05-31 29 NA
#> 30 A0001 30 2023-06-01 2023-06-30 30 NA
#> 31 A0001 31 2023-07-01 2023-07-31 31 NA
#> 32 A0001 32 2023-08-01 2023-08-31 32 NA
#> 33 A0001 33 2023-09-01 2023-09-30 33 NA
#> 34 A0001 34 2023-10-01 2023-10-31 34 NA
#> 35 A0001 35 2023-11-01 2023-11-30 35 NA
#> 36 A0001 36 2023-12-01 2023-12-31 36 NA
#> 37 A0001 37 2024-01-01 2024-01-31 37 NA
#> 38 A0001 38 2024-02-01 2024-02-29 38 NA
#> 39 A0001 39 2024-03-01 2024-03-31 39 NA
#> 40 A0001 40 2024-04-01 2024-04-30 40 NA
#> 41 A0001 41 2024-05-01 2024-05-31 41 NA
#> 42 A0001 42 2024-06-01 2024-06-30 42 NA
#> 43 A0001 43 2024-07-01 2024-07-31 43 NA
#> 44 A0001 44 2024-08-01 2024-08-31 44 NA
#> 45 A0001 45 2024-09-01 2024-09-30 45 NA
#> 46 A0001 46 2024-10-01 2024-10-31 46 NA
#> 47 A0001 47 2024-11-01 2024-11-30 47 NA
#> 48 A0001 48 2024-12-01 2024-12-31 48 NA
#> 49 A0001 49 2025-01-01 2025-01-31 49 NA
#> win_no_lag1 Q95z_lag1 biol_site_id month date metric
#> 1 0 NA A0001 NA <NA> NA
#> 2 1 -0.31871603 A0001 NA <NA> NA
#> 3 2 0.70614692 A0001 4 2021-04-15 0.80
#> 4 3 0.97917882 A0001 NA <NA> NA
#> 5 4 0.17013587 A0001 NA <NA> NA
#> 6 5 -0.75563412 A0001 NA <NA> NA
#> 7 6 1.23495888 A0001 NA <NA> NA
#> 8 7 0.69520028 A0001 9 2021-09-16 0.65
#> 9 8 0.52935726 A0001 NA <NA> NA
#> 10 9 -0.23660759 A0001 NA <NA> NA
#> 11 10 0.08601627 A0001 NA <NA> NA
#> 12 11 -0.61904164 A0001 NA <NA> NA
#> 13 12 -2.47099491 A0001 NA <NA> NA
#> 14 13 NA A0001 NA <NA> NA
#> 15 14 NA A0001 NA <NA> NA
#> 16 15 NA A0001 NA <NA> NA
#> 17 16 NA A0001 NA <NA> NA
#> 18 17 NA A0001 NA <NA> NA
#> 19 18 NA A0001 NA <NA> NA
#> 20 19 NA A0001 NA <NA> NA
#> 21 20 NA A0001 NA <NA> NA
#> 22 21 NA A0001 NA <NA> NA
#> 23 22 NA A0001 NA <NA> NA
#> 24 23 NA A0001 NA <NA> NA
#> 25 24 NA A0001 NA <NA> NA
#> 26 25 NA A0001 NA <NA> NA
#> 27 26 NA A0001 NA <NA> NA
#> 28 27 NA A0001 NA <NA> NA
#> 29 28 NA A0001 NA <NA> NA
#> 30 29 NA A0001 NA <NA> NA
#> 31 30 NA A0001 NA <NA> NA
#> 32 31 NA A0001 NA <NA> NA
#> 33 32 NA A0001 NA <NA> NA
#> 34 33 NA A0001 NA <NA> NA
#> 35 34 NA A0001 NA <NA> NA
#> 36 35 NA A0001 NA <NA> NA
#> 37 36 NA A0001 NA <NA> NA
#> 38 37 NA A0001 NA <NA> NA
#> 39 38 NA A0001 NA <NA> NA
#> 40 39 NA A0001 NA <NA> NA
#> 41 40 NA A0001 NA <NA> NA
#> 42 41 NA A0001 NA <NA> NA
#> 43 42 NA A0001 NA <NA> NA
#> 44 43 NA A0001 NA <NA> NA
#> 45 44 NA A0001 NA <NA> NA
#> 46 45 NA A0001 NA <NA> NA
#> 47 46 NA A0001 NA <NA> NA
#> 48 47 NA A0001 NA <NA> NA
#> 49 48 NA A0001 NA <NA> NA