Calculating expected scores for macroinvertebrate indices using the RICT2 model
The predict_indices function mirrors the functionality of the RICT model available on the MS Azure platform (https://gallery.azure.ai/Experiment/RICT-package-2). Specifically, it uses environmental (ENV) data from Ecology Data Explorer to generate expected scores under minimally impacted reference conditions for 80 indices, plus probabilities for RIVPACS end-groups. Predictions are made with the rict_predict() function from the aquaMetrics rict package (https://github.com/aquaMetrics/rict). No classification is undertaken.
Usage
predict_indices(env_data = x, save = FALSE, save_dir = getwd())
Arguments
- env_data
- A data frame or tibble containing site-level environmental data in Environment Agency Ecology Data Explorer format (as produced by the import_env function). 
- file_format
- Format in which env_data is supplied: "EDE" - environmental data is formatted as downloaded from the EA's Ecology Data Explorer; "RICT" - environmental data is in the RICT template format. 
- save
- Specifies whether the expected indices data should be saved as an .rds file (for future use); Default = TRUE.
- save_dir
- Path to folder where expected indices data is to be saved; Default = Current working directory. 
- all_indices
- Boolean. TRUE - returns all indices in the output (default); FALSE - returns only WHPT indices (argument passed to rict::rict_predict).
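The save and save_dir arguments describe writing the results to an .rds file for later use. The sketch below illustrates only the general base R round trip with saveRDS() and readRDS(); the data frame and the file name are hypothetical stand-ins, since the file name written by predict_indices() is not stated here.

```r
# Hypothetical stand-in for the tibble returned by predict_indices();
# the values are made up for illustration.
expected <- data.frame(
  biol_site_id = c("77599", "10708"),
  p1 = c(0.2, 0.7),
  p2 = c(0.8, 0.3),
  stringsAsFactors = FALSE
)

# Hypothetical file name in a temporary folder (an assumption, not the
# name used by predict_indices()).
out_file <- file.path(tempdir(), "expected_indices_example.rds")

# Write the object to disk, then read it back in a later step or session.
saveRDS(expected, out_file)
reloaded <- readRDS(out_file)

# Confirm that the reloaded object matches what was saved.
all.equal(expected, reloaded)
```

Saving to .rds keeps column types intact, which a round trip through a text format such as CSV would not guarantee.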
Value
Tibble containing expected scores for macroinvertebrate indices plus end-group probabilities. The RICT Technical Specification and the RIVPACS IV End Group Descriptions are available at https://www.fba.org.uk/FBA/Public/Discover-and-Learn/Projects/User%20Guides.aspx
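As a rough illustration of working with the end-group probabilities, the sketch below finds the most probable end group for each site. It uses a toy data frame rather than real output, and it assumes that the probability columns are the ones named p1, p2, and so on, as suggested by the example output on this page.

```r
# Toy stand-in for predict_indices() output; the values are made up.
pred <- data.frame(
  biol_site_id = c("77599", "10708"),
  p1 = c(0.10, 0.60),
  p2 = c(0.70, 0.30),
  p3 = c(0.20, 0.10),
  stringsAsFactors = FALSE
)

# Assumption: columns named "p" followed by digits hold the
# end-group probabilities.
prob_cols <- grep("^p[0-9]+$", names(pred), value = TRUE)

# For each site (row), find the column with the highest probability.
best <- max.col(as.matrix(pred[prob_cols]), ties.method = "first")
pred$most_likely_group <- prob_cols[best]

pred[, c("biol_site_id", "most_likely_group")]
```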
Details
All data validation and transformation (conversion) are done in this function using functions predefined in HelperFunctionsv1.R. Predictions are made using PredictionfunctionsV2.R.
The function will modify the standard RICT output, renaming "SITE" as "biol_site_id" (standardised column header for biology sites).
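The renaming step can be sketched in base R as follows. This is a minimal illustration on a toy data frame, not the package's internal code, which may perform the rename differently.

```r
# Toy stand-in for raw RICT output; the values are made up.
rict_out <- data.frame(
  SITE = c("77599", "10708"),
  LATITUDE = c(50.7, 50.5),
  stringsAsFactors = FALSE
)

# Rename "SITE" to "biol_site_id", leaving all other columns untouched.
names(rict_out)[names(rict_out) == "SITE"] <- "biol_site_id"

names(rict_out)
```

A consistent biol_site_id column makes it straightforward to join the expected scores to other datasets keyed on the same biology site identifier.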
References
FBA, 2020. River Invertebrate Classification Tool (RICT2) User Guide V1.5. Available at: https://www.fba.org.uk/FBA/Public/Discover-and-Learn/Projects/User%20Guides.aspx
Examples
# Generate expected scores for macroinvertebrate indices, using environmental
# data downloaded from Ecology Data Explorer for the site(s) of interest.
# Save the dataset as an .rds file.
predict_indices(env_data = env_data,
                file_format = "EDE",
                save = TRUE)
#> Warning: There was 1 warning in `dplyr::summarise()`.
#> ℹ In argument: `across(.fns = (~sum(is.na(.x))))`.
#> Caused by warning:
#> ! Using `across()` without supplying `.cols` was deprecated in dplyr 1.1.0.
#> ℹ Please supply `.cols` instead.
#> Variables for the 'physical' model detected - applying relevant checks. 
#> Grid reference values detected for 'GB' - applying relevant checks.
#> Success, all validation checks passed!
#> Warning: row names were found from a short variable and have been discarded
#> # A tibble: 60 × 156
#>    biol_site_id LATITUDE LONGITUDE LOG.ALTITUDE LOG.DISTANCE.FROM.SO…¹ LOG.WIDTH
#>    <chr>           <dbl>     <dbl>        <dbl>                  <dbl>     <dbl>
#>  1 77599            50.7    -2.64         1.68                  -0.301     0.398
#>  2 10708            50.5    -4.50         2.26                   1.16      0.898
#>  3 34352            51.7     0.258        1.61                   1.46      0.602
#>  4 100582           50.9    -1.77         1.43                   1.02      0.826
#>  5 10992            51.2    -3.35         1.30                   1.17      0.806
#>  6 10784            50.4    -4.33         0.602                  1.13      0.643
#>  7 52504            53.3    -0.995        1.36                   1.34      0.845
#>  8 53819            53.2    -1.71         2.13                   1.37      1.18 
#>  9 54017            52.6    -2.40         1.53                   1.56      0.778
#> 10 34343            51.8    -0.148        1.69                   1.26      0.954
#> # ℹ 50 more rows
#> # ℹ abbreviated name: ¹LOG.DISTANCE.FROM.SOURCE
#> # ℹ 150 more variables: LOG.DEPTH <dbl>, MEAN.SUBSTRATUM <dbl>,
#> #   DISCHARGE.CATEGORY <dbl>, ALKALINITY <dbl>, LOG.ALKALINITY <dbl>,
#> #   LOG.SLOPE <dbl>, MEAN.AIR.TEMP <dbl>, AIR.TEMP.RANGE <dbl>, p1 <dbl>,
#> #   p2 <dbl>, p3 <dbl>, p4 <dbl>, p5 <dbl>, p6 <dbl>, p7 <dbl>, p8 <dbl>,
#> #   p9 <dbl>, p10 <dbl>, p11 <dbl>, p12 <dbl>, p13 <dbl>, p14 <dbl>, …