Model pipeline for Hmsc analysis — Mod

This function sets up and runs an analysis pipeline for Hmsc models. It includes steps for environment setup, loading packages, managing SLURM refits, merging MCMC chains, convergence diagnostics, model summaries, spatial predictions, response curve generation, and variance partitioning.

Usage

Mod_Postprocess(
  ModelDir = NULL,
  Hab_Abb = NULL,
  NCores = 8L,
  FromHPC = TRUE,
  EnvFile = ".env",
  Path_Hmsc = NULL,
  MemPerCpu = NULL,
  Time = NULL,
  FromJSON = FALSE,
  GPP_Dist = NULL,
  Tree = "Tree",
  Samples = 1000L,
  Thin = NULL,
  N_Grid = 50L,
  NOmega = 1000L,
  UseTF = TRUE,
  TF_Environ = NULL,
  TF_use_single = FALSE,
  LF_NCores = NCores,
  LF_Check = FALSE,
  LF_Temp_Cleanup = TRUE,
  Temp_Cleanup = TRUE,
  CC_Models = c("GFDL-ESM4", "IPSL-CM6A-LR", "MPI-ESM1-2-HR", "MRI-ESM2-0",
    "UKESM1-0-LL"),
  CC_Scenario = c("ssp126", "ssp370", "ssp585"),
  Pred_Clamp = TRUE,
  Fix_Efforts = "q90",
  Fix_Rivers = "q90",
  Pred_NewSites = TRUE,
  CVName = c("CV_Dist", "CV_Large")
)
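
Several arguments in the signature above default to NULL but, per their descriptions under Arguments, must be supplied: MemPerCpu and Time always, TF_Environ when UseTF is TRUE, and Fix_Efforts when Pred_Clamp is TRUE. The following is a minimal base R sketch of a pre-flight check for those rules. The helper check_postprocess_args is hypothetical and not part of the package; it only mirrors the documented requirements and does not reproduce the function's own validation or error messages.

```r
# Hypothetical helper (not part of the package): collects the documented
# "must be provided" rules into one pre-flight check and returns the
# problems found as a character vector (empty when everything is set).
check_postprocess_args <- function(
    MemPerCpu = NULL, Time = NULL, UseTF = TRUE, TF_Environ = NULL,
    Pred_Clamp = TRUE, Fix_Efforts = "q90") {
  problems <- character(0)
  if (is.null(MemPerCpu)) {
    problems <- c(problems, "MemPerCpu must be provided, e.g. \"32G\"")
  }
  if (is.null(Time)) {
    problems <- c(problems, "Time must be provided, e.g. \"01:00:00\"")
  }
  if (isTRUE(UseTF) && is.null(TF_Environ)) {
    problems <- c(problems, "TF_Environ is required when UseTF is TRUE")
  }
  if (isTRUE(Pred_Clamp) && is.null(Fix_Efforts)) {
    problems <- c(problems, "Fix_Efforts is required when Pred_Clamp is TRUE")
  }
  problems
}

# With the defaults from the usage block, MemPerCpu, Time and TF_Environ
# are still missing
check_postprocess_args()

# A complete set returns character(0); the environment path is a placeholder
check_postprocess_args(
  MemPerCpu = "32G", Time = "01:00:00", TF_Environ = "path/to/python_env")
```

Returning all problems at once, rather than stopping at the first, is a design choice of this sketch: it lets a job script report every missing setting before anything is submitted.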

Arguments

ModelDir: String. Path to the root directory of the fitted models, without a trailing slash. Two folders will be created, Model_Fitted and Model_Coda, to store the merged model and coda objects, respectively.
Hab_Abb: Character. Habitat abbreviation indicating the specific SynHab habitat type for which data will be prepared. Valid values are 0, 1, 2, 3, 4a, 4b, 10, 12a, 12b. For more details, see Pysek et al.
NCores: Integer specifying the number of cores to use for parallel processing. Default: 8 cores.
FromHPC: Logical indicating whether the work is being done from HPC, to adjust file paths accordingly. Default: TRUE.
EnvFile: Character. Path to the environment file containing paths to data sources. Defaults to .env.
Path_Hmsc: String. Path for the Hmsc-HPC.
MemPerCpu: String. Memory per CPU allocation for the SLURM job. Example: 32G for 32 gigabytes. Defaults to NULL. If not provided, the function will throw an error.
Time: String. Duration for which the job should run. Example: 01:00:00 for one hour. If not provided, the function will throw an error.
FromJSON: Logical. Indicates whether to convert loaded models from JSON format before reading. Defaults to FALSE.
GPP_Dist: Integer specifying the distance in kilometers between knots for GPP models.
Tree: Character string specifying whether a phylogenetic tree was used in the model. Valid values are "Tree" or "NoTree". Default is "Tree".
Samples: Integer specifying the number of MCMC samples in the selected model. Defaults to 1000.
Thin: Integer specifying the value for thinning in the selected model.
N_Grid: Integer specifying the number of points along the gradient for continuous focal variables. Defaults to 50. See Hmsc::constructGradient for more details.
NOmega: Integer specifying the number of species to be sampled for the Omega parameter transformation. Defaults to 1000.
UseTF: Logical indicating whether to use TensorFlow for calculations. Defaults to TRUE.
TF_Environ: Character string specifying the path to the Python environment. Defaults to NULL. This argument is required if UseTF is TRUE.
TF_use_single: Logical indicating whether to use single precision for the TF calculations. Defaults to FALSE.
LF_NCores: Integer specifying the number of cores to use for parallel processing. Defaults to the value of NCores (8 unless NCores is changed).
LF_Check: Logical. If TRUE, the function checks if the output files are already created and valid. If FALSE, the function will only check if the files exist without checking their integrity. Default is FALSE.
LF_Temp_Cleanup: Logical indicating whether to delete temporary files in the Temp_Dir after finishing the LF predictions.
Temp_Cleanup: Logical indicating whether to clean up temporary files. Defaults to TRUE.
CC_Models: Character vector. Specifies the climate models for future predictions. Default: c("GFDL-ESM4", "IPSL-CM6A-LR", "MPI-ESM1-2-HR", "MRI-ESM2-0", "UKESM1-0-LL"). Note: This parameter is temporary and may be removed in future updates.
CC_Scenario: Character vector. Specifies the climate scenarios for future predictions. Default: c("ssp126", "ssp370", "ssp585"). Note: This parameter is temporary and may be removed in future updates.
Pred_Clamp: Logical indicating whether to clamp the sampling efforts at a single value. Defaults to TRUE. If TRUE, the Fix_Efforts argument must be provided.
Fix_Efforts: Numeric or character. Defines the value used to fix sampling efforts that are less than the provided value. If numeric, the value is used directly (log₁₀ scale). If character, it can be median, mean, max, or q90 (90% quantile). Using max can reflect extreme values caused by rare, highly sampled locations (e.g., urban centers or popular natural reserves), whereas the 90% quantile avoids such extreme grid cells while still capturing areas with high sampling effort. This argument is mandatory when Pred_Clamp is set to TRUE.
Fix_Rivers: Numeric or character. Similar to Fix_Efforts, but for fixing the length of rivers. If numeric, the value is used directly (log₁₀ scale). If character, it can be median, mean, max, or q90 (90% quantile). It can also be NULL, in which case the river length predictor is not fixed. Defaults to q90.
Pred_NewSites: Logical indicating whether to predict habitat suitability at new sites. Default: TRUE. Note: This parameter is temporary and will be removed in future updates.
CVName: Character vector specifying the name of the column(s) in the model input data (see Mod_PrepData and GetCV) to be used to cross-validate the models. More than one way of assigning grid cells to cross-validation folds can be used: if multiple names are provided, separate cross-validation models will be fitted for each column. Currently, there are three cross-validation strategies, created using Mod_PrepData: CV_SAC, CV_Dist, and CV_Large (see GetCV). Defaults to c("CV_Dist", "CV_Large").
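
The keyword options accepted by Fix_Efforts and Fix_Rivers (median, mean, max, q90, a numeric value, or NULL for Fix_Rivers) can be illustrated with a small base R sketch. The helper resolve_fix_value and the toy data are hypothetical and only show how each option might translate into a single number; how the package computes the value internally, and how it then applies it to the prediction data, is not covered here.

```r
# Hypothetical helper (not part of the package): turn a Fix_Efforts or
# Fix_Rivers style argument into a single number, given the predictor
# values (assumed here to be already on the log10 scale).
resolve_fix_value <- function(fix, values) {
  if (is.null(fix)) return(NULL)       # NULL: predictor is left unfixed
  if (is.numeric(fix)) return(fix[1])  # numeric: used directly
  switch(
    fix,
    median = stats::median(values, na.rm = TRUE),
    mean   = mean(values, na.rm = TRUE),
    max    = max(values, na.rm = TRUE),
    q90    = unname(stats::quantile(values, probs = 0.9, na.rm = TRUE)),
    stop("fix must be numeric or one of: median, mean, max, q90")
  )
}

# Toy log10 sampling efforts with one extreme, heavily sampled grid cell
efforts <- c(0.5, 1.0, 1.2, 1.5, 1.8, 2.0, 2.1, 2.3, 2.5, 5.0)

resolve_fix_value("max", efforts)  # driven entirely by the extreme cell
resolve_fix_value("q90", efforts)  # high, but below the extreme cell
resolve_fix_value(2, efforts)      # numeric input is returned unchanged
```

In this toy vector the maximum is set by a single outlying cell, while the 90% quantile stays well below it, which is the behaviour the Fix_Efforts description gives as the reason for preferring q90.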

Author

Ahmed El-Gabbas