This function processes environmental and species presence data to prepare habitat-specific datasets for use in Hmsc models. It checks input arguments, reads environment variables from a file, verifies paths, loads species data and filters it by habitat type and by the minimum number of presence grid cells per species, and merges various environmental layers (e.g., CHELSA bioclimatic variables, habitat coverage, road and railway intensity, sampling efforts) into a single dataset. The processed data is saved to disk as an *.RData file.

Usage

Mod_PrepData(
  Hab_Abb = NULL,
  MinEffortsSp = 100L,
  ExcludeCult = TRUE,
  ExcludeZeroHabitat = TRUE,
  PresPerSpecies = 80L,
  EnvFile = ".env",
  Path_Model = NULL,
  VerboseProgress = TRUE,
  FromHPC = TRUE
)

Arguments

Hab_Abb

Character. Abbreviation for the habitat type (based on SynHab) for which to prepare data. Valid values are 0, 1, 2, 3, 4a, 4b, 10, 12a, 12b. If Hab_Abb = 0, data is prepared irrespective of the habitat type. For more details, see Pysek et al.

MinEffortsSp

Integer. The minimum number of vascular plant species per grid cell (from GBIF data) required for a grid cell to be included in the models. This excludes grid cells with very low sampling effort. Defaults to 100.

ExcludeCult

Logical. Indicates whether to exclude countries with cultivated or casual observations per species. Defaults to TRUE.

ExcludeZeroHabitat

Logical. Indicates whether to exclude grid cells with zero habitat coverage. Defaults to TRUE.

PresPerSpecies

Integer. The minimum number of presence grid cells for a species to be included in the analysis. The number of presence grid cells per species is calculated after discarding grid cells with low sampling efforts (MinEffortsSp). Defaults to 80.

EnvFile

Character. Path to the environment file containing paths to data sources. Defaults to .env.

Path_Model

Character. Path where the output file should be saved.

VerboseProgress

Logical. Indicates whether progress messages should be displayed. Defaults to TRUE.

FromHPC

Logical. Indicates whether the work is being done from HPC, so that file paths are adjusted accordingly. Defaults to TRUE.
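
The order in which MinEffortsSp and PresPerSpecies are applied matters, because presences are counted only after low-effort grid cells are discarded. The following is a minimal sketch of that two-step filter on toy data, not the function's internal code. The data frame layout, the column names (`CellCode`, `n_species`, `Sp_A`, `Sp_B`), and the lowered thresholds are all assumptions made for illustration.

```r
# Toy grid-cell table (hypothetical layout): one row per grid cell, the number
# of vascular plant species recorded in GBIF, and 0/1 presence columns.
cells <- data.frame(
  CellCode  = paste0("C", 1:6),
  n_species = c(150L, 20L, 300L, 120L, 5L, 200L),
  Sp_A      = c(1L, 1L, 1L, 0L, 1L, 1L),
  Sp_B      = c(0L, 1L, 0L, 0L, 1L, 1L)
)

min_efforts_sp   <- 100L  # plays the role of MinEffortsSp
pres_per_species <- 2L    # plays the role of PresPerSpecies (lowered for toy data)

# Step 1: discard grid cells with low sampling effort
cells_kept <- cells[cells$n_species >= min_efforts_sp, ]

# Step 2: count presences per species in the remaining grid cells only
sp_cols <- c("Sp_A", "Sp_B")
n_pres  <- colSums(cells_kept[, sp_cols, drop = FALSE])

# Step 3: keep species that reach the minimum number of presence grid cells
sp_kept <- names(n_pres)[n_pres >= pres_per_species]
print(sp_kept)
```

In this toy example `Sp_B` is present in three grid cells overall, but two of them fall below the effort threshold, so it is dropped once presences are counted on the filtered cells.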

Value

A tibble containing the modelling data.

Details

The function reads the following environment variables:

  • DP_R_Grid (if FromHPC = TRUE) or DP_R_Grid_Local (if FromHPC = FALSE): The function reads the content of the Grid_10_Land_Crop.RData and Grid_10_Land_Crop_sf_Country.RData files.

  • DP_R_Grid_Ref or DP_R_Grid_Ref_Local: The function reads the content of the Grid_10_sf.RData file from this path.

  • DP_R_PA or DP_R_PA_Local: The function reads the contents of the Sp_PA_Summary_DF.RData file from this path.

  • DP_R_CLC_Summary / DP_R_CLC_Summary_Local: Path containing the PercCov_SynHab_Crop.RData file. This file contains maps for the percentage coverage of each SynHab habitat type per grid cell.

  • DP_R_CHELSA_Output / DP_R_CHELSA_Output_Local: Path for processed CHELSA data.

  • DP_R_Roads / DP_R_Roads_Local: Path for processed road data. The function reads the Road_Length.RData file, which contains the total length of any road type per grid cell.

  • DP_R_Railway / DP_R_Railway_Local: Path for processed railway data. The function reads the Railway_Length.RData file, which contains the total length of any railway type per grid cell.

  • DP_R_Efforts / DP_R_Efforts_Local: Path for the processed sampling efforts analysis. The function reads the Bias_GBIF_SummaryR.RData file, which contains the total number of GBIF vascular plant observations per grid cell.
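
The naming pattern in the list above (a base name on HPC, the same name with a `_Local` suffix otherwise) can be sketched as follows. This is only an illustration under stated assumptions: `get_env_path()` is a hypothetical helper, not part of the package, the paths are placeholders, and the variables are set with `Sys.setenv()` as a stand-in for whatever mechanism the function uses to load EnvFile.

```r
# Hypothetical helper: pick the variable name according to FromHPC and
# return its value from the current R session's environment variables.
get_env_path <- function(var_base, from_hpc = TRUE) {
  var_name <- if (from_hpc) var_base else paste0(var_base, "_Local")
  value <- Sys.getenv(var_name, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable not set: ", var_name, call. = FALSE)
  }
  value
}

# Placeholder values (not real paths), standing in for the contents of EnvFile
Sys.setenv(
  DP_R_Roads       = "/hpc/placeholder/roads",
  DP_R_Roads_Local = "local/placeholder/roads"
)

# Resolve the road data directory for a local (non-HPC) run
path_roads <- get_env_path("DP_R_Roads", from_hpc = FALSE)
road_file  <- file.path(path_roads, "Road_Length.RData")
print(road_file)
```

Failing early when a variable is missing or empty mirrors the path verification step mentioned in the description, although the actual checks performed by the function may differ.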

The current models are fitted for 8 habitat types (see Pysek et al.):

  • 1. Forests – closed vegetation dominated by deciduous or evergreen trees

  • 2. Open forests – woodlands with canopy openings created by environmental stress or disturbance, including forest edges

  • 3. Scrub – shrublands maintained by environmental stress (aridity) or disturbance

  • 4a. Natural grasslands – grasslands maintained by climate (aridity, unevenly distributed precipitation), herbivores or environmental stress (aridity, instability or toxicity of substrate)

  • 4b. Human-maintained grasslands – grasslands dependent on regular human-induced management (mowing, grazing by livestock, artificial burning)

  • 10. Wetland – sites with the permanent or seasonal influence of moisture, ranging from oligotrophic to eutrophic

  • 12a. Ruderal habitats – anthropogenically disturbed or eutrophicated sites, where the anthropogenic disturbance or fertilization is typically a side-product and not the aim of the management

  • 12b. Agricultural habitats – synanthropic habitats directly associated with growing of agricultural products, thus dependent on specific type of management (ploughing, fertilization)

The following habitat types are excluded from the analysis:

  • 5. Sandy – dunes and other habitats on unstable sandy substrate, stressed by low nutrients, drought and disturbed by sand movement

  • 6. Rocky – cliffs and rock outcrops with very shallow or no soil

  • 7. Dryland – habitats in which drought stress limits vegetation development

  • 8. Saline – habitats stressed by high soil salinity

  • 9. Riparian – a mosaic of wetlands, grasslands, tall-forb stands, scrub and open forests in stream corridors

  • 11. Aquatic – water bodies and streams with submerged and floating plant species
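
The habitat lists above map directly onto the valid values of Hab_Abb. The sketch below shows one way to check a value and translate it into a habitat name before calling the function. The helper `check_hab_abb()` and the label used for `0` are hypothetical; the function's own argument checks may be implemented differently.

```r
# Names of the eight modelled habitat types, keyed by their abbreviation
hab_names <- c(
  "1"   = "Forests",
  "2"   = "Open forests",
  "3"   = "Scrub",
  "4a"  = "Natural grasslands",
  "4b"  = "Human-maintained grasslands",
  "10"  = "Wetland",
  "12a" = "Ruderal habitats",
  "12b" = "Agricultural habitats"
)

# Hypothetical validator: accepts 0 or one of the modelled habitat types and
# returns a readable label; excluded types (5 to 9, 11) raise an error.
check_hab_abb <- function(hab_abb) {
  hab_abb <- as.character(hab_abb)
  valid <- c("0", names(hab_names))
  if (length(hab_abb) != 1L || is.na(hab_abb) || !hab_abb %in% valid) {
    stop("Hab_Abb must be one of: ", paste(valid, collapse = ", "), call. = FALSE)
  }
  # "All habitats" is an illustrative label for Hab_Abb = 0
  if (hab_abb == "0") "All habitats" else unname(hab_names[hab_abb])
}

print(check_hab_abb("4a"))
print(check_hab_abb(0))
```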

Author

Ahmed El-Gabbas