Processes GBIF (Global Biodiversity Information Facility) occurrence data: requests, downloads, cleans, and saves the data in several formats. The function handles data chunking and merging, and produces summary maps as well as species-specific data.

Usage

GBIF_Process(
  FromHPC = TRUE,
  EnvFile = ".env",
  Renviron = ".Renviron",
  NCores = 6,
  RequestData = TRUE,
  DownloadData = TRUE,
  SplitChunks = TRUE,
  Overwrite = FALSE,
  DeleteChunks = TRUE,
  ChunkSize = 50000,
  Boundaries = c(-30, 50, 25, 75),
  StartYear = 1981
)

Arguments

FromHPC

Logical. Whether the function is run on an HPC (high-performance computing) system; file paths are adjusted accordingly. Defaults to TRUE.

EnvFile

Character. The path to the environment file containing variables required by the function. Default is ".env".

Renviron

Character. The path to the .Renviron file containing GBIF login credentials (email, user, password). Defaults to ".Renviron".
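As a rough illustration of how credentials in a .Renviron file can be loaded and checked, the sketch below writes a throwaway file and reads it back with base R. The variable names GBIF_EMAIL, GBIF_USER, and GBIF_PWD follow the rgbif convention and are an assumption here; this page does not state which names GBIF_Process expects.

```r
# Minimal sketch: load GBIF credentials from a .Renviron-style file.
# ASSUMPTION: the variable names below follow the rgbif convention; they are
# not confirmed by this page. The values are placeholders.
renviron_path <- tempfile(fileext = ".Renviron")
writeLines(
  c(
    "GBIF_EMAIL=user@example.org",
    "GBIF_USER=example_user",
    "GBIF_PWD=example_password"
  ),
  renviron_path
)

# readRenviron() sets the variables for the current R session
ok <- readRenviron(renviron_path)

# Check that all three credentials are now available
creds <- Sys.getenv(c("GBIF_EMAIL", "GBIF_USER", "GBIF_PWD"))
missing_vars <- names(creds)[creds == ""]
if (length(missing_vars) > 0) {
  stop("Missing GBIF credentials: ", paste(missing_vars, collapse = ", "))
}
```

Checking for empty values up front gives a clearer error than a failed login later in a long-running job.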

NCores

Numeric. Number of cores to use for parallel processing. Defaults to 6.

RequestData

Logical. If TRUE, requests data from GBIF. If FALSE, loads a previously requested data set from GBIF_Request.RData and StatusDetailed.RData files. Defaults to TRUE.

DownloadData

Logical. If TRUE, downloaded data is stored on disk. Defaults to TRUE.

SplitChunks

Logical. If TRUE, splits the downloaded data into smaller chunks for easier processing. Defaults to TRUE.

Overwrite

Logical. Whether to reprocess a chunk file that has already been processed and saved as an *.RData file. Keeping this FALSE allows a run to resume from previously processed chunks if an earlier attempt failed, e.g. due to memory issues. Defaults to FALSE.
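The resume behaviour described for Overwrite can be sketched as a skip-if-exists check. This is a hypothetical illustration, not the package's implementation: the helper process_chunk(), the tab-separated chunk format, and the output file naming are all assumptions.

```r
# Hypothetical helper: process a chunk only if its .RData output is missing,
# unless overwrite = TRUE. Not the actual GBIF_ReadChunk implementation.
process_chunk <- function(chunk_file, overwrite = FALSE) {
  out_file <- paste0(tools::file_path_sans_ext(chunk_file), ".RData")
  if (file.exists(out_file) && !overwrite) {
    return("skipped")
  }
  # ASSUMPTION: chunks are tab-separated text files with a header row
  chunk_data <- utils::read.delim(chunk_file, stringsAsFactors = FALSE)
  save(chunk_data, file = out_file)
  "processed"
}

# Demo with a small temporary chunk file
chunk_dir <- tempfile("chunks_")
dir.create(chunk_dir)
chunk_file <- file.path(chunk_dir, "chunk_001.txt")
utils::write.table(
  data.frame(species = c("A", "B"), year = c(1990, 2005)),
  chunk_file, sep = "\t", row.names = FALSE, quote = FALSE
)

first <- process_chunk(chunk_file)                    # no output yet, so it runs
second <- process_chunk(chunk_file)                   # output exists, so it is skipped
third <- process_chunk(chunk_file, overwrite = TRUE)  # forced reprocessing
```

With this pattern, rerunning after a crash only repeats the work for chunks that never produced an output file.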

DeleteChunks

Logical. If TRUE, deletes the chunk files. Defaults to TRUE.

ChunkSize

Integer. The number of records per chunk when splitting the data. Default is 50,000.
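To show what a chunk size of 50,000 means in practice, the following sketch splits a set of row indices into consecutive chunks. The helper chunk_rows() is hypothetical and operates on indices in memory; how GBIF_Download actually splits the downloaded file is not described on this page.

```r
# Hypothetical helper: split n_records row indices into consecutive chunks
# of at most chunk_size records each.
chunk_rows <- function(n_records, chunk_size = 50000L) {
  if (n_records == 0) {
    return(list())
  }
  idx <- seq_len(n_records)
  # ceiling(idx / chunk_size) assigns each row to chunk 1, 2, 3, ...
  unname(split(idx, ceiling(idx / chunk_size)))
}

chunks <- chunk_rows(120000L)
chunk_lengths <- lengths(chunks)  # two full chunks and one partial chunk
```

Smaller chunks lower peak memory use per worker at the cost of more files to read and merge.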

Boundaries

Numeric vector of length 4. Specifies geographical boundaries for the requested GBIF data in the order: Left, Right, Bottom, Top. Defaults to c(-30, 50, 25, 75).
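Because the order of the Boundaries vector (Left, Right, Bottom, Top) is easy to mix up, the sketch below names the four elements and applies them as a bounding-box filter to a few made-up points. The column names and the inclusive comparison are assumptions for illustration; in the function itself the boundaries define the GBIF request rather than a filter on a local data frame.

```r
# Default boundaries, in the documented order: Left, Right, Bottom, Top
boundaries <- c(-30, 50, 25, 75)
names(boundaries) <- c("left", "right", "bottom", "top")

# Made-up occurrence points (column names are assumptions)
occ <- data.frame(
  species = c("A", "B", "C", "D"),
  longitude = c(10.5, -45.0, 30.2, 49.9),
  latitude = c(52.1, 40.0, 80.3, 25.0),
  stringsAsFactors = FALSE
)

# Keep points inside the box (edges treated as inclusive here)
inside <- occ$longitude >= boundaries[["left"]] &
  occ$longitude <= boundaries[["right"]] &
  occ$latitude >= boundaries[["bottom"]] &
  occ$latitude <= boundaries[["top"]]
occ_in_bounds <- occ[inside, ]
```

Here point B lies west of the left boundary and point C lies north of the top boundary, so only A and D are kept.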

StartYear

Numeric. The starting year for the occurrence data. Only records from this year onward are requested from GBIF. Defaults to 1981, which matches the year range of the CHELSA current climate data.

Value

Saves multiple RData and Excel files containing the processed data and summary maps. Also saves a JPEG file with summary plots.

Note

This function depends on the following functions: GBIF_Download for requesting, downloading and splitting data into chunks; GBIF_ReadChunk to process chunk files; and GBIF_SpData to prepare species-specific data.

Author

Ahmed El-Gabbas