This function processes GBIF (Global Biodiversity Information Facility) occurrence data: it requests and downloads the data, cleans it, and saves the results in several formats. It also handles data chunking and merging, and produces summary maps and species-specific data.
Usage
GBIF_Process(
  FromHPC = TRUE,
  EnvFile = ".env",
  Renviron = ".Renviron",
  NCores = 6,
  RequestData = TRUE,
  DownloadData = TRUE,
  SplitChunks = TRUE,
  Overwrite = FALSE,
  DeleteChunks = TRUE,
  ChunkSize = 50000,
  Boundaries = c(-30, 50, 25, 75),
  StartYear = 1981
)
Arguments
- FromHPC
Logical. Whether the work is being done from HPC, so that file paths can be adjusted accordingly. Defaults to TRUE.
- EnvFile
Character. The path to the environment file containing variables required by the function. Defaults to ".env".
- Renviron
Character. The path to the .Renviron file containing GBIF login credentials (email, user, password). Defaults to ".Renviron".
- NCores
Numeric. Number of cores to use for parallel processing. Defaults to 6.
- RequestData
Logical. If TRUE, requests data from GBIF. If FALSE, loads a previously requested data set from the GBIF_Request.RData and StatusDetailed.RData files. Defaults to TRUE.
- DownloadData
Logical. If TRUE, downloaded data is stored on disk. Defaults to TRUE.
- SplitChunks
Logical. If TRUE, splits the downloaded data into smaller chunks for easier processing. Defaults to TRUE.
- Overwrite
Logical. Whether to reprocess a chunk file that has already been processed and saved as an *.RData file. Leaving this as FALSE lets a new run continue from previously processed chunks if an earlier attempt failed, e.g. due to memory issues. Defaults to FALSE.
- DeleteChunks
Logical. If TRUE, deletes the chunk files. Defaults to TRUE.
- ChunkSize
Integer. The number of records per chunk when splitting the data. Defaults to 50,000.
- Boundaries
Numeric vector of length 4. Specifies geographical boundaries for the requested GBIF data in the order: Left, Right, Bottom, Top. Defaults to c(-30, 50, 25, 75).
- StartYear
Numeric. The starting year for the occurrence data. Only records from this year onward will be requested from GBIF. Defaults to 1981, which matches the year range of the CHELSA current climate data.
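The Left, Right, Bottom, Top ordering of Boundaries is easy to mix up, so the sketch below shows one way to turn such a vector into a WKT polygon, a geometry format that GBIF's occurrence API accepts. This page does not say how GBIF_Process passes the boundaries to GBIF internally; BoundariesToWKT is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (an assumption, not a package function): convert a
# c(Left, Right, Bottom, Top) vector into a WKT polygon string.
BoundariesToWKT <- function(Boundaries) {
  stopifnot(is.numeric(Boundaries), length(Boundaries) == 4)
  Left <- Boundaries[1]
  Right <- Boundaries[2]
  Bottom <- Boundaries[3]
  Top <- Boundaries[4]
  # Guard against swapped values, e.g. Bottom given before Right
  stopifnot(Left < Right, Bottom < Top)

  # Vertices in counter-clockwise order, closing on the first vertex
  X <- c(Left, Right, Right, Left, Left)
  Y <- c(Bottom, Bottom, Top, Top, Bottom)
  paste0("POLYGON((", paste(X, Y, collapse = ", "), "))")
}

BoundariesToWKT(c(-30, 50, 25, 75))
# "POLYGON((-30 25, 50 25, 50 75, -30 75, -30 25))"
```

With the default value, the polygon spans longitudes -30 to 50 and latitudes 25 to 75.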
Value
Saves multiple RData and Excel files containing the processed data and summary maps. Also saves a JPEG file with summary plots.
Note
This function depends on the following functions: GBIF_Download for requesting, downloading and splitting data into chunks; GBIF_ReadChunk to process chunk files; and GBIF_SpData to prepare species-specific data.
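The way Overwrite allows an interrupted run to resume can be illustrated with a minimal sketch. It is not the package's implementation: ProcessChunk and ProcessAllChunks are hypothetical stand-ins (ProcessChunk is not GBIF_ReadChunk), and the chunk file names and the one-output-per-chunk naming scheme are assumptions made for the example.

```r
# Hypothetical stand-in for the real chunk reader; it only reads lines.
ProcessChunk <- function(ChunkFile) {
  readLines(ChunkFile)
}

# Process each chunk unless its *.RData output already exists.
ProcessAllChunks <- function(ChunkFiles, Overwrite = FALSE) {
  Processed <- character(0)
  for (ChunkFile in ChunkFiles) {
    # Assumed naming scheme: same path, with an .RData extension
    OutFile <- paste0(tools::file_path_sans_ext(ChunkFile), ".RData")
    if (file.exists(OutFile) && !Overwrite) {
      next  # already processed in an earlier run; skip it
    }
    ChunkData <- ProcessChunk(ChunkFile)
    save(ChunkData, file = OutFile)
    Processed <- c(Processed, ChunkFile)
  }
  Processed  # paths of the chunks processed in this call
}

# Demonstration in a temporary directory with two placeholder chunks
TmpDir <- tempfile("Chunks")
dir.create(TmpDir)
ChunkFiles <- file.path(TmpDir, c("Chunk_1.txt", "Chunk_2.txt"))
for (File in ChunkFiles) writeLines("placeholder record", File)

FirstRun <- ProcessAllChunks(ChunkFiles)   # processes both chunks
SecondRun <- ProcessAllChunks(ChunkFiles)  # skips both: outputs exist
```

In this sketch, a second call with Overwrite = FALSE does no work because every output file is already present, while Overwrite = TRUE would reprocess all chunks.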