Extracts, processes, and visualizes occurrence data from the Global Biodiversity Information Facility (GBIF) for the Invasive Alien Species prototype Digital Twin (IAS-pDT). GBIF_Process() orchestrates the workflow: it requests, downloads, cleans, chunks, and maps species data by calling the helper functions documented on this page.

Usage

GBIF_Process(
  EnvFile = ".env",
  Renviron = ".Renviron",
  NCores = 6L,
  Request = TRUE,
  Download = TRUE,
  SplitChunks = TRUE,
  Overwrite = FALSE,
  DeleteChunks = TRUE,
  ChunkSize = 50000L,
  Boundaries = c(-30, 50, 25, 75),
  StartYear = 1981L
)

GBIF_Check(Renviron = ".Renviron")

GBIF_Download(
  EnvFile = ".env",
  Renviron = ".Renviron",
  Request = TRUE,
  Download = TRUE,
  SplitChunks = TRUE,
  ChunkSize = 50000L,
  Boundaries = c(-30, 50, 25, 75),
  StartYear = 1981L
)

GBIF_ReadChunk(
  ChunkFile,
  EnvFile = ".env",
  MaxUncert = 10L,
  StartYear = 1981L,
  SaveRData = TRUE,
  ReturnData = FALSE,
  Overwrite = FALSE
)

GBIF_SpData(Species = NULL, EnvFile = ".env", Verbose = TRUE, PlotTag = NULL)

Arguments

EnvFile

Character. Path to the environment file containing paths to data sources. Defaults to .env.

Renviron

Character. Path to .Renviron file with GBIF credentials (GBIF_EMAIL, GBIF_USER, GBIF_PWD). Default: ".Renviron". The credentials must be in the format:

  • GBIF_EMAIL=your_email

  • GBIF_USER=your_username

  • GBIF_PWD=your_password
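As a rough illustration of what a credential check involves, the sketch below uses only base R to load a .Renviron-style file and test that the three variables are non-empty. The helper name has_gbif_credentials() is hypothetical and is not part of the package; GBIF_Check() may perform additional or different checks.

```r
# Hypothetical helper (not the package's GBIF_Check()): reports whether
# the three GBIF credential variables are set to non-empty values.
has_gbif_credentials <- function(renviron = NULL) {
  # Optionally load variables from a .Renviron-style file first
  if (!is.null(renviron) && file.exists(renviron)) {
    readRenviron(renviron)
  }
  vars <- c("GBIF_EMAIL", "GBIF_USER", "GBIF_PWD")
  values <- Sys.getenv(vars, unset = "")
  # TRUE only if every variable has a non-empty value
  all(nzchar(values))
}
```

Calling has_gbif_credentials(".Renviron") before a long download run gives an early failure if a variable is missing or misspelled.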

NCores

Integer. Number of CPU cores to use for parallel processing. Default: 6.

Request

Logical. If TRUE (default), requests GBIF data; otherwise, loads from disk.

Download

Logical. If TRUE (default), downloads and saves GBIF data.

SplitChunks

Logical. If TRUE (default), splits data into chunks for easier processing.

Overwrite

Logical. If TRUE, reprocesses existing .RData chunks. Default: FALSE. Keeping the default lets a run that previously failed (e.g., due to memory limits) resume without reprocessing chunks that were already completed.

DeleteChunks

Logical. If TRUE (default), deletes chunk files.

ChunkSize

Integer. Records per data chunk. Default: 50000.
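To make the effect of ChunkSize concrete, the following sketch computes how many chunks a download would produce and how records would be assigned to them. The record count is invented for illustration, and the package may split the downloaded file by a different mechanism.

```r
# Illustrative numbers only; n_records is not a real download size
n_records <- 120000L
chunk_size <- 50000L

# Number of chunks needed to hold all records
n_chunks <- ceiling(n_records / chunk_size)

# Chunk index for each record (1, 1, ..., 2, 2, ..., 3)
chunk_id <- ceiling(seq_len(n_records) / chunk_size)

# Records per chunk: the last chunk holds the remainder
chunk_sizes <- as.integer(table(chunk_id))
```

With these values there are three chunks, and the last one is smaller than ChunkSize.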

Boundaries

Numeric vector (length 4). GBIF data bounds (Left, Right, Bottom, Top). Default: c(-30, 50, 25, 75).
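The order of the four values is easy to mix up, so the sketch below names the elements and tests whether points fall inside the box. It assumes Left/Right are longitudes and Bottom/Top are latitudes in decimal degrees, which the default values suggest but this page does not state. The helper in_bounds() is hypothetical.

```r
# Default bounds from the Usage section, with names added for clarity
bounds <- c(-30, 50, 25, 75)
names(bounds) <- c("Left", "Right", "Bottom", "Top")

# Hypothetical helper: TRUE for points inside the bounding box
# (assumes lon/lat in decimal degrees)
in_bounds <- function(lon, lat, b) {
  lon >= b[["Left"]] & lon <= b[["Right"]] &
    lat >= b[["Bottom"]] & lat <= b[["Top"]]
}

# First point lies inside the box, second lies west of the Left bound
in_bounds(lon = c(10, -60), lat = c(50, 50), b = bounds)
```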

StartYear

Integer. Earliest collection year to include. Default: 1981.

ChunkFile

Character. Path of chunk file for processing.

MaxUncert

Numeric. Maximum spatial uncertainty in kilometers. Default: 10.
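MaxUncert is given in kilometers, whereas the Darwin Core field coordinateUncertaintyInMeters used in GBIF exports is in meters, so a unit conversion is needed somewhere. The sketch below shows one way such a filter could look on toy data. Whether the package filters on this exact field, and how it treats records with missing uncertainty, are assumptions here.

```r
# Toy occurrence table; values are invented for illustration
occ <- data.frame(
  species = c("A", "B", "C"),
  coordinateUncertaintyInMeters = c(500, 25000, NA),
  stringsAsFactors = FALSE
)

max_uncert_km <- 10

# Convert the threshold from km to m before comparing.
# Keeping records with missing uncertainty is an assumption.
keep <- is.na(occ$coordinateUncertaintyInMeters) |
  occ$coordinateUncertaintyInMeters <= max_uncert_km * 1000

occ_filtered <- occ[keep, ]
```

Here the record with 25 km uncertainty is dropped, and the other two are kept.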

SaveRData

Logical. If TRUE (default), saves chunk data as .RData.

ReturnData

Logical. If TRUE, returns chunk data; otherwise, returns invisible(NULL). Default: FALSE.

Species

Character. Species name for processing.

Verbose

Logical. If TRUE (default), prints progress messages.

PlotTag

Character. Tag for plot titles.

Note

Relies on a static RDS file listing IAS species, GBIF keys, and metadata, standardized by Marina Golivets (Feb 2024).

Function details

  • GBIF_Process(): Orchestrates GBIF data requests, downloads, processing, and mapping. Saves RData, Excel, and JPEG summary files.

  • GBIF_Check(): Verifies GBIF credentials in environment or .Renviron. Returns TRUE if valid, else FALSE.

  • GBIF_Download(): Requests and downloads GBIF data (if Download = TRUE) using the specified criteria (taxa, coordinates, time period, and boundaries), splits the data into small chunks (if SplitChunks = TRUE), and saves metadata. Returns invisible(NULL).

  • GBIF_ReadChunk(): Filters chunk data (e.g., by spatial uncertainty, collection year, coordinate precision, and taxonomic rank), selects relevant columns, and saves the result as .RData (if SaveRData = TRUE) or returns it (if ReturnData = TRUE). Skips the chunk if its .RData file exists and Overwrite = FALSE.

  • GBIF_SpData(): Converts species-specific data to sf and raster formats, generating distribution maps.
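The functions above could be combined along the following lines. This is a sketch only: it assumes the package that provides these functions is already attached and that ".env" and ".Renviron" exist in the working directory. The guard with exists() is added here so the snippet does nothing when the functions are unavailable.

```r
# Check that the documented functions are available in this session
workflow_available <- exists("GBIF_Check", mode = "function") &&
  exists("GBIF_Process", mode = "function")

if (workflow_available) {
  # Verify credentials first; GBIF_Check() returns TRUE if valid
  if (isTRUE(GBIF_Check(Renviron = ".Renviron"))) {
    # Run the full workflow with the documented default arguments
    GBIF_Process(
      EnvFile = ".env",
      Renviron = ".Renviron",
      NCores = 6L,
      Overwrite = FALSE
    )
  } else {
    stop("GBIF credentials not found; check the .Renviron file.")
  }
}
```

Because Overwrite = FALSE skips chunks that already have .RData output, rerunning the same call after a failure should pick up where the previous run stopped.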

Author

Ahmed El-Gabbas