Process EASIN data for the IAS-pDT
Source: R/DWF_EASIN_Process.R, R/DWF_EASIN_Taxonomy.R, R/DWF_EASIN_Down.R, and 1 more
EASIN_data.Rd
Extracts, processes, and visualizes data from the European Alien Species Information Network (EASIN) for the Invasive Alien Species prototype Digital Twin (IAS-pDT). Manages taxonomy, occurrence data, and plots, handling API pagination and server limits. Orchestrated by EASIN_Process() with helpers EASIN_Taxonomy(), EASIN_Down(), and EASIN_Plot().
Usage
EASIN_Process(
  ExtractTaxa = TRUE,
  ExtractData = TRUE,
  NDownTries = 10L,
  NCores = 6L,
  SleepTime = 10L,
  NSearch = 1000L,
  EnvFile = ".env",
  DeleteChunks = TRUE,
  StartYear = 1981L,
  Plot = TRUE
)
EASIN_Taxonomy(
  EnvFile = ".env",
  Kingdom = "Plantae",
  Phylum = "Tracheophyta",
  NSearch = 100
)
EASIN_Down(
  SpKey,
  Timeout = 200,
  Verbose = FALSE,
  EnvFile = ".env",
  NSearch = 1000,
  Attempts = 10,
  SleepTime = 5,
  DeleteChunks = TRUE,
  ReturnData = FALSE
)
EASIN_Plot(EnvFile = ".env")
Arguments
- ExtractTaxa
Logical. If TRUE, extracts taxonomy using EASIN_Taxonomy(). Default: TRUE.
- ExtractData
Logical. If TRUE, downloads occurrence data with EASIN_Down(). Default: TRUE.
- NDownTries
Integer. Retry attempts for downloads. Default: 10.
- NCores
Integer. Number of CPU cores to use for parallel processing. Default: 6. At most 8 cores are allowed.
- SleepTime
Integer. Number of seconds to pause between data retrieval requests to avoid overloading the server. Default: 10 seconds in EASIN_Process() and 5 seconds in EASIN_Down().
- NSearch
Integer. Number of records to attempt to retrieve per request. Default: 1000, which is the current maximum allowed by the API (EASIN_Taxonomy() defaults to 100).
- EnvFile
Character. Path to the environment file containing paths to data sources. Defaults to .env.
- DeleteChunks
Logical. Whether to delete temporary files for data chunks from the FileParts subdirectory. Defaults to TRUE.
- StartYear
Integer. Earliest year for occurrence data (excludes earlier records). Default: 1981 (aligned with CHELSA climate data).
- Plot
Logical. If TRUE, generates plots via EASIN_Plot(). Default: TRUE.
- Kingdom
Character. Taxonomic kingdom to query. Default: "Plantae".
- Phylum
Character. Taxonomic phylum within kingdom. Default: "Tracheophyta".
- SpKey
Character. EASIN taxon ID for which data is to be retrieved. This parameter cannot be NULL.
- Timeout
Integer. Download timeout in seconds. Default: 200.
- Verbose
Logical. If TRUE, prints progress messages. Default: FALSE.
- Attempts
Integer. Max download attempts per chunk. Default: 10.
- ReturnData
Logical. If TRUE, returns data as a dataframe; otherwise, saves to disk and returns invisible(NULL). Default: FALSE.
Note
Uses a static RDS file with EASIN-GBIF taxonomic standardization, prepared by Marina Golivets (Feb 2024).
Functions details
- EASIN_Process(): Orchestrates taxonomy extraction, data downloads, and plotting for EASIN species data.
- EASIN_Taxonomy(): Fetches taxonomy data in chunks via the EASIN API, filtered by kingdom and phylum. Returns a tibble.
- EASIN_Down(): Downloads occurrence data for a given EASIN ID, handling pagination and pauses. Returns a dataframe if ReturnData = TRUE, else invisible(NULL).
- EASIN_Plot(): Creates summary plots (observations count, species count, distribution by partner) as JPEGs. Returns invisible(NULL).
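The paginated download with retries and pauses described for EASIN_Down() can be illustrated with a minimal base R sketch. This is not the package's implementation: mock_fetch() and download_paged() are hypothetical names introduced here, and the mock stands in for a real API request so the example runs offline. The arguments n_search, attempts, and sleep_time are assumed to play roles similar to NSearch, Attempts, and SleepTime.

```r
# Hypothetical stand-in for one API request: returns up to `take` records
# after skipping `skip` records. A real version would query the EASIN API.
mock_fetch <- function(skip, take, total = 2350L) {
  if (skip >= total) {
    return(data.frame(id = integer(0)))
  }
  data.frame(id = seq(skip + 1L, min(skip + take, total)))
}

# Request pages of `n_search` records until a short page is returned.
# Each failed request is retried up to `attempts` times, with a pause of
# `sleep_time` seconds between requests.
download_paged <- function(fetch, n_search = 1000L, attempts = 10L,
                           sleep_time = 0) {
  chunks <- list()
  skip <- 0L
  repeat {
    chunk <- NULL
    for (try_id in seq_len(attempts)) {
      # Convert an error into NULL so the loop can retry
      chunk <- tryCatch(fetch(skip, n_search), error = function(e) NULL)
      if (!is.null(chunk)) break
      Sys.sleep(sleep_time)
    }
    if (is.null(chunk)) {
      stop("Request failed after ", attempts, " attempts at offset ", skip)
    }
    chunks[[length(chunks) + 1L]] <- chunk
    # A page shorter than n_search means the last records were reached
    if (nrow(chunk) < n_search) break
    skip <- skip + n_search
    Sys.sleep(sleep_time)
  }
  do.call(rbind, chunks)
}

occurrences <- download_paged(mock_fetch, n_search = 1000L)
nrow(occurrences)
```

Stopping on a short page avoids needing the total record count in advance, at the cost of one extra request when the total is an exact multiple of the page size.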