This function extracts and processes vascular plant data from the European Alien Species Information Network (EASIN). It extracts plant species data from the EASIN database, matches them against a pre-processed, standardized list of taxa, and prepares species-specific and summary maps. It also supports downloading data in chunks, handling pagination, and retrying failed downloads.
Usage
EASIN_Process(
ExtractTaxa = TRUE,
ExtractData = TRUE,
NDownTries = 10,
NCores = 6,
SleepTime = 10,
NSearch = 1000,
FromHPC = TRUE,
EnvFile = ".env",
DeleteChunks = TRUE,
StartYear = 1981,
Plot = TRUE
)
Arguments
- ExtractTaxa
Logical. If TRUE, the function will extract the EASIN taxonomy list using EASIN_Taxonomy. Default is TRUE.
- ExtractData
Logical. If TRUE, the function will download EASIN species occurrence data using EASIN_Down. Default is TRUE.
- NDownTries
Integer. Number of attempts to retry downloading data in case of failure. Default is 10.
- NCores
Integer. Number of CPU cores to use for parallel processing. The maximum allowed number of cores is 8. Default is 6.
- SleepTime
Numeric. Time in seconds to wait between download attempts and between chunks. Default is 10 seconds.
- NSearch
Integer. Number of observations or species to download during EASIN taxonomy or data extraction, respectively. Default is 1000.
- FromHPC
Logical. Indicates whether the work is being done from HPC, so that file paths are adjusted accordingly. Default is TRUE.
- EnvFile
Character. The path to the environment file containing variables required by the function. Default is ".env".
- DeleteChunks
Logical. If TRUE, the function will delete intermediate files after processing. Default is TRUE.
- StartYear
Integer. Minimum year for filtering species occurrence data. Records before this year will be excluded. Default is 1981, which matches the year range of the CHELSA current climate data.
- Plot
Logical. If TRUE, the function will generate summary plots of the processed data using EASIN_Plot. Default is TRUE.
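The interaction between NDownTries and SleepTime can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper retry_download() and the toy flaky_fetch() function are hypothetical names introduced here only to show the general retry-then-wait pattern.

```r
# Hypothetical helper (not part of the package): call `fetch` up to
# `n_tries` times, waiting `sleep_time` seconds between failed attempts.
retry_download <- function(fetch, n_tries = 10, sleep_time = 10) {
  for (attempt in seq_len(n_tries)) {
    # Convert an error into NULL so the loop can try again
    result <- tryCatch(fetch(), error = function(e) NULL)
    if (!is.null(result)) {
      return(result)
    }
    # Wait before the next attempt, but not after the last one
    if (attempt < n_tries) {
      Sys.sleep(sleep_time)
    }
  }
  stop("Download failed after ", n_tries, " attempts")
}

# Toy fetcher standing in for a real request: fails twice, then succeeds
calls <- 0
flaky_fetch <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("temporary failure")
  "payload"
}

# sleep_time = 0 keeps the demonstration fast
result <- retry_download(flaky_fetch, n_tries = 5, sleep_time = 0)
```

In this sketch a larger n_tries makes the process more tolerant of transient server errors, while a larger sleep_time reduces the request rate at the cost of a longer total run time.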
Value
The function returns NULL invisibly after completing the data extraction, processing, and optional plotting. It saves multiple outputs to disk, including the extracted and processed EASIN data, species-specific data files, and summary statistics. The main outputs are:
- EASIN_Taxa.RData: A dataset containing the standardized EASIN taxonomy.
- EASIN_Data.RData: A cleaned and merged dataset of species occurrence data.
- EASIN_NObs.RData: A rasterized dataset showing the number of observations per grid cell.
- EASIN_NObs_PerPartner.RData: A rasterized dataset showing the number of observations per data partner.
- EASIN_NSp.RData: A rasterized dataset showing the number of species per grid cell.
- EASIN_NSp_PerPartner.RData: A rasterized dataset showing the number of species per data partner.
- Species-specific data files, saved as both sf and raster objects.
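Because the function returns NULL and writes its results to disk, the outputs have to be read back from the .RData files. The sketch below shows one way to do that in base R. It assumes each file holds a single object, which the documentation does not confirm; load_rdata() is a hypothetical helper, and a throwaway file in the temporary directory stands in for a real output such as EASIN_Taxa.RData.

```r
# Hypothetical helper: load an .RData file into a private environment and
# return its first object, so nothing in the workspace is overwritten.
load_rdata <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)
  get(obj_names[1], envir = env)
}

# Demonstration file standing in for a real output (placeholder content)
demo_taxa <- data.frame(Name = c("Species A", "Species B"))
demo_path <- file.path(tempdir(), "EASIN_Taxa_demo.RData")
save(demo_taxa, file = demo_path)

taxa <- load_rdata(demo_path)
```

Loading into a separate environment avoids depending on the object name stored inside the file, which is not stated in this documentation.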
Note
The function assumes that the necessary environment variables are correctly set up in the specified .env file. Users should ensure that all required files and directories are accessible before running the function.
The function skips processing of species data or data chunks that already exist in the raw data directory and reuses them instead. The function assumes that the contents of this folder should be removed as part of the data workflow. Reusing existing data avoids re-downloading data that are already available from the EASIN server.
This function depends on the following functions: EASIN_Taxonomy for getting the most recent EASIN taxonomy; EASIN_Down for processing EASIN dataset; and EASIN_Plot for plotting.
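The skip-and-reuse behaviour described in this note can be sketched as follows. This is an illustration under stated assumptions, not the package's code: process_chunk(), fake_fetch(), the chunk file naming, and the use of .rds files are all hypothetical.

```r
# Hypothetical helper: fetch a chunk only if its file is not already on disk
process_chunk <- function(chunk_id, raw_dir, fetch) {
  chunk_file <- file.path(raw_dir, paste0("chunk_", chunk_id, ".rds"))
  if (file.exists(chunk_file)) {
    return("reused")
  }
  saveRDS(fetch(chunk_id), chunk_file)
  "downloaded"
}

# Temporary directory standing in for the raw data directory
raw_dir <- file.path(tempdir(), "easin_raw_demo")
dir.create(raw_dir, showWarnings = FALSE)
unlink(list.files(raw_dir, full.names = TRUE))  # start from an empty folder

# Toy fetcher standing in for a real request to the server
fake_fetch <- function(id) data.frame(id = id)

# The first pass writes every chunk; the second pass finds them all on disk
first_run <- vapply(1:3, process_chunk, character(1),
                    raw_dir = raw_dir, fetch = fake_fetch)
second_run <- vapply(1:3, process_chunk, character(1),
                     raw_dir = raw_dir, fetch = fake_fetch)
```

With this pattern an interrupted run can be resumed without repeating completed downloads, which is also why stale files in the directory need to be cleared when fresh data are wanted.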