The Mod_SLURM function generates SLURM job submission scripts for fitting Hmsc-HPC models in an HPC environment. Additionally, Mod_SLURM_Refit creates SLURM scripts for refitting models that failed or were not previously fitted.

Usage

Mod_SLURM(
  ModelDir = NULL,
  JobName = NULL,
  CatJobInfo = TRUE,
  ntasks = 1L,
  CpusPerTask = 1L,
  GpusPerNode = 1L,
  MemPerCpu = NULL,
  Time = NULL,
  Partition = "small-g",
  EnvFile = ".env",
  Path_Hmsc = NULL,
  Command_Prefix = "Commands2Fit",
  SLURM_Prefix = "Bash_Fit",
  Path_SLURM_Out = NULL
)

Mod_SLURM_Refit(
  ModelDir = NULL,
  NumArrayJobs = 210L,
  JobName = NULL,
  MemPerCpu = NULL,
  Time = NULL,
  Partition = "small-g",
  EnvFile = ".env",
  CatJobInfo = TRUE,
  ntasks = 1L,
  CpusPerTask = 1L,
  GpusPerNode = 1L,
  PrepSLURM = TRUE,
  Path_Hmsc = NULL,
  Refit_Prefix = "Commands2Refit",
  SLURM_Prefix = "Bash_Refit"
)
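
A minimal sketch of how the two functions might be called, using only the arguments listed above. All paths and job names are hypothetical placeholders. The `exists()` guards are there only so the snippet runs harmlessly when the functions are not loaded in the session.

```r
# All paths and job names below are hypothetical placeholders.
fit_args <- list(
  ModelDir  = "path/to/model_root",
  JobName   = "hmsc_fit",
  MemPerCpu = "32G",       # required: no usable default
  Time      = "01:00:00",  # required: no usable default
  Path_Hmsc = "path/to/hmsc_hpc"
)

# Refitting reuses the same settings with a different job name
refit_args <- utils::modifyList(
  fit_args,
  list(JobName = "hmsc_refit", NumArrayJobs = 210L)
)

# Run only when the functions are available in the current session
if (exists("Mod_SLURM", mode = "function")) {
  do.call(Mod_SLURM, fit_args)
}
if (exists("Mod_SLURM_Refit", mode = "function")) {
  do.call(Mod_SLURM_Refit, refit_args)
}
```

Arguments left out of the lists keep the defaults shown in the usage above.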

Arguments

ModelDir

Character. Path to the root directory of the fitted model.

JobName

Character. Name of the submitted job(s).

CatJobInfo

Logical. If TRUE, additional bash commands are included to print job-related information. Default: TRUE.

ntasks

Integer. Number of tasks to allocate for the job (#SBATCH --ntasks). Default: 1.

CpusPerTask

Integer. Number of CPU cores allocated per task (#SBATCH --cpus-per-task). Default: 1.

GpusPerNode

Integer. Number of GPUs requested per node (#SBATCH --gpus-per-node). Default: 1.

MemPerCpu

Character. Memory allocation per CPU core. Example: "32G" for 32 gigabytes. Required — if not provided, the function throws an error.

Time

Character. Maximum allowed runtime for the job. Example: "01:00:00" for one hour. Required — if not provided, the function throws an error.

Partition

Character. Name of the SLURM partition to submit the job to. Default: "small-g", for running the array jobs on the GPU.
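
The resource arguments above correspond to `#SBATCH` directives in the generated script. The sketch below shows one plausible mapping in base R. The helper `sbatch_header()` is hypothetical and not part of the package. The `--ntasks`, `--cpus-per-task`, and `--gpus-per-node` flags come from the argument descriptions, while `--job-name`, `--mem-per-cpu`, `--time`, and `--partition` are standard SLURM options assumed here for illustration. The actual script written by Mod_SLURM may differ.

```r
# Hypothetical helper for illustration only; not part of the package.
sbatch_header <- function(JobName, MemPerCpu = NULL, Time = NULL,
                          ntasks = 1L, CpusPerTask = 1L, GpusPerNode = 1L,
                          Partition = "small-g") {
  # MemPerCpu and Time have no usable default, so fail early
  if (is.null(MemPerCpu) || is.null(Time)) {
    stop("MemPerCpu and Time must both be provided", call. = FALSE)
  }
  c(
    "#!/bin/bash",
    paste0("#SBATCH --job-name=", JobName),
    paste0("#SBATCH --ntasks=", ntasks),
    paste0("#SBATCH --cpus-per-task=", CpusPerTask),
    paste0("#SBATCH --gpus-per-node=", GpusPerNode),
    paste0("#SBATCH --mem-per-cpu=", MemPerCpu),
    paste0("#SBATCH --time=", Time),
    paste0("#SBATCH --partition=", Partition)
  )
}

header <- sbatch_header("hmsc_fit", MemPerCpu = "32G", Time = "01:00:00")
cat(header, sep = "\n")
```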

EnvFile

Character. Path to the environment file containing paths to data sources. Default: ".env".

Path_Hmsc

Character. Path to the Hmsc-HPC installation.

Command_Prefix

Character. Prefix for the bash commands used in job execution. Default: "Commands2Fit".

SLURM_Prefix

Character. Prefix for the generated SLURM script filenames. Default: "Bash_Fit" for Mod_SLURM and "Bash_Refit" for Mod_SLURM_Refit.

Path_SLURM_Out

Character. Directory where SLURM script(s) will be saved. If NULL (default), the function derives the path from ModelDir.

NumArrayJobs

Integer. Number of jobs per SLURM script file. On the LUMI HPC system, the small-g partition limits each user to 210 submitted jobs, so this argument is used to split the jobs across multiple SLURM scripts when needed. Default: 210. See the LUMI documentation for more details.
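
The splitting described for NumArrayJobs can be sketched in base R as follows. The command strings and the total of 500 commands are made up for illustration, and the way Mod_SLURM_Refit performs the split internally is an assumption.

```r
# Hypothetical list of 500 refit commands, for illustration only
NumArrayJobs <- 210L
commands <- paste0("command_", seq_len(500L))

# Assign each command to a chunk holding at most NumArrayJobs commands
chunk_id <- ceiling(seq_along(commands) / NumArrayJobs)
chunks <- split(commands, chunk_id)

# Each chunk would correspond to one SLURM script file
length(chunks)
lengths(chunks)
```

With 500 commands and a limit of 210, this yields three chunks, none larger than the limit.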

PrepSLURM

Logical. Whether to prepare SLURM command files. If TRUE (default), the SLURM commands will be saved to disk using the Mod_SLURM function.

Refit_Prefix

Character. Prefix for files containing commands to refit failed or incomplete models. Default: "Commands2Refit".

Value

These functions do not return a value. Instead, they generate SLURM script files and write them to disk: Mod_SLURM for model fitting and Mod_SLURM_Refit for refitting.

Author

Ahmed El-Gabbas