Post-processing of fitted models within the IAS-pDT
workflow is conducted across multiple steps, leveraging both CPU and GPU
computations to optimize performance and address memory constraints.
Step 1: CPU
The Mod_Postprocess_1_CPU() function initiates the post-processing phase for each habitat type, automating the following tasks:
Mod_SLURM_Refit(): checks for unsuccessful model fits
Mod_Merge_Chains(): merges MCMC chains and saves the fitted model and coda objects to .qs2 or .RData files
Convergence_Plot_All(): visualizes convergence of the rho, alpha, omega, and beta parameters across all model variants. Unnecessary for a single variant, this function compares convergence across models with varying thinning values, with and without phylogenetic relationships, or with different GPP knot distances
Convergence_Plot(): visualizes convergence of the rho, alpha, omega, and beta parameters for a selected model, offering a more detailed view than Convergence_Plot_All()
PlotGelman(): visualizes Gelman-Rubin-Brooks diagnostics for the selected model
Mod_Summary(): extracts and saves a summary of the model
Mod_Heatmap_Beta(): generates heatmaps of the beta parameters
Mod_Heatmap_Omega(): generates heatmaps of the omega parameter (residual species associations)
Mod_CV_Fit(): prepares input data for spatial-block cross-validation model fitting
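As background for the convergence checks above: the Gelman-Rubin (potential scale reduction) statistic compares between-chain and within-chain variance and should approach 1 as chains converge. The sketch below is illustrative only; the workflow itself works on the coda objects saved by Mod_Merge_Chains(), and the function name gelman_rubin is a hypothetical helper, not part of the package.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one parameter.

    chains: array of shape (n_chains, n_samples) of posterior draws.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    # Between-chain variance component
    B = n * chain_means.var(ddof=1)
    # Mean within-chain variance
    W = chains.var(axis=1, ddof=1).mean()
    # Pooled posterior-variance estimate
    var_hat = (n - 1) / n * W + B / n
    return float(np.sqrt(var_hat / W))
```

Values close to 1 indicate convergence; chains stuck at different locations inflate B and push R-hat well above 1.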
Previous attempts to prepare response curve data, predict at new sites, and compute variance partitioning using R on CPUs (UFZ Windows server and LUMI HPC) were hindered by memory limitations. Consequently, these tasks are offloaded to GPU-based computations using Python and TensorFlow. The Mod_Postprocess_1_CPU() function invokes the following sub-functions to generate commands for GPU execution:
Prepare commands for GPU computations
Predicting latent factors:
Latent factor predictions for response curves and new sampling units are executed via a TensorFlow script located at inst/crossprod_solve.py. For these tasks, the respective R functions export numerous .qs2 and .feather data files to the TEMP_Pred subdirectory, which are essential for the GPU computations. They also generate execution commands saved as LF_RC_Commands_.txt (for response curves) and LF_NewSites_Commands_.txt (for new sites).
Example LF_RC_Commands.txt file:
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch001.feather --path_out RC_c_etaPred_ch001.feather --denom 1200000 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch001.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch002.feather --path_out RC_c_etaPred_ch002.feather --denom 1188081 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch002.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch003.feather --path_out RC_c_etaPred_ch003.feather --denom 1176162 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch003.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch004.feather --path_out RC_c_etaPred_ch004.feather --denom 1164242 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch004.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch005.feather --path_out RC_c_etaPred_ch005.feather --denom 1200000 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch005.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch006.feather --path_out RC_c_etaPred_ch006.feather --denom 413333 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch006.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch007.feather --path_out RC_c_etaPred_ch007.feather --denom 1188081 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch007.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch008.feather --path_out RC_c_etaPred_ch008.feather --denom 425253 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch008.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch009.feather --path_out RC_c_etaPred_ch009.feather --denom 1176162 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch009.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch010.feather --path_out RC_c_etaPred_ch010.feather --denom 437172 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch010.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch011.feather --path_out RC_c_etaPred_ch011.feather --denom 1200000 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch011.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch012.feather --path_out RC_c_etaPred_ch012.feather --denom 401414 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch012.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch013.feather --path_out RC_c_etaPred_ch013.feather --denom 1164242 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch013.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch014.feather --path_out RC_c_etaPred_ch014.feather --denom 449091 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch014.log 2>&1
python3 crossprod_solve.py --s1 RC_c_s1.feather --s2 RC_c_s2.feather --post_eta RC_c_postEta_ch015.feather --path_out RC_c_etaPred_ch015.feather --denom 1152323 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> RC_c_etaPred_ch015.log 2>&1
Example LF_NewSites_Commands.txt file:
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch001.feather --path_out LF_3_Test_etaPred_ch001.feather --denom 225758 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch001.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch002.feather --path_out LF_3_Test_etaPred_ch002.feather --denom 211111 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch002.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch003.feather --path_out LF_3_Test_etaPred_ch003.feather --denom 240404 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch003.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch004.feather --path_out LF_3_Test_etaPred_ch004.feather --denom 196465 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch004.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch005.feather --path_out LF_3_Test_etaPred_ch005.feather --denom 196465 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch005.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch006.feather --path_out LF_3_Test_etaPred_ch006.feather --denom 211111 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch006.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch007.feather --path_out LF_3_Test_etaPred_ch007.feather --denom 225758 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch007.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch008.feather --path_out LF_3_Test_etaPred_ch008.feather --denom 181818 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch008.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch009.feather --path_out LF_3_Test_etaPred_ch009.feather --denom 255051 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch009.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch010.feather --path_out LF_3_Test_etaPred_ch010.feather --denom 240404 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch010.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch011.feather --path_out LF_3_Test_etaPred_ch011.feather --denom 167172 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch011.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch012.feather --path_out LF_3_Test_etaPred_ch012.feather --denom 269697 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch012.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch013.feather --path_out LF_3_Test_etaPred_ch013.feather --denom 255051 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch013.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch014.feather --path_out LF_3_Test_etaPred_ch014.feather --denom 269697 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch014.log 2>&1
python3 crossprod_solve.py --s1 LF_3_Test_s1.feather --s2 LF_3_Test_s2.feather --post_eta LF_3_Test_postEta_ch015.feather --path_out LF_3_Test_etaPred_ch015.feather --denom 181818 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> LF_3_Test_etaPred_ch015.log 2>&1
Response curves
Predicting at new sites:
Predict_Maps() prepares GPU computations for new site predictions when LF_Only = TRUE and LF_Commands_Only = TRUE.
Computing variance partitioning:
Variance partitioning computations on GPUs are executed using TensorFlow scripts at inst/VP_geta.py, inst/VP_getf.py, and inst/VP_gemu.py. VarPar_Compute() exports the required files to the TEMP_VP subdirectory, including numerous .qs2 and .feather files, and generates execution commands saved as VP_A_Command.txt, VP_F_Command.txt, and VP_mu_Command.txt.
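As context for the mu computation: in HMSC, the expected species niches are obtained by multiplying the species trait matrix (Tr) by the trait-effect parameters (Gamma), per posterior sample. The sketch below assumes VP_gemu.py performs essentially this product; the matrix orientation and the helper name expected_niches are assumptions for illustration, and the authoritative computation is the one in inst/VP_gemu.py.

```python
import numpy as np

def expected_niches(tr, gamma):
    """Expected species niches Mu = Tr @ Gamma.T for one posterior sample.

    tr    : (n_species, n_traits) trait matrix
    gamma : (n_covariates, n_traits) trait-effect sample
    Returns an (n_species, n_covariates) matrix.
    NOTE: the orientation of Tr and Gamma here is an illustrative
    assumption, not taken from the VP_gemu.py source.
    """
    return np.asarray(tr, dtype=float) @ np.asarray(gamma, dtype=float).T
```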
Combining commands for GPU computations
After executing Mod_Postprocess_1_CPU() for all habitat types, the Mod_Prep_TF() function consolidates the GPU batch scripts across habitat types:
It aggregates the script files containing commands for response curves and latent factor predictions, splitting them into multiple scripts (TF_Chunk_*.txt) for batch processing, and generates a SLURM script (LF_SLURM.slurm) for latent factor predictions.
Example TF_Chunk_*.txt file:
#!/bin/bash
# Load TensorFlow module and configure environment
ml use /appl/local/csc/modulefiles
ml tensorflow
export TF_CPP_MIN_LOG_LEVEL=3
export TF_ENABLE_ONEDNN_OPTS=0
# Verify GPU availability
python3 -c "import tensorflow as tf; print(\"Num GPUs Available:\", len(tf.config.list_physical_devices(\"GPU\")))"
# 20 commands to be executed:
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch001.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch001.feather' --denom 50000 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch001.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch002.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch002.feather' --denom 1470707 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch002.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch003.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch003.feather' --denom 1485354 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch003.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch004.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch004.feather' --denom 1456061 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch004.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch005.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch005.feather' --denom 1500000 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch005.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch006.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch006.feather' --denom 1500000 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch006.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch007.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch007.feather' --denom 1426768 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch007.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch008.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch008.feather' --denom 1470707 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch008.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch009.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch009.feather' --denom 1485354 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch009.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch010.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch010.feather' --denom 1456061 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch010.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch011.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch011.feather' --denom 1470707 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch011.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch012.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch012.feather' --denom 1412121 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch012.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch013.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch013.feather' --denom 1426768 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch013.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch014.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch014.feather' --denom 1426768 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch014.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch015.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch015.feather' --denom 1441414 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch015.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch016.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch016.feather' --denom 1456061 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch016.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch017.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch017.feather' --denom 1441414 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch017.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch018.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch018.feather' --denom 1382828 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch018.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch019.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch019.feather' --denom 1485354 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch019.log 2>&1
python3 crossprod_solve.py --s1 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s1.feather' --s2 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_s2.feather' --post_eta 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_postEta_ch020.feather' --path_out 'datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch020.feather' --denom 1368182 --chunk_size 1000 --threshold_mb 2000 --solve_chunk_size 50 --verbose >> datasets/processed/model_fitting/Mod_Riv_Hab1/TEMP_Pred/LF_1_Test_etaPred_ch020.log 2>&1
Example LF_SLURM.slurm file:
#!/bin/bash
#SBATCH --job-name=PP_LF
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --account=project_465001588
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=200G
#SBATCH --gpus-per-node=1
#SBATCH --time=01:00:00
#SBATCH --partition=small-g
#SBATCH --output=datasets/processed/model_fitting/TMP/%x-%A-%a.out
#SBATCH --error=datasets/processed/model_fitting/TMP/%x-%A-%a.out
#SBATCH --array=1-186
# Define directories
OutputDir="datasets/processed/model_fitting/TF_BatchFiles"
# Find all the split files and sort them explicitly
SplitFiles=($(find "$OutputDir" -type f -name "TF_Chunk_*.txt" | sort -V))
# Check if files were found
if [ ${#SplitFiles[@]} -eq 0 ]; then
echo "Error: No files matching TF_Chunk_*.txt found in $OutputDir"
exit 1
fi
# Ensure no more than $MaxFiles files are processed
MaxFiles=186
if [ ${#SplitFiles[@]} -gt $MaxFiles ]; then
SplitFiles=("${SplitFiles[@]:0:$MaxFiles}")
echo "More than $MaxFiles files found, limiting to the first $MaxFiles files."
fi
# Get the index of the current task based on SLURM_ARRAY_TASK_ID
TaskIndex=$((SLURM_ARRAY_TASK_ID - 1))
# Validate TaskIndex
if [ $TaskIndex -ge ${#SplitFiles[@]} ] || [ $TaskIndex -lt 0 ]; then
echo "Error: TaskIndex $TaskIndex is out of range. Valid range: 0 to $((${#SplitFiles[@]} - 1))"
exit 1
fi
# Get the specific split file to process based on the job array task ID
SplitFile="${SplitFiles[$TaskIndex]}"
# Verify the selected split file
if [ -z "$SplitFile" ] || [ ! -f "$SplitFile" ]; then
echo "Error: File $SplitFile does not exist or is invalid."
exit 1
fi
# Processing file
echo "Processing file: $SplitFile"
# Run the selected split file
bash "$SplitFile"
echo End of program at `date`
It consolidates the variance partitioning command files into a single VP_Commands.txt and prepares a SLURM script (VP_SLURM.slurm) for the variance partitioning computations.
Example VP_Commands.txt file:
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_Mu.log 2>&1
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_Mu.log 2>&1
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_Mu.log 2>&1
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_Mu.log 2>&1
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_Mu.log 2>&1
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_Mu.log 2>&1
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_Mu.log 2>&1
python3 VP_gemu.py --tr datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_Tr.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_Mu.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_Mu.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_A.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_A.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_A.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_A.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_A.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_A.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_A.log 2>&1
python3 VP_geta.py --tr datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_Tr.feather --x datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_X.feather --gamma datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_Gamma.feather --output datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_A.feather --ncores 3 --chunk_size 50 >> datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_A.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab1/TEMP_VP/VP_F.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab2/TEMP_VP/VP_F.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab3/TEMP_VP/VP_F.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab4a/TEMP_VP/VP_F.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab4b/TEMP_VP/VP_F.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab10/TEMP_VP/VP_F.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab12a/TEMP_VP/VP_F.log 2>&1
python3 VP_getf.py --x datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_X.feather --beta_dir datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP --output datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_F.feather --ncores 3 >> datasets/processed/model_fitting/Mod_Q_Hab12b/TEMP_VP/VP_F.log 2>&1
Example VP_SLURM.slurm file:
#!/bin/bash
#SBATCH --job-name=VP_TF
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --account=project_465001588
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=200G
#SBATCH --gpus-per-node=1
#SBATCH --time=01:30:00
#SBATCH --partition=small-g
#SBATCH --output=datasets/processed/model_fitting/TF_postprocess/log/%x-%A-%a.out
#SBATCH --error=datasets/processed/model_fitting/TF_postprocess/log/%x-%A-%a.out
#SBATCH --array=1-24
# File containing commands to be executed
File=datasets/processed/model_fitting/VP_Commands.txt
# Load TensorFlow module and configure environment
ml use /appl/local/csc/modulefiles
ml tensorflow
export TF_CPP_MIN_LOG_LEVEL=3
export TF_ENABLE_ONEDNN_OPTS=0
# Verify GPU availability
python3 -c "import tensorflow as tf; print(\"Num GPUs Available:\", len(tf.config.list_physical_devices(\"GPU\")))"
# Run array job
head -n $SLURM_ARRAY_TASK_ID $File | tail -n 1 | bash
echo End of program at `date`
Step 2: GPU
Latent factor predictions and variance partitioning are computed on
GPUs. Batch jobs can be submitted using the sbatch
command:
sbatch datasets/processed/model_fitting/TF_postprocess/VP_SLURM.slurm
sbatch datasets/processed/model_fitting/TF_postprocess/LF_SLURM.slurm
Cross-validated models are fitted by submitting the corresponding SLURM commands (in preparation):
source datasets/processed/model_fitting/HabX/Model_Fitting_CV/CV_Bash_Fit.slurm
Step 3: CPU
The Mod_Postprocess_2_CPU() function advances the post-processing pipeline for HMSC models on the CPU, automating the following tasks:
Step 4: GPU
Predicting latent factors for cross-validated models on GPUs (in preparation).