This function is optimized for speed using parallel processing and, optionally, TensorFlow for matrix operations. It is adapted from Hmsc::predictLatentFactor and gives results equivalent to the original function when predictMean = TRUE.

Usage

Predict_LF(
  unitsPred,
  modelunits,
  postEta,
  postAlpha,
  LF_rL,
  LF_NCores = 8L,
  Temp_Dir = "TEMP_Pred",
  LF_Temp_Cleanup = TRUE,
  Model_Name = NULL,
  UseTF = TRUE,
  TF_Environ = NULL,
  TF_use_single = FALSE,
  LF_OutFile = NULL,
  LF_Return = FALSE,
  LF_Check = FALSE,
  LF_Commands_Only = FALSE,
  solve_max_attempts = 5L,
  solve_chunk_size = 50L,
  Verbose = TRUE
)

Arguments

unitsPred

a factor vector with random level units for which predictions are to be made

modelunits

a factor vector with random level units that are conditioned on

postEta

Character string specifying the path to a file containing postEta, a list of samples of random factors at the conditioned units

postAlpha

a list containing samples of range (lengthscale) parameters for latent factors

LF_rL

a HmscRandomLevel-class object that describes the random level structure

LF_NCores

Integer specifying the number of cores to use for parallel processing. Defaults to 8.

Temp_Dir

Character string specifying the path for temporary storage of intermediate files.

LF_Temp_Cleanup

Logical indicating whether to delete temporary files in the Temp_Dir after finishing the LF predictions.

Model_Name

Character string used as a prefix for temporary file names. Defaults to NULL, in which case no prefix is used.

UseTF

Logical indicating whether to use TensorFlow for calculations. Defaults to TRUE.

TF_Environ

Character string specifying the path to the Python environment. Defaults to NULL. This argument is required if UseTF is TRUE.

TF_use_single

Logical indicating whether to use single precision for the TF calculations. Defaults to FALSE.

LF_OutFile

Character string specifying the path to save the outputs. If NULL (default), the predicted latent factors are not saved to a file. This should end with either *.qs2 or *.RData.

LF_Return

Logical indicating whether the output should be returned. Defaults to FALSE. If LF_OutFile is NULL, this must be set to TRUE, because a result that is not saved to a file has to be returned.

LF_Check

Logical. If TRUE, the function checks if the output files are already created and valid. If FALSE, the function will only check if the files exist without checking their integrity. Default is FALSE.

LF_Commands_Only

Logical. If TRUE, returns the command to run the Python script. Default is FALSE.

solve_max_attempts

Numeric (optional). Maximum number of attempts to run the solve and crossprod functions. Default is 5.

solve_chunk_size

Numeric. Chunk size for the solve_and_multiply Python function. Default is 50.

Verbose

Logical. If TRUE, detailed output is printed. Default is TRUE.

Details

The function is expected to be faster than the original function in the Hmsc package, especially when using TensorFlow for the calculations and when running in parallel.

The main differences are that this function:

  • allows parallel processing (LF_NCores argument);

  • can optionally use TensorFlow (UseTF argument) to speed up matrix calculations, particularly on a GPU. This requires the Python modules numpy, os, tensorflow, rdata, xarray, and pandas, and the TF_Environ argument must be set to the path of a Python environment with TensorFlow installed;

  • uses R / C++ code for the calculations if UseTF is set to FALSE;

  • calculates the D11 and D12 matrices only once, saves them to disk, and loads them when needed.
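The last point can be illustrated with a minimal base-R sketch. It is not the package's implementation: it assumes an exponential covariance exp(-D / alpha) with alpha > 0 and a conditional mean of the form crossprod(K12, solve(K11, eta)), and all object names (xy_model, xy_pred, eta, alpha, pred_mean) are hypothetical. The distance matrices D11 (among conditioned units) and D12 (between conditioned and prediction units) are built once and reused for every latent factor.

```r
# Illustrative sketch only (base R); names and data are hypothetical.
set.seed(1)

# Hypothetical coordinates: 6 conditioned (model) units, 3 prediction units
xy_model <- matrix(runif(12), ncol = 2)
xy_pred  <- matrix(runif(6), ncol = 2)
n_model  <- nrow(xy_model)

# Distance matrices are computed once, outside the loop over latent factors
D11   <- as.matrix(dist(xy_model))                     # model x model
D_all <- as.matrix(dist(rbind(xy_model, xy_pred)))
D12   <- D_all[seq_len(n_model), -seq_len(n_model), drop = FALSE]  # model x pred

# Hypothetical single posterior sample: eta (units x factors), one alpha per factor
eta   <- matrix(rnorm(12), ncol = 2)
alpha <- c(0.3, 0.8)  # assumed strictly positive in this sketch

# Conditional mean of each latent factor at the prediction units,
# reusing D11 and D12 for every factor
pred_mean <- vapply(
  seq_along(alpha),
  function(h) {
    K11 <- exp(-D11 / alpha[h])
    K12 <- exp(-D12 / alpha[h])
    drop(crossprod(K12, solve(K11, eta[, h])))
  },
  numeric(nrow(xy_pred))
)

dim(pred_mean)  # prediction units x latent factors
```

Only the covariance matrices K11 and K12 depend on alpha, so the distance computation, which scales with the number of units, does not need to be repeated per factor or per posterior sample.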

Author

This script was adapted from the Hmsc::predictLatentFactor function in the Hmsc package.