Wrapper to generate multi-response predictive models.

mrIMLpredicts(
  X,
  Y,
  Model,
  balance_data = "no",
  mode = "regression",
  transformY = "log",
  dummy = FALSE,
  tune_grid_size = 10,
  k = 10,
  seed = sample.int(1e+08, 1)
)

Arguments

X

A data frame of predictor (feature) data.

Y

A data frame of response variable data (species, OTUs, SNPs, etc.).

Model

A list. Can be any model specification from the tidymodels package. See examples.

balance_data

A character: 'up', 'down' or 'no'. Sets how class imbalance is handled for each response variable; see Details.

mode

A character: 'classification' or 'regression', i.e., is the model being fitted a regression or a classification model?

dummy

A logical: TRUE or FALSE.

tune_grid_size

A numeric. Sets the grid size for hyperparameter tuning. Larger grid sizes increase computational time.

k

A numeric. Sets the number of folds used in k-fold cross-validation. The default is 10.
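To make the role of k concrete, the following base R sketch assigns rows to k folds and loops over them. It illustrates k-fold splitting in general and is not the package's own resampling code; `assign_folds` and the row count of 95 are hypothetical.

```r
# Hypothetical helper (not part of mrIML): assign n rows to k folds.
assign_folds <- function(n, k = 10, seed = 1) {
  set.seed(seed)
  # Repeat the labels 1..k up to length n, then shuffle them across rows.
  sample(rep(seq_len(k), length.out = n))
}

k <- 10
folds <- assign_folds(n = 95, k = k)

# Each fold is held out once for assessment; the other k - 1 folds are
# used for fitting.
for (i in seq_len(k)) {
  assess_rows <- which(folds == i)
  fit_rows <- which(folds != i)
  # A model would be fitted on fit_rows and evaluated on assess_rows here.
}

# Fold sizes differ by at most one row.
table(folds)
```

A larger k leaves more rows for fitting in each split but requires more model fits, which compounds with tune_grid_size.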

Details

This function produces yhats that are used in all subsequent functions. It fits a separate classification or regression model for each response variable in a data set. Rows in X (features) must have the same id (host/site/population) as the corresponding rows in Y. Class imbalance can be a real issue for classification analyses; it can be addressed for each response variable using 'up' (upsampling using ROSE bootstrapping), 'down' (downsampling) or 'no' (no balancing of classes).
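The idea of fitting one model per response, with optional downsampling, can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the function's implementation: it uses simulated data, a plain logistic regression in place of a tidymodels specification, and covers only the 'down' and 'no' options. The names `downsample`, `fit_one`, `preds` and the simulated columns are all hypothetical.

```r
set.seed(42)
n <- 200

# Simulated stand-ins for X (features) and Y (responses).
# Row i of X and row i of Y describe the same host/site/population.
X <- data.frame(temp = rnorm(n), rain = rnorm(n))
Y <- data.frame(
  sp1 = rbinom(n, 1, plogis(1.5 * X$temp)),
  sp2 = rbinom(n, 1, plogis(-2 + X$rain))  # rarer response, so imbalanced
)

# Hypothetical helper: keep an equal number of rows from each class by
# randomly dropping rows from the larger class.
downsample <- function(y) {
  idx0 <- which(y == 0)
  idx1 <- which(y == 1)
  m <- min(length(idx0), length(idx1))
  c(idx0[sample.int(length(idx0), m)], idx1[sample.int(length(idx1), m)])
}

# Hypothetical helper: fit one model for a single response column.
fit_one <- function(y, X, balance = "no") {
  rows <- if (balance == "down") downsample(y) else seq_along(y)
  dat <- cbind(y = y[rows], X[rows, , drop = FALSE])
  model <- glm(y ~ ., data = dat, family = binomial())
  # Predict for every row of X, not only the rows used for fitting.
  predict(model, newdata = X, type = "response")
}

# One independent model per column of Y.
preds <- lapply(Y, fit_one, X = X, balance = "down")
str(preds)
```

Because each response is handled independently, balancing is applied per column, so a common response and a rare one in the same Y end up with different training subsets.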