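This function is a wrapper for the feature_selection function. It runs the specified methods (glmnet, xgboost, ranger) in a single call and can optionally combine their feature importances.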

wrapper_feat_select(
  X,
  y,
  params_glmnet = NULL,
  params_xgboost = NULL,
  params_ranger = NULL,
  xgb_sort = NULL,
  CV_folds = 5,
  stratified_regr = FALSE,
  scale_coefs_glmnet = FALSE,
  cores_glmnet = NULL,
  params_features = NULL,
  verbose = FALSE
)

Arguments

X

a sparse Matrix, a matrix or a data frame

y

a vector representing the response variable (one value per row of X)

params_glmnet

a list of parameters for the glmnet model

params_xgboost

a list of parameters for the xgboost model

params_ranger

a list of parameters for the ranger model

xgb_sort

sort the xgboost features by "Gain", "Cover" or "Frequency" (defaults to "Frequency")

CV_folds

the number of folds to use for cross-validation (defaults to 5)

stratified_regr

a boolean determining if the folds in regression should be stratified

scale_coefs_glmnet

if TRUE, less important coefficients will be smaller than the more important ones (ranking/plotting by magnitude possible)

cores_glmnet

an integer determining the number of cores to register in glmnet

params_features

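a list of parameters for the wrapper function: keep_number_feat (limits the number of returned features) and union (if TRUE, the average importance of all methods is also returned)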

verbose

if TRUE, information is printed while the function runs

Value

a list containing the important features of each method. If union in the params_features list is TRUE, the list also contains the average importance of all methods.

Details

This function returns the feature importance of each specified method. If union in the params_features list is TRUE, it also returns the average importance of all methods. The user can also limit the number of returned features with the keep_number_feat parameter of the params_features list.
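The idea behind union and keep_number_feat can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the importance scores are made-up numbers, and the rescaling step and the helper name rescale are hypothetical choices made here so that the three methods are comparable before averaging.

```r
# Hypothetical per-method importance scores (made-up numbers, not package output)
imp_glmnet  <- c(Petal.Length = 0.90, Sepal.Width = 0.40, Petal.Width = 0.10)
imp_xgboost <- c(Petal.Length = 0.70, Petal.Width = 0.20, Sepal.Width = 0.10)
imp_ranger  <- c(Petal.Length = 0.60, Sepal.Width = 0.25, Petal.Width = 0.15)

# Assumption: rescale each method's scores to sum to 1 so they are comparable
rescale <- function(v) v / sum(v)

methods  <- list(glmnet = imp_glmnet, xgboost = imp_xgboost, ranger = imp_ranger)
features <- sort(unique(unlist(lapply(methods, names))))

# One row per feature, one column per method, aligned by feature name
imp_matrix <- sapply(methods, function(v) rescale(v)[features])
rownames(imp_matrix) <- features

# "union": average importance of every feature across the methods
union_importance <- sort(rowMeans(imp_matrix), decreasing = TRUE)

# "keep_number_feat": keep only the top-ranked features
keep_number_feat <- 2
top_features <- head(union_importance, keep_number_feat)
print(top_features)
```

With these illustrative scores, Petal.Length ranks first and Sepal.Width second. The exact combination rule used by wrapper_feat_select may differ.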

Examples

if (FALSE) {

#...........
# regression
#...........

data(iris)

X = iris[, -5]
y = X[, 1]
X = X[, -1]

params_glmnet = list(alpha = 1, family = 'gaussian', nfolds = 3, parallel = TRUE)

params_xgboost = list( params = list("objective" = "reg:linear", "bst:eta" = 0.01,
                                     "subsample" = 0.65, "max_depth" = 5,
                                     "colsample_bytree" = 0.65, "nthread" = 2),
                       nrounds = 100, print.every.n = 50, verbose = 0, maximize = FALSE)

params_ranger = list(probability = FALSE, num.trees = 100, verbose = TRUE,
                     classification = FALSE, mtry = 3, min.node.size = 10,
                     num.threads = 2, importance = 'permutation')

params_features = list(keep_number_feat = NULL, union = TRUE)

feat = wrapper_feat_select(X, y, params_glmnet = params_glmnet,
                           params_xgboost = params_xgboost,
                           params_ranger = params_ranger, xgb_sort = NULL,
                           CV_folds = 10, stratified_regr = FALSE,
                           cores_glmnet = 2, params_features = params_features)
}