feat: add boruta filter #163

Merged on Apr 9, 2024 (10 commits)
Changes from 9 commits
2 changes: 2 additions & 0 deletions DESCRIPTION
@@ -31,6 +31,7 @@ Imports:
paradox,
R6
Suggests:
Boruta,
care,
caret,
carSurv,
@@ -55,6 +56,7 @@ Collate:
'mlr_filters.R'
'FilterAUC.R'
'FilterAnova.R'
'FilterBoruta.R'
'FilterCMIM.R'
'FilterCarScore.R'
'FilterCarSurvScore.R'
1 change: 1 addition & 0 deletions NAMESPACE
@@ -5,6 +5,7 @@ S3method(as.data.table,Filter)
export(Filter)
export(FilterAUC)
export(FilterAnova)
export(FilterBoruta)
export(FilterCMIM)
export(FilterCarScore)
export(FilterCarSurvScore)
89 changes: 89 additions & 0 deletions R/FilterBoruta.R
@@ -0,0 +1,89 @@
#' @title Boruta Filter
#'
#' @name mlr_filters_boruta
#'
#' @description
#' Filter using the Boruta algorithm for feature selection.
#' If `keep = "confirmed"` (the default), only confirmed features are selected.
#' If `keep = "tentative"`, both confirmed and tentative features are selected.
#' Selected features get a score of 1, deselected features get a score of 0.
#' Because all selected features share the same score, their order is random.
#' In combination with \CRANpkg{mlr3pipelines}, only the filter criterion `cutoff` makes sense.
#'
#' @references
#' `r format_bib("kursa_2010")`
#'
#' @family Filter
#' @include Filter.R
#' @template seealso_filter
#' @export
#' @examples
#' \donttest{
#' if (requireNamespace("Boruta")) {
#' task = mlr3::tsk("sonar")
#' filter = flt("boruta")
#' filter$calculate(task)
#' as.data.table(filter)
#' }
#' }

FilterBoruta = R6Class("FilterBoruta",
  inherit = Filter,

  public = list(

    #' @description
    #' Creates a new instance of this [R6][R6::R6Class] class.
    initialize = function() {

      # All parameters except `keep` are passed on to Boruta::Boruta().
      param_set = ps(
        pValue = p_dbl(default = 0.01),
        mcAdj = p_lgl(default = TRUE),
        maxRuns = p_int(lower = 1, default = 100),
        doTrace = p_int(lower = 0, upper = 4, default = 0),
        holdHistory = p_lgl(default = TRUE),
        getImp = p_uty(),
        keep = p_fct(levels = c("confirmed", "tentative"), default = "confirmed")
      )

      param_set$set_values(keep = "confirmed")

      super$initialize(
        id = "boruta",
        task_types = c("regr", "classif"),
        param_set = param_set,
        packages = "Boruta",
        feature_types = c("integer", "numeric"),
        label = "Boruta",
        man = "mlr3filters::mlr_filters_boruta"
      )
    }
  ),

  private = list(
    .calculate = function(task, nfeat) {
      pv = self$param_set$values
      data = task$data()
      target = task$target_names
      features = task$feature_names
      formula = formulate(target, features)

      # `keep` is a filter-level option, so drop it before calling Boruta.
      keep = pv$keep
      pv$keep = NULL

      res = invoke(Boruta::Boruta, formula = formula, data = data, .args = pv)

      selected_features = if (keep == "confirmed") {
        Boruta::getSelectedAttributes(res)
      } else {
        Boruta::getSelectedAttributes(res, withTentative = TRUE)
      }

      # Binary score: 1 for selected features, 0 for all others.
      score = named_vector(features, init = 0)
      replace(score, names(score) %in% selected_features, 1)
    }
  )
)


#' @include mlr_filters.R
mlr_filters$add("boruta", FilterBoruta)
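
The binary scoring at the end of `.calculate` can be tried without Boruta installed. The following is a minimal base-R sketch, not part of the PR: the feature names and the selection result are hypothetical, and `setNames()` stands in for `mlr3misc::named_vector()`.

```r
# Hypothetical feature names and a hypothetical selection result (illustration only).
features = c("V1", "V2", "V3", "V4")
selected_features = c("V2", "V4")

# Base-R stand-in for named_vector(features, init = 0): every feature starts at 0.
score = setNames(rep(0, length(features)), features)

# Selected features are set to 1; replace() returns a modified copy.
score = replace(score, names(score) %in% selected_features, 1)
print(score)

# Any cutoff strictly between 0 and 1 separates selected from deselected features.
kept = names(score)[score >= 0.5]
print(kept)
```

Since the scores take only the values 0 and 1, a threshold such as 0.5 recovers exactly the selected set, which is presumably why the description recommends the `cutoff` criterion over count-based or fraction-based selection.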
11 changes: 10 additions & 1 deletion R/bibentries.R
@@ -34,5 +34,14 @@ bibentries = c(
author = "Andrea Bommert and Thomas Welchowski and Matthias Schmid and J\u00f6rg Rahnenf\u00fchrer",
title = "Benchmark of filter methods for feature selection in high-dimensional gene expression survival data",
journal = "Briefings in Bioinformatics"
)
),

kursa_2010 = bibentry("article",
title = "Feature Selection with the Boruta Package",
volume = "36",
number = "11",
journal = "Journal of Statistical Software",
author = "Miron B. Kursa and Witold R. Rudnicki",
year = "2010",
pages = "1-13")
)
1 change: 1 addition & 0 deletions man/Filter.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/mlr_filters.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_anova.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_auc.Rd
110 changes: 110 additions & 0 deletions man/mlr_filters_boruta.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_carscore.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_carsurvscore.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_cmim.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_correlation.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_disr.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_find_correlation.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_importance.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_information_gain.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_jmi.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_jmim.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_kruskal_test.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_mim.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_mrmr.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_njmim.Rd
1 change: 1 addition & 0 deletions man/mlr_filters_performance.Rd