vimp: nonparametric variable importance assessment

Author: Brian Williamson

Introduction

In predictive modeling applications, it is often of interest to determine the relative contribution of subsets of features in explaining an outcome; this is commonly called variable importance. It is useful to consider variable importance as a function of the unknown, underlying data-generating mechanism rather than of the specific predictive algorithm used to fit the data. This package provides functions that, given fitted values from predictive algorithms, compute nonparametric estimates of variance-based variable importance, along with asymptotically valid confidence intervals for the true importance.
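To make "importance as a function of the data-generating mechanism" concrete, here is one way a variance-based importance parameter can be written down. This is a sketch in our own notation (the symbols $\mu_0$, $\mu_{0,s}$, and $\psi_{0,s}$ are assumptions introduced for illustration); the tech report remains the authoritative definition. Write $X_{-s}$ for the features with the subset $s$ removed and $E_0$ for expectation under the true distribution:

```latex
\begin{align*}
\mu_0(x) &= E_0(Y \mid X = x), \qquad
\mu_{0,s}(x) = E_0(Y \mid X_{-s} = x_{-s}), \\
\psi_{0,s} &= \frac{E_0\left[\{\mu_0(X) - \mu_{0,s}(X)\}^2\right]}{\mathrm{var}_0(Y)}
 = \frac{E_0\left[\{Y - \mu_{0,s}(X)\}^2\right] - E_0\left[\{Y - \mu_0(X)\}^2\right]}{\mathrm{var}_0(Y)}.
\end{align*}
```

The second equality holds because $Y - \mu_0(X)$ is uncorrelated with any function of $X$, so the cross term vanishes. Read this way, the parameter is the drop in population $R^2$ when the features in $s$ are removed, and it depends only on the true conditional means, not on which algorithm is used to estimate them.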

More detail may be found in our tech report.

This method works on low-dimensional and high-dimensional data.

Issues

If you encounter any bugs or have any specific feature requests, please file an issue.

R installation

You may install a stable release of vimp from CRAN via install.packages("vimp"). You may also install a stable release of vimp from GitHub via devtools by running the following code (you may replace v1.1.3 with the tag for the specific release you wish to install):

## install.packages("devtools") # only run this line if necessary
devtools::install_github(repo = "bdwilliamson/vimp@v1.1.3")

You may install a development release of vimp from GitHub via devtools by running the following code:

## install.packages("devtools") # only run this line if necessary
devtools::install_github(repo = "bdwilliamson/vimp")
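After installing by either route, it can help to confirm which version R actually picked up. The following is a small sketch using only base R utilities; the variable name `installed` is ours, and the snippet runs whether or not vimp is present.

```r
## check whether vimp is available without attaching it
installed <- requireNamespace("vimp", quietly = TRUE)

if (installed) {
  ## report the version that was installed (CRAN, tagged, or development)
  print(utils::packageVersion("vimp"))
} else {
  message("vimp is not installed; see the installation commands above")
}
```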

Example

This example shows how to use vimp in a simple setting with simulated data, using SuperLearner to estimate the conditional mean functions. For more examples and detailed explanation, please see the vignette.

## load required functions and libraries
library("SuperLearner")
library("vimp")
library("xgboost")
library("glmnet")

## -------------------------------------------------------------
## problem setup
## -------------------------------------------------------------
## set up the data
n <- 100
p <- 2
s <- 1 # desire importance for X_1
x <- as.data.frame(replicate(p, runif(n, -1, 1)))
y <- (x[,1])^2*(x[,1]+7/5) + (25/9)*(x[,2])^2 + rnorm(n, 0, 1) 

## -------------------------------------------------------------
## preliminary step: estimate the conditional means
## -------------------------------------------------------------
## set up the learner library, consisting of the mean, boosted trees,
## elastic net, and random forest
learner.lib <- c("SL.mean", "SL.xgboost", "SL.glmnet", "SL.randomForest")

## the full conditional mean
full_regression <- SuperLearner::SuperLearner(Y = y, X = x, family = gaussian(), SL.library = learner.lib)
full_fit <- full_regression$SL.predict

## the reduced conditional mean
reduced_regression <- SuperLearner::SuperLearner(Y = full_fit, X = x[, -s, drop = FALSE], family = gaussian(), SL.library = learner.lib)
reduced_fit <- reduced_regression$SL.predict

## -------------------------------------------------------------
## get variable importance!
## -------------------------------------------------------------
## get the variable importance estimate, SE, and CI
vimp <- vimp_regression(Y = y, f1 = full_fit, f2 = reduced_fit, indx = 1, run_regression = FALSE)
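To show what the two fitted-value vectors are being used for, here is a minimal base-R sketch of a naive plug-in version of the importance of X_1. It is an illustration under stated assumptions, not the package's estimator: cubic polynomial least squares stands in for SuperLearner, the names `full_mod`, `red_mod`, and `naive_est` are ours, and the sketch makes no attempt at the standard error or confidence interval that vimp_regression reports.

```r
## simulate data as in the example above (seed added for reproducibility)
set.seed(4747)
n <- 100
x <- data.frame(V1 = runif(n, -1, 1), V2 = runif(n, -1, 1))
y <- (x$V1)^2 * (x$V1 + 7/5) + (25/9) * (x$V2)^2 + rnorm(n, 0, 1)

## full conditional mean: cubic polynomial in both features
## (an assumption; any flexible regression could be substituted)
full_mod <- lm(y ~ poly(V1, 3) + poly(V2, 3), data = x)
full_fit <- fitted(full_mod)

## reduced conditional mean: regress the full fitted values on X_2 only,
## mirroring the two-step approach in the example
red_mod <- lm(full_fit ~ poly(V2, 3), data = x)
reduced_fit <- fitted(red_mod)

## naive plug-in estimate: average squared difference between the two
## fits, scaled by the empirical variance of the outcome
naive_est <- mean((full_fit - reduced_fit)^2) / mean((y - mean(y))^2)
print(naive_est)
```

Because both regressions include an intercept, this ratio always falls between 0 and 1. With flexible learners a raw plug-in like this is generally not a good basis for inference, which is why the example calls vimp_regression rather than computing the ratio by hand.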
