I might have got it to work. As per the original code I was using, here is the plot:
#load libraries
library(MASS)
library(lime)
library(randomForest)
#create data
var_1<- rnorm(100,1,4)
var_2 <- rnorm(100,10,5)
var_3<- c("0","2", "4")
var_3 <- sample(var_3, 100, replace=TRUE, prob=c(0.3, 0.6, 0.1))
response<- c("1","0")
response <- sample(response, 100, replace=TRUE, prob=c(0.3, 0.7))
#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response)
#declare var_3 and response_variable as factors
f$var_3 = as.factor(f$var_3)
f$response = as.factor(f$response)
# run random forest (note: data = f uses all rows, including the first observation)
model<-randomForest(response ~., data = f , mtry=2, ntree=100)
model<-as_classifier(model, labels = NULL)
#run the "lime" procedure (note: f[-1, ] is every row except the first, not the first observation)
explainer <- lime(f[-1,], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)
#visualize the results - here is the error:
plot_features(explanation, ncol = 1)
I changed the code (see below):
#load libraries
library(MASS)
library(lime)
library(randomForest)
#create data
var_1<- rnorm(100,1,4)
var_2 <- rnorm(100,10,5)
var_3<- c("0","2", "4")
var_3 <- sample(var_3, 100, replace=TRUE, prob=c(0.3, 0.6, 0.1))
response<- c("1","0")
response <- sample(response, 100, replace=TRUE, prob=c(0.3, 0.7))
#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response)
#declare var_3 and response_variable as factors
f$var_3 = as.factor(f$var_3)
f$response = as.factor(f$response)
# run random forest (note: data = f uses all rows, including the first observation)
model<-randomForest(response ~., data = f , mtry=2, ntree=100)
model<-as_classifier(model, labels = NULL)
#run the "lime" procedure (note: f[-1, ] is every row except the first, not the first observation)
explainer <- lime(f[-1,], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)
#visualize the results - restricting the plot to cases 1:4 (this version shows the graphics)
plot_features(explanation, cases = 1:4, ncol = 1)
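The only change is the extra argument in the plot_features() call that restricts the plot to cases 1:4. I don't understand why that makes a difference, but at least the graphics now show up. Suppose I am interested in only the first observation. I am still confused whether these lines should be: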
explainer <- lime(f[-1,], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)
or
explainer <- lime(f, model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f, explainer, n_labels = 1, n_features = 4)
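For what it is worth, neither variant selects only the first observation: in R, f[-1, ] drops row 1 and keeps rows 2 to 100, while f[1, ] is the first row itself. Below is a minimal sketch of one possible setup, assuming the goal is to fit the forest and the explainer on every row except the first and then explain that held-out row. The seed, the names predictors, train and new_obs, dropping the response column before calling lime(), and n_features = 3 are my assumptions, not part of the original code.

```r
library(lime)
library(randomForest)

set.seed(42)  # assumption: seed added only so the sketch is reproducible

n <- 100
f <- data.frame(
  var_1 = rnorm(n, 1, 4),
  var_2 = rnorm(n, 10, 5),
  var_3 = factor(sample(c("0", "2", "4"), n, replace = TRUE, prob = c(0.3, 0.6, 0.1))),
  response = factor(sample(c("1", "0"), n, replace = TRUE, prob = c(0.3, 0.7)))
)

# hypothetical helper names: predictor columns, training rows, row to explain
predictors <- setdiff(names(f), "response")
train <- f[-1, ]              # rows 2..100 (everything except the first)
new_obs <- f[1, predictors]   # the first observation, predictors only

# fit the forest without the first observation
model <- randomForest(response ~ ., data = train, mtry = 2, ntree = 100)
model <- as_classifier(model, labels = NULL)

# build the explainer from the training predictors (assumption: the response
# is left out so that it is not treated as a feature)
explainer <- lime(train[, predictors], model,
                  bin_continuous = TRUE, quantile_bins = FALSE)

# explain only the held-out first observation
explanation <- explain(new_obs, explainer, n_labels = 1, n_features = 3)

# plot_features(explanation, ncol = 1)  # a single case to draw
```

With this layout the explanation should contain rows for one case only, so the plot has a single panel instead of one per row of f[-1, ].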
I am also not sure what the difference is between "probability" and "explanation fit" in the plot. I assume "probability" is the probability generated by the random forest model, and "explanation fit" measures the "explanatory power" of the LIME model.
(If someone knows about this, could they please comment below? thanks)
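On the "probability" versus "explanation fit" question, one way to check the assumption above is to look at the data frame returned by explain(). As far as I can tell from the lime documentation, it has a label_prob column (the model's predicted probability for the explained label) and a model_r2 column (the R-squared of the local surrogate model), and these appear to be what the plot headers report. The sketch below rebuilds the same kind of toy data; the seed and the names predictors, rf and rf_prob are my assumptions.

```r
library(lime)
library(randomForest)

set.seed(1)  # assumption: seed added only so the sketch is reproducible

n <- 100
f <- data.frame(
  var_1 = rnorm(n, 1, 4),
  var_2 = rnorm(n, 10, 5),
  var_3 = factor(sample(c("0", "2", "4"), n, replace = TRUE, prob = c(0.3, 0.6, 0.1))),
  response = factor(sample(c("1", "0"), n, replace = TRUE, prob = c(0.3, 0.7)))
)
predictors <- setdiff(names(f), "response")

# keep the plain forest (rf) so predict() can be called on it directly
rf <- randomForest(response ~ ., data = f[-1, ], mtry = 2, ntree = 100)
model <- as_classifier(rf, labels = NULL)

explainer <- lime(f[-1, predictors], model,
                  bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[1, predictors], explainer, n_labels = 1, n_features = 3)

# one row per case/label: the values behind the two plot headers
print(unique(explanation[, c("case", "label", "label_prob", "model_r2")]))

# the forest's own class probabilities for the same observation, for comparison
rf_prob <- predict(rf, f[1, predictors], type = "prob")
print(rf_prob)
```

If label_prob lines up with the forest's probability for the explained label, that supports reading "probability" as the random forest output and "explanation fit" as how well the local linear model reproduces the forest around that observation.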