machine learning - R: "argument is of length 0" (empty plot)

posted Feb 19, 2021 in Technique by 深蓝 (71.8m points)

I am using R and trying to follow this tutorial: https://cran.r-project.org/web/packages/lime/vignettes/Understanding_lime.html

I created my own data to replicate the tutorial:

#load libraries
library(MASS)
library(lime)
library(randomForest)

#create data
var_1<- rnorm(100,1,4)
var_2 <-rnorm(10,10,5)
var_3<- c("0","2", "4")
var_3 <- sample(var_3, 100, replace=TRUE, prob=c(0.3, 0.6, 0.1))

response<- c("1","0")
response <- sample(response, 100, replace=TRUE, prob=c(0.3, 0.7))

#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response)

#declare var_3 and response_variable as factors
f$var_3 = as.factor(f$var_3)
f$response = as.factor(f$response)

# run random forest on all the data except the first observation
model<-randomForest(response ~., data = f[-1,] , mtry=2, ntree=100)
model<-as_classifier(model, labels = NULL)

#run the "lime" procedure on the first observation
explainer <- lime(f[-1,], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)
    
#visualize the results - here is the error:
plot_features(explanation, ncol = 1)

Error in if (nrow(explanation) == 0) stop("No explanations to plot", call. = FALSE) : 
  argument is of length zero

Can someone please show me what I am doing wrong? Is it because this procedure is not meant to be run on a single observation?

Thanks

UPDATE: If I change this line of code:

model<-randomForest(response ~., data = f[-1,] , mtry=2, ntree=100)

to

model<-randomForest(response ~., data = f , mtry=2, ntree=100)

the code now seems to run. (This is not a big problem: I can write f_new = f[1,] and then f = f[-1,] before this step, in that order so that f_new keeps the original first row.) However, the plot does not fully show up. Is this a problem with my graphics console? (Note: the tutorial from the website runs perfectly.)

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252    LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C                    LC_TIME=English_Canada.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] randomForest_4.6-14 lime_0.5.1          MASS_7.3-53        

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5           lubridate_1.7.9      lattice_0.20-41      class_7.3-17         assertthat_0.2.1    
 [6] glmnet_4.0-2         digest_0.6.25        ipred_0.9-9          foreach_1.5.1        mime_0.9            
[11] R6_2.4.1             plyr_1.8.6           stats4_4.0.2         ggplot2_3.3.2        pillar_1.4.6        
[16] rlang_0.4.7          caret_6.0-86         rstudioapi_0.11      data.table_1.12.8    rpart_4.1-15        
[21] Matrix_1.2-18        shinythemes_1.1.2    labeling_0.3         splines_4.0.2        gower_0.2.2         
[26] stringr_1.4.0        htmlwidgets_1.5.2    munsell_0.5.0        tinytex_0.26         shiny_1.5.0         
[31] compiler_4.0.2       httpuv_1.5.4         xfun_0.15            pkgconfig_2.0.3      shape_1.4.5         
[36] htmltools_0.5.0      nnet_7.3-14          tidyselect_1.1.0     tibble_3.0.3         prodlim_2019.11.13  
[41] codetools_0.2-16     crayon_1.3.4         dplyr_1.0.2          withr_2.3.0          later_1.1.0.1       
[46] recipes_0.1.13       ModelMetrics_1.2.2.2 grid_4.0.2           nlme_3.1-149         xtable_1.8-4        
[51] gtable_0.3.0         lifecycle_0.2.0      magrittr_1.5         pROC_1.16.2          scales_1.1.1        
[56] stringi_1.4.6        farver_2.0.3         reshape2_1.4.4       promises_1.1.1       timeDate_3043.102   
[61] ellipsis_0.3.1       generics_0.0.2       vctrs_0.3.2          xgboost_1.1.1.1      lava_1.6.8          
[66] iterators_1.0.13     tools_4.0.2          glue_1.4.1           purrr_0.3.4          fastmap_1.0.1

1 Reply

深蓝 · Answer 1 · 2021-02-19T04:05:09+0000

I might have got it to work. As per the original code I was using, here is the plot:

#load libraries
library(MASS)
library(lime)
library(randomForest)

#create data
var_1<- rnorm(100,1,4)
var_2 <-rnorm(10,10,5)
var_3<- c("0","2", "4")
var_3 <- sample(var_3, 100, replace=TRUE, prob=c(0.3, 0.6, 0.1))

response<- c("1","0")
response <- sample(response, 100, replace=TRUE, prob=c(0.3, 0.7))

#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response)

#declare var_3 and response_variable as factors
f$var_3 = as.factor(f$var_3)
f$response = as.factor(f$response)

# run random forest on all the data (the first observation is not held out here)
model<-randomForest(response ~., data = f , mtry=2, ntree=100)
model<-as_classifier(model, labels = NULL)

#run the "lime" procedure on every observation except the first
explainer <- lime(f[-1,], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)

#visualize the results
plot_features(explanation, ncol = 1)

I then changed only the last line, restricting the plot to cases 1 through 4 (the rest of the script is unchanged from the listing above):

#visualize the results for a few cases only
plot_features(explanation, cases = 1:4, ncol = 1)

I don't understand what changed - but at least the graphics now show up. Suppose I am interested in only the first observation. I am still confused whether these lines should be:

explainer <- lime(f[-1,], model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f[-1, ], explainer, n_labels = 1, n_features = 4)

or

explainer <- lime(f, model, bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(f, explainer, n_labels = 1, n_features = 4)
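
A possible way to handle the single-observation case, offered as a sketch rather than a confirmed fix: following the tutorial's pattern, the explainer is built from the training rows and explain() is called on the held-out row, so f[1, ] rather than f[-1, ]. Calling explain() on f[-1, ] produces 99 cases, and stacking 99 panels in one column is a likely reason the plot looked empty until it was restricted to a few cases. The sketch also drops response from the data passed to lime() and explain(), since the tutorial removes the class column before building the explainer; otherwise lime treats it as a fourth feature. Assumptions not in the original: the seed, the names feature_cols, train and new_obs, n_features = 3, and var_2 drawn with 100 values (the original draws 10, which data.frame() recycles).

```r
library(lime)
library(randomForest)

set.seed(123)  # assumption: seed added only to make the sketch reproducible

# Simulated data as in the question, with var_2 drawn at full length
f <- data.frame(
  var_1 = rnorm(100, 1, 4),
  var_2 = rnorm(100, 10, 5),
  var_3 = factor(sample(c("0", "2", "4"), 100, replace = TRUE,
                        prob = c(0.3, 0.6, 0.1))),
  response = factor(sample(c("1", "0"), 100, replace = TRUE,
                           prob = c(0.3, 0.7)))
)

feature_cols <- c("var_1", "var_2", "var_3")  # hypothetical helper name
train <- f[-1, ]               # rows 2..100, used for fitting
new_obs <- f[1, feature_cols]  # the single row to explain, features only

# Fit on the training rows only, then wrap so lime treats it as a classifier
model <- randomForest(response ~ ., data = train, mtry = 2, ntree = 100)
model <- as_classifier(model, labels = NULL)

# Build the explainer from the training features; explain the held-out row
explainer <- lime(train[, feature_cols], model,
                  bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(new_obs, explainer, n_labels = 1, n_features = 3)

# Number of distinct cases in the explanation (one panel per case in the plot)
length(unique(explanation$case))
plot_features(explanation, ncol = 1)
```

With a single explained case, plot_features() draws a single panel, so no case filter should be needed.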

I am also not sure what the difference is between "probability" and "explanation fit" in the plot. I assume "probability" is the probability predicted by the random forest model, and "explanation fit" measures the explanatory power of the LIME model.

(If someone knows about this, could they please comment below? thanks)
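
On the "probability" versus "explanation fit" question, a hedged reading based on the columns that explain() returns: the plot header's Probability appears to correspond to the label_prob column (the wrapped model's predicted probability for the explained label), and Explanation Fit to the model_r2 column (the R-squared of the local surrogate model that lime fits around the case). The sketch below prints those two columns next to the random forest's own class probabilities for the same row so they can be compared by eye. The seed and the names rf, feature_cols, train and new_obs are assumptions, not part of the original code.

```r
library(lime)
library(randomForest)

set.seed(123)  # assumption: seed added only to make the sketch reproducible

# Same simulated data layout as in the question
f <- data.frame(
  var_1 = rnorm(100, 1, 4),
  var_2 = rnorm(100, 10, 5),
  var_3 = factor(sample(c("0", "2", "4"), 100, replace = TRUE,
                        prob = c(0.3, 0.6, 0.1))),
  response = factor(sample(c("1", "0"), 100, replace = TRUE,
                           prob = c(0.3, 0.7)))
)

feature_cols <- c("var_1", "var_2", "var_3")  # hypothetical helper name
train <- f[-1, ]
new_obs <- f[1, feature_cols]

# Keep the unwrapped forest in rf so predict() can be called on it directly
rf <- randomForest(response ~ ., data = train, mtry = 2, ntree = 100)
explainer <- lime(train[, feature_cols], as_classifier(rf),
                  bin_continuous = TRUE, quantile_bins = FALSE)
explanation <- explain(new_obs, explainer, n_labels = 1, n_features = 3)

# Columns that the plot header is built from: one row per case and label
unique(as.data.frame(explanation)[, c("case", "label", "label_prob", "model_r2")])

# The random forest's own class probabilities for the same row
predict(rf, newdata = new_obs, type = "prob")
```

A low model_r2 would suggest that the local linear approximation describes the forest poorly around that observation, so the feature weights for that case should be read with caution.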
