Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


R - data frame - convert to sparse matrix

I have a data frame that is mostly zeros (a sparse data frame?), similar to this:

name,factor_1,factor_2,factor_3
ABC,1,0,0
DEF,0,1,0
GHI,0,0,1

The actual data is about 90,000 rows with 10,000 features. Can I convert this to a sparse matrix? I am expecting to gain time and space efficiencies by utilizing a sparse matrix instead of a data frame.

Any help would be appreciated.

Update #1: Here is some code to generate the data frame (thanks to Richard for providing it):

x <- structure(list(name = structure(1:3, .Label = c("ABC", "DEF", "GHI"),
                    class = "factor"), 
               factor_1 = c(1L, 0L, 0L), 
               factor_2 = c(0L,1L, 0L), 
               factor_3 = c(0L, 0L, 1L)), 
               .Names = c("name", "factor_1","factor_2", "factor_3"), 
               class = "data.frame",
               row.names = c(NA,-3L))


1 Reply


It might be a bit more memory efficient (but slower) to avoid copying all the data into a dense matrix:

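library(Matrix)

# Convert each column to a one-column sparse matrix, then bind the columns together
y <- Reduce(cbind2, lapply(x[,-1], Matrix, sparse = TRUE))
# The name column is a factor, so convert it to character for the row names
rownames(y) <- as.character(x[,1])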

#3 x 3 sparse Matrix of class "dgCMatrix"
#         
#ABC 1 . .
#DEF . 1 .
#GHI . . 1
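
Another way to stay clear of a dense intermediate, offered here only as a sketch and not part of the original answer, is to collect the positions of the non-zero entries column by column and pass them to sparseMatrix() from the Matrix package. The names cols, nz, vals and y_triplet are made up for illustration, and the code assumes the feature columns are numeric and contain no NA values.

```r
library(Matrix)

# Toy data frame matching the question's example
x <- data.frame(
  name = c("ABC", "DEF", "GHI"),
  factor_1 = c(1L, 0L, 0L),
  factor_2 = c(0L, 1L, 0L),
  factor_3 = c(0L, 0L, 1L)
)

cols <- x[, -1]  # feature columns only (hypothetical helper name)

# Row indices of the non-zero entries, one vector per column
nz <- lapply(cols, function(v) which(v != 0))

i <- unlist(nz, use.names = FALSE)    # row index of each non-zero entry
j <- rep(seq_along(nz), lengths(nz))  # matching column index
vals <- unlist(Map(function(v, idx) v[idx], cols, nz), use.names = FALSE)

# Build the sparse matrix directly from (row, column, value) triplets
y_triplet <- sparseMatrix(
  i = i, j = j, x = as.numeric(vals),
  dims = c(nrow(x), length(cols)),
  dimnames = list(as.character(x$name), names(cols))
)
```

Unlike the cbind2 version, this keeps the column names, and only the non-zero entries are ever copied.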

If you have sufficient memory, you should use Richard's answer, i.e., turn your data.frame into a dense matrix and then use Matrix.
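
Richard's answer is not reproduced on this page, so the following is only a sketch of what the dense-then-sparse route could look like, assuming it amounts to as.matrix() followed by Matrix(). The names dense and y_dense_route are hypothetical.

```r
library(Matrix)

# Toy data frame matching the question's example
x <- data.frame(
  name = c("ABC", "DEF", "GHI"),
  factor_1 = c(1L, 0L, 0L),
  factor_2 = c(0L, 1L, 0L),
  factor_3 = c(0L, 0L, 1L)
)

# Dense intermediate: every feature column is copied into one matrix
dense <- as.matrix(x[, -1])
rownames(dense) <- as.character(x$name)

# Convert the dense matrix to sparse storage
y_dense_route <- Matrix(dense, sparse = TRUE)

# Compare the memory footprint of the two representations
print(object.size(dense))
print(object.size(y_dense_route))
```

On a 3 x 3 toy example the sizes say little, but at the scale in the question the dense intermediate would hold 90,000 x 10,000 = 9e8 cells, which is roughly 3.6 GB stored as integers or 7.2 GB as doubles. That is the memory this route needs to have available.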

