Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


barplot failure in R 3.1.0: read.csv converts what should be numerics to factors

I have a small problem with the barplot() function in R 3.1.0 (it works fine in older versions).

nd_p_a<- read.csv("nd_p_a.csv")
barplot(nd_p_a$y, col="blue", names.arg=nd_p_a$x, xlab="k", ylab="P(k)")

This used to work without any warnings or errors, but in version 3.1.0 I get an error:

Error in barplot.default(nd_p_a$y, col = "blue", names.arg = nd_p_a2$x,  : 
  'height' must be a vector or a matrix

So why does this not work in this version? And how can I convert a factor to a numeric vector? I tried as.numeric() and so on, but with no proper result.

The CSV File contains data like this:

"x","y"
1.0,48.947791826110596
2.0,6.317211620667564
3.0,14.982593438237588
4.0,3.4443873302013475
5.0,9.760934831763135
6.0,1.7191829918211519
7.0,3.9200958456693455
8.0,1.0765813450714172
9.0,2.290369697396343
10.0,0.6342337460169456
11.0,1.1210994624619959
12.0,0.5291701034830391

As requested, more information:

sessionInfo()

3.0.3

R version 3.0.3 (2014-03-06)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base  

3.1.0

R version 3.1.0 beta (2014-03-28 r65330)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.1.0

str(nd_p_a)

3.0.3

'data.frame':   1449 obs. of  2 variables:
 $ x: num  1 2 3 4 5 6 7 8 9 10 ...
 $ y: num  48.95 6.32 14.98 3.44 9.76 ...

3.1.0

'data.frame':   1449 obs. of  2 variables:
 $ x: num  1 2 3 4 5 6 7 8 9 10 ...
 $ y: Factor w/ 221 levels "0.0010183159621912567",..: 194 201 171 184 220 173 187 167 178 166 ...

1 Reply


There seems to be an issue with the new version (3.1.0) of type.convert(), which is called by read.table(), which in turn is called by read.csv(). The new type.convert() notices that the decimal representation in your file carries more digits than R's internal numeric storage format (double-precision floating point) can hold exactly, and so it converts the column to a factor rather than lose accuracy. This behavior surprises a lot of people, so I would bet it will either go away or become controllable through a parameter passed down the chain to type.convert(). It is painful enough for people (including myself) who rely on the long-standing behavior of automatic field type detection.
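One way to sidestep the automatic type detection is to tell read.csv() the column types up front with colClasses. The following is a minimal sketch, not a definitive fix: it writes a three-row sample in the same shape as the question's file to a temporary file (the temporary path and the shortened data are assumptions for illustration) and then reads both columns as numeric.

```r
# Write a small sample shaped like the question's CSV to a temporary file.
# (The temp file and the three-row subset are illustrative assumptions.)
csv_path <- tempfile(fileext = ".csv")
writeLines(c('"x","y"',
             "1.0,48.947791826110596",
             "2.0,6.317211620667564",
             "3.0,14.982593438237588"),
           csv_path)

# Declare both columns as numeric so type.convert() is not asked to guess.
nd_p_a <- read.csv(csv_path, colClasses = c("numeric", "numeric"))
str(nd_p_a)

# With y numeric again, the original barplot() call is valid.
barplot(nd_p_a$y, col = "blue", names.arg = nd_p_a$x, xlab = "k", ylab = "P(k)")
```

As far as I know, R 3.1.1 and later also give read.table() and read.csv() a numerals argument that controls this conversion, so upgrading may make the workaround unnecessary.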

This question should be cross-linked somewhere upstream to something like "Why doesn't read.csv() work reliably with floating point values anymore?"

http://r.789695.n4.nabble.com/type-convert-and-doubles-td4688616.html
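As for converting a column that has already been read as a factor: calling as.numeric() directly on a factor returns the internal level codes, which is probably why it gave "no proper result". A sketch of the usual idiom, going through the level labels instead, is below; the variable names and the three sample values are made up for illustration.

```r
# A factor whose levels are numeric strings, like column y under R 3.1.0.
# (Names and values here are illustrative, taken from the question's sample.)
y_factor <- factor(c("48.947791826110596",
                     "6.317211620667564",
                     "14.982593438237588"))

# as.numeric() on a factor yields the integer level codes, not the values.
codes <- as.numeric(y_factor)

# Converting to character first recovers the numbers the labels spell out.
y_numeric <- as.numeric(as.character(y_factor))

print(codes)
print(y_numeric)
```

The result is an ordinary double vector, so digits beyond double precision are rounded, which matches what older R versions did when reading the file.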

