Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

How to read a CSV file in R where some values contain the percent symbol (%)

Is there a clean/automatic way in R to convert CSV values formatted as percentages (with a trailing % symbol) to numbers?

Here is some example data:

actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%

Which can be read using:

junk = read.csv("Example.csv")

But all of the % columns are read as strings and converted to factors:

> str(junk)
 'data.frame':  4 obs. of  3 variables:
 $ actual       : num  2.15 0.917 7.941 4.964
 $ simulated    : num  8.607 8.027 0.215 3.524
 $ percent.error: Factor w/ 4 levels "-300%","-775%",..: 1 2 4 3

but I would like them to be numeric values.

Is there an additional parameter for read.csv? Is there an easy way to post-process the affected columns to convert them to numeric values? Are there other solutions?

Note: of course in this example I could simply recompute the values, but in my real application with a larger data file this is not practical.

1 Reply

There is no "percentage" type in R. So you need to do some post-processing:

DF <- read.table(text="actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%", sep=",", header=TRUE)

# Strip the literal "%" and divide by 100 to get proportions
DF[,3] <- as.numeric(gsub("%", "", DF[,3], fixed=TRUE))/100
DF

#  actual simulated percent.error
#1 2.1496    8.6066         -3.00
#2 0.9170    8.0266         -7.75
#3 7.9406    0.2152          0.97
#4 4.9637    3.5237          0.29
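
For a larger file where the percent columns are not known in advance, the same post-processing could be applied to every column automatically. The following is only a sketch: percent_to_numeric is a hypothetical helper name (not part of base R), and it assumes a column counts as a percent column when all of its non-empty values look like a number followed by %.

```r
# Hypothetical helper (not part of base R): convert every column whose
# non-missing, non-empty values all look like "<number>%" into proportions.
percent_to_numeric <- function(df) {
  is_pct <- function(x) {
    x <- trimws(as.character(x))
    x <- x[!is.na(x) & nzchar(x)]
    length(x) > 0 && all(grepl("^[-+]?[0-9]*\\.?[0-9]+%$", x))
  }
  for (col in names(df)) {
    if (is_pct(df[[col]])) {
      values <- trimws(as.character(df[[col]]))
      # Remove the literal "%" and rescale to a proportion
      df[[col]] <- as.numeric(sub("%", "", values, fixed = TRUE)) / 100
    }
  }
  df
}

DF <- read.csv(text = "actual,simulated,percent error
2.1496,8.6066,-300%
0.9170,8.0266,-775%
7.9406,0.2152,97%
4.9637,3.5237,29%", stringsAsFactors = FALSE)

DF <- percent_to_numeric(DF)
str(DF)
```

Columns that are already numeric, or that mix percent strings with other text, are left untouched, so the helper should be safe to run over a whole data frame.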
