Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
354 views
in Technique[技术] by (71.8m points)

r - Calculating differences of dates in hours between rows of a dataframe

I have the following dataframe (ts1):

                D1 Diff
1 20/11/2014 16:00 0.00
2 20/11/2014 17:00 0.01
3 20/11/2014 19:00 0.03

I would like to add a new column to ts1 holding the difference, in hours, between the dates (D1) of successive rows.

The new ts1 should be:

                D1 Diff N
1 20/11/2014 16:00 0.00 
2 20/11/2014 17:00 0.01 1
3 20/11/2014 19:00 0.03 2

To calculate the difference in hours for a single pair of dates I use:

library(lubridate)
difftime(dmy_hm("29/12/2014 11:00"), dmy_hm("29/12/2014 9:00"), units="hours") 

I believe that to calculate the difference between successive rows I need to convert ts1 into a matrix.

I use the following command:

> ts1$N<-difftime(dmy_hm(as.matrix(ts1$D1)), units="hours")

And I get:

Error in as.POSIXct(time2) : argument "time2" is missing, with no default


1 Reply

0 votes
by (71.8m points)

Suppose ts1 is as shown in Note 2 at the end. Create a POSIXct variable tt from D1, convert tt to numeric, which gives the number of seconds since the Epoch, divide that by 3600 to get the number of hours since the Epoch, and take successive differences. The first row has no predecessor, so its N is NA. No packages are used.

tt <- as.POSIXct(ts1$D1, format = "%d/%m/%Y %H:%M")
m <- transform(ts1, N = c(NA, diff(as.numeric(tt) / 3600)))

giving:

> m

                D1 Diff  N
1 20/11/2014 16:00 0.00 NA
2 20/11/2014 17:00 0.01  1
3 20/11/2014 19:00 0.03  2
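
The error in the question arises because difftime needs two time arguments (time1 and time2) and only one was supplied. As a minimal sketch of the difftime route, each row can be paired with the row before it. The explicit tz = "UTC" is an assumption added here so that daylight saving changes cannot distort the gaps; replace it with the zone the data was actually recorded in.

```r
# Same input as in Note 2, built directly so the block runs on its own
ts1 <- data.frame(
  D1 = c("20/11/2014 16:00", "20/11/2014 17:00", "20/11/2014 19:00"),
  Diff = c(0.00, 0.01, 0.03),
  stringsAsFactors = FALSE
)

# Parse the strings; tz = "UTC" is an assumption, not part of the original answer
tt <- as.POSIXct(ts1$D1, format = "%d/%m/%Y %H:%M", tz = "UTC")

# difftime(time1, time2): rows 2..n minus rows 1..(n-1)
gaps <- difftime(tt[-1], tt[-length(tt)], units = "hours")

# The first row has no previous row, so pad with NA
ts1$N <- c(NA, as.numeric(gaps))
ts1
```

This should give the same N column as the diff-based version; which one to use is mostly a matter of taste.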

Note 1: I assume you want N so that you can fill in the missing hours. In that case you don't really need N. It is also easier to work with time series in a dedicated time series representation. First we convert ts1 to a zoo object, then we create a zero-width zoo object with the datetimes that we need, and finally we merge them:

library(zoo)
z <- read.zoo(ts1, tz = "", format = "%d/%m/%Y %H:%M")

z0 <- zoo(, seq(start(z), end(z), "hours"))
zz <- merge(z, z0)

giving:

> zz
2014-11-20 16:00:00 2014-11-20 17:00:00 2014-11-20 18:00:00 2014-11-20 19:00:00 
               0.00                0.01                  NA                0.03 

If you really did need a data frame back then:

DF <- fortify.zoo(zz)
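
If zoo is not available, the same gap-filling idea can be sketched in base R by merging against a full hourly grid. This is only an illustration under stated assumptions: the names grid and filled are made up here, and tz = "UTC" is assumed rather than taken from the original answer.

```r
# Same input as in Note 2, built directly so the block runs on its own
ts1 <- data.frame(
  D1 = c("20/11/2014 16:00", "20/11/2014 17:00", "20/11/2014 19:00"),
  Diff = c(0.00, 0.01, 0.03),
  stringsAsFactors = FALSE
)

# Parsed datetimes; tz = "UTC" is an assumption
ts1$tt <- as.POSIXct(ts1$D1, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Every hour from the first to the last observation (hypothetical helper frame)
grid <- data.frame(tt = seq(min(ts1$tt), max(ts1$tt), by = "hour"))

# Left join: hours with no observation get Diff = NA
filled <- merge(grid, ts1[c("tt", "Diff")], by = "tt", all.x = TRUE)

# Order by time explicitly so the row order does not rely on merge() defaults
filled <- filled[order(filled$tt), ]
filled
```

The 18:00 row appears with Diff set to NA, mirroring the NA in zz above.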

Note 2: Input used in reproducible form is:

Lines <- "D1,Diff
1,20/11/2014 16:00,0.00
2,20/11/2014 17:00,0.01
3,20/11/2014 19:00,0.03"

ts1 <- read.csv(text = Lines, as.is = TRUE)
