Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


database - How can I calculate the distance between latitude and longitude along rows of columns in R?

My df looks like this:

    bid        ts    latitude  longitude
1  827566 1999-10-07 42.40944 -88.17822
2  827566 2013-04-11 41.84740 -87.63126
3 1902966 2012-05-02 45.52607 -94.20649
4 1902966 2013-03-25 41.94083 -87.65852
5 3211972 2012-08-14 43.04786 -87.96618
6 3211972 2013-08-02 41.88258 -87.63760

I want to create a new data frame that holds the time difference and the distance between each pair of successive points, calculated down the rows within each group of rows that share the same bid. I used the following for loop to accomplish this:

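library(geosphere)
   lengthdata <- nrow(twopoint)
   twopointdata <- data.frame(matrix(ncol = 4, nrow = lengthdata))
   colnames(twopointdata) <- c("bid", "time", "d", "dsq")
   n <- 1

   # Stop at the second-to-last row: row i is compared with row i + 1,
   # and looping to the last row would index past the end of the data.
   for (i in seq_len(lengthdata - 1))
   {
     if (twopoint[i+1,1] == twopoint[i,1])
     {
       twopointdata[n,1] <- twopoint[i+1,1]
       twopointdata[n,2] <- as.numeric(twopoint[i+1,5] - twopoint[i,5])
       # distHaversine expects points as c(longitude, latitude)
       twopointdata[n,3] <- distm(c(twopoint[i+1,10], twopoint[i+1,9]),
                                  c(twopoint[i,10], twopoint[i,9]),
                                  fun = distHaversine)
       # NOTE: this squares a column of the input (twopoint), not the distance
       # just computed; twopointdata[n,3]^2 is probably what was intended.
       twopointdata[n,4] <- twopoint[n,3]^2
       n <- n + 1
     }
   }
   head(twopointdata)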

(some of the column numbers are off because I took out some rows to display more clearly)

My result looks like this:

      bid time    d          dsq
1  827566 4935  77159.8 5.677201e+11
2 1902966  327 660457.0 6.436004e+16
3 3211972  353 132494.8 3.540118e+12
4 3692174 4722 727359.6 6.394166e+16
5 4404655 4833 201644.7 1.092944e+13
6 6644203 4518 210485.9 6.721980e+16

It has the id for each data point, the time difference between successive points, the distance calculated from longitude and latitude, and the squared distance. PROBLEM: it's very slow, and eventually I'll be doing this on a very large data set.

I was able to do this without a for loop successfully with the time difference using dplyr like this:

 library(dplyr)
 library(geosphere)
 latlongdata2 <- latlongdata 
 latlongdata2 %>%
  group_by(bid)%>%
  transmute(
    bid = bid,
    t = c(NA,diff(ts)))

I can't figure out how to do this with the latitude and longitude because, unlike the ts values, they are in two different columns. Does anyone have any suggestions?

P.S. the overall aim of the project is to do a mean squared displacement analysis on the data.


1 Reply


I think you're overcomplicating it a little. I wish geosphere::distHaversine had a slightly more intuitive calling method (similar to, say, diff), but it's not hard to work around it:

dat <- read.table(text = "  bid        ts    latitude  longitude
 827566 1999-10-07 42.40944 -88.17822
 827566 2013-04-11 41.84740 -87.63126
1902966 2012-05-02 45.52607 -94.20649
1902966 2013-03-25 41.94083 -87.65852
3211972 2012-08-14 43.04786 -87.96618
3211972 2013-08-02 41.88258 -87.63760", header = TRUE, stringsAsFactors = FALSE)
dat$ts <- as.Date(dat$ts)

library(dplyr)
library(geosphere)
group_by(dat, bid) %>%
  mutate(
    d = c(NA,
          distHaversine(cbind(longitude[-n()], latitude[-n()]),
                        cbind(longitude[  -1], latitude[  -1]))),
    dts = c(NA, diff(ts))
  ) %>%
  ungroup() %>%
  filter( ! is.na(d) )
# # A tibble: 3 × 6
#       bid         ts latitude longitude         d   dts
#     <int>     <date>    <dbl>     <dbl>     <dbl> <dbl>
# 1  827566 2013-04-11 41.84740 -87.63126  77159.35  4935
# 2 1902966 2013-03-25 41.94083 -87.65852 660457.41   327
# 3 3211972 2013-08-02 41.88258 -87.63760 132494.65   353
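
If you would rather not depend on geosphere, or you want to see what the lagged comparison is doing, here is a rough base R sketch of the same idea. The helpers haversine_m and lag_by are names I made up for illustration; they are not part of geosphere or dplyr. The radius 6378137 is the default that distHaversine uses, so the distances should agree closely with the output above. The last line assumes that a plain mean of the squared step distances is what you want for the mean squared displacement, which may not match the definition your analysis needs.

```r
# Vectorised haversine distance in metres between paired points.
# Hypothetical helper, not a geosphere function.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Previous value within each group; NA for the first row of a group.
# Hypothetical helper built on base R's ave().
lag_by <- function(x, g) {
  ave(x, g, FUN = function(v) c(NA, v[-length(v)]))
}

dat <- data.frame(
  bid = c(827566, 827566, 1902966, 1902966, 3211972, 3211972),
  ts = as.Date(c("1999-10-07", "2013-04-11", "2012-05-02",
                 "2013-03-25", "2012-08-14", "2013-08-02")),
  latitude = c(42.40944, 41.84740, 45.52607, 41.94083, 43.04786, 41.88258),
  longitude = c(-88.17822, -87.63126, -94.20649, -87.65852, -87.96618, -87.63760)
)

# Distance from the previous point of the same bid to the current point.
dat$d <- haversine_m(lag_by(dat$longitude, dat$bid),
                     lag_by(dat$latitude, dat$bid),
                     dat$longitude, dat$latitude)
dat$dsq <- dat$d^2

# Time difference in days from the previous point of the same bid.
days <- as.numeric(dat$ts)
dat$dts <- days - lag_by(days, dat$bid)

# Keep only rows that have a predecessor within their bid.
res <- dat[!is.na(dat$d), c("bid", "dts", "d", "dsq")]

# Assumption: mean of squared step distances as a simple MSD estimate.
msd <- mean(res$dsq)
```

Everything here works on whole columns at once, so there is no row-by-row loop. The same pattern also works inside mutate() with dplyr::lag() in place of lag_by.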
