Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

grouping - How do I group data in a list by date and average the associated data values in R?

I want to group the data below by date (on a daily basis) and then get the mean of each group.

The dataset created below is a 3-dimensional array where i = time (in days), j = latitude and k = longitude. The dataset is 4 years long (1461 days) and has the attribute 'Dates' to denote each day. I want to average the data in 'Data' so that I end up with one mean value for the 1st of January, one for the 2nd of January, and so on.

#First create the example dataset
tmintest=array(1:100, c(420,189,1461))

#create the list
Variable <- list(varName="rr")
Data = tmintest
xyCoords <- list(x = seq(-40.37,64.37,length.out=420), y = seq(25.37,72.37,length.out=189))
Dates <- list(start = seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by="days"), end=seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by="days"))
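#aperm() reverses the dimension order, so Data becomes 1461 x 189 x 420 (time x latitude x longitude)
All <- list(Variable = Variable,Data=aperm(Data), xyCoords=xyCoords,Dates=Dates)
#Make sure the dates are characters (as in the original dataset I'm working with)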
All$Dates$start=as.character(All$Dates$start)
All$Dates$end=as.character(All$Dates$end)

I have looked at using aggregate:

aggregate(All$Data,by=list(All$Dates), FUN = "mean")

but I got the error:

Error in aggregate.data.frame(as.data.frame(x), ...) : 
  arguments must have same length

I tried to use group_by:

group_by(All$Dates)

but was returned this error:

Error in UseMethod("group_by_") : 
  no applicable method for 'group_by_' applied to an object of class "list"

What functions can I use to group the data in this list by day and take the mean of each group in R?

EDIT: I need the resulting output to be of the size 365 x 189 x 420, where 1:365 are days of the year and 189 x 420 are the latitude/longitude.

So, I want to use every 1st of January in the All$Dates attribute to index/group the associated All$Data grids of size 189 x 420 (there will be four of them, as there are four years of data) and then get the mean of these four grids. In this example, the four 1st-of-January grids are averaged to produce a single grid of size 189 x 420. This is repeated for every day of the year to produce the final 365 x 189 x 420 dataset. Does that clarify what I am trying to do?



1 Reply


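This isn't fast, but I think it produces the desired output. Note that because 2012 is a leap year, the result has 366 slices rather than 365: 29 February gets its own slice, averaged over a single year.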

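library(lubridate)
# Build a "month-day" key for each time step, e.g. "1-1" for 1 January
d <- ymd(All$Dates$start)
date <- glue::glue("{month(d)}-{mday(d)}")
undate <- unique(date)
# Output: calendar day x latitude x longitude
out <- array(dim=c(length(undate), 189, 420))
for(i in seq_along(undate)){
    # Time steps that fall on this calendar day (one per year)
    w <- which(date == undate[i])
    # Average those time steps for every grid cell
    out[i,,] <- apply(All$Data[w,,, drop=FALSE], c(2,3), mean)
}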
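
If the loop above is too slow, one possible variation is sketched below using base R only. It is not the original answer's method: it swaps apply(..., mean) for colMeans(x, dims = 1), which averages over the first (time) dimension in compiled code, and it builds the day key with format(). The names dates, dat, key, keep, days and clim are made up for illustration, the grid is shrunk to 3 x 4 so the example runs quickly, and dropping 29 February is an assumption made only to reach the 365 days asked for in the question.

```r
# Small stand-in for All$Data: time x latitude x longitude (1461 x 3 x 4)
dates <- seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by = "days")
dat <- array(as.numeric(seq_len(1461 * 3 * 4)), c(length(dates), 3, 4))

# "MM-DD" key per time step; zero-padded, so sorting gives calendar order
key <- format(dates, "%m-%d")

# Assumption: drop 29 February so that exactly 365 calendar days remain
keep <- key != "02-29"
key <- key[keep]
dat <- dat[keep, , , drop = FALSE]

days <- sort(unique(key))

# colMeans(x, dims = 1) averages over the first dimension only,
# giving a latitude x longitude matrix for each calendar day
clim <- vapply(
  days,
  function(d) colMeans(dat[key == d, , , drop = FALSE], dims = 1),
  FUN.VALUE = matrix(0, 3, 4)
)

# vapply stacks results as latitude x longitude x day; move day to the front
clim <- aperm(clim, c(3, 1, 2))
dim(clim)
```

For the real data, dat would be All$Data, dates would be as.Date(All$Dates$start), and the FUN.VALUE template would be matrix(0, 189, 420).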
