r - How does one aggregate and summarize data quickly?

Question

Welcome To Ask or Share your Answers For Others

r - How does one aggregate and summarize data quickly?

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

r - How does one aggregate and summarize data quickly?

I have a dataset whose headers look like so:

PID Time Site Rep Count

I want sum the Count by Rep for each PID x Time x Site combo

on the resulting data.frame, I want to get the mean value of Count for PID x Time x Site combo.

Current function is as follows:

dummy <- function (data)
{
A<-aggregate(Count~PID+Time+Site+Rep,data=data,function(x){sum(na.omit(x))})
B<-aggregate(Count~PID+Time+Site,data=A,mean)
return (B)
}

This is painfully slow (original data.frame is 510000 20). Is there a way to speed this up with plyr?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:58:53+0000

You should look at the package data.table for faster aggregation operations on large data frames. For your problem, the solution would look like:

library(data.table)
data_t = data.table(data_tab)
ans = data_t[,list(A = sum(count), B = mean(count)), by = 'PID,Time,Site']

Categories

r - How does one aggregate and summarize data quickly?

r - How does one aggregate and summarize data quickly?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags