Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


mongoid - MongoDB Aggregation: Compute Running Totals from sum of previous rows

Sample Documents:

{
 _id: ObjectId('4f442120eb03305789000000'),
 time: ISODate("2013-10-10T20:55:36Z"),
 value:1
},
{
 _id: ObjectId('4f442120eb03305789000001'),
 time: ISODate("2013-10-10T18:43:16Z"),
 value:2
},
{
 _id: ObjectId('4f442120eb03305789000002'),
 time: ISODate("2013-10-11T17:12:06Z"),
 value:3
},
{
 _id: ObjectId('4f442120eb03305789000003'),
 time: ISODate("2013-10-11T10:15:38Z"),
 value:4
},
{
 _id: ObjectId('4f442120eb03305789000004'),
 time: ISODate("2013-10-12T16:15:38Z"),
 value:5
}

It's easy to get aggregated results that are grouped by date. What I want, though, is a query that also returns a running total of the aggregation, like:

{
 time: "2013-10-10",
 total: 3,
 runningTotal: 3
},
{
 time: "2013-10-11",
 total: 7,
 runningTotal: 10
},
{
 time: "2013-10-12",
 total: 5,
 runningTotal: 15
}

Is this possible with the MongoDB aggregation framework?


1 Reply


This does what you need. I have normalised the times in the data so that documents from the same day group together (you would need to do something similar). The idea is to $group and push the times and totals into separate arrays. Then $unwind the time array, which leaves a copy of the full totals array on each time document. You can then calculate the runningTotal (or something like a rolling average) from that array, since it contains the data for all the times. The 'index' generated by $unwind is the array index of the total corresponding to that time. It is important to $sort before pushing into the arrays (and therefore before $unwinding), since this ensures the arrays are in the correct order.
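As for the normalisation step: instead of rewriting the stored timestamps, one possibility is to group on a day string computed inside the pipeline. This is only a sketch, assuming `time` is stored as a BSON date; the names `groupByDay` and `sortByDay` are illustrative and not part of the original answer.

```javascript
// Hypothetical replacement for the first $group stage: the key is the
// calendar day (UTC) of `time`, so documents from the same day group
// together without changing the stored timestamps.
const groupByDay = {
    $group: {
        _id: { $dateToString: { format: '%Y-%m-%d', date: '$time' } },
        total: { $sum: '$value' }
    }
};

// Sorting on the string key works because %Y-%m-%d strings sort
// in chronological order.
const sortByDay = { $sort: { _id: 1 } };
```

With this key, `_id` is already a string, so a later $project would pass it through as `time` rather than applying $dateToString a second time.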

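// Requires MongoDB 3.2 or later ($unwind with includeArrayIndex, $slice, $arrayElemAt).
db.temp.aggregate(
    [
        // 1. Total per (normalised) time.
        {
            '$group': {
                '_id': '$time',
                'total': { '$sum': '$value' }
            }
        },
        // 2. Sort so the arrays built in the next stage are in time order.
        {
            '$sort': {
                 '_id': 1
            }
        },
        // 3. Collapse everything into one document with two parallel arrays.
        {
            '$group': {
                '_id': 0,
                'time': { '$push': '$_id' },
                'totals': { '$push': '$total' }
            }
        },
        // 4. One document per time; each keeps the full totals array and its index.
        {
            '$unwind': {
                'path' : '$time',
                'includeArrayIndex' : 'index'
            }
        },
        // 5. Pick this time's total and sum the first (index + 1) totals.
        {
            '$project': {
                '_id': 0,
                'time': { '$dateToString': { 'format': '%Y-%m-%d', 'date': '$time' }  },
                'total': { '$arrayElemAt': [ '$totals', '$index' ] },
                'runningTotal': { '$sum': { '$slice': [ '$totals', { '$add': [ '$index', 1 ] } ] } }
            }
        }
    ]
);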
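To see what the last three stages are doing, the same logic can be emulated in plain JavaScript, without a database. This is a sketch for illustration only: the `grouped` array stands in for the output of the first $group and $sort on the sample data, and `runningTotals` is a made-up helper name.

```javascript
// Assumed output of the first $group + $sort for the sample documents.
const grouped = [
    { _id: '2013-10-10', total: 3 },
    { _id: '2013-10-11', total: 7 },
    { _id: '2013-10-12', total: 5 }
];

function runningTotals(rows) {
    // Second $group: collect keys and totals into two parallel arrays.
    const time = rows.map(r => r._id);
    const totals = rows.map(r => r.total);

    // $unwind with includeArrayIndex: one output row per key, plus its index.
    return time.map((t, index) => ({
        time: t,
        // $arrayElemAt: the total that belongs to this key.
        total: totals[index],
        // $slice + $sum: add up the first (index + 1) totals.
        runningTotal: totals.slice(0, index + 1).reduce((a, b) => a + b, 0)
    }));
}

const result = runningTotals(grouped);
console.log(result);
// [ { time: '2013-10-10', total: 3, runningTotal: 3 },
//   { time: '2013-10-11', total: 7, runningTotal: 10 },
//   { time: '2013-10-12', total: 5, runningTotal: 15 } ]
```

Each row sums a prefix of the totals array, so the work grows roughly with the square of the number of groups. That is negligible for a few dozen groups but worth keeping in mind if the grouped result is large.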

I have used something similar on a collection with ~80,000 documents, aggregating to 63 results. I am not sure how well it will work on larger collections, but I have found that performing transformations (projections, array manipulations) on aggregated data does not seem to have a large performance cost once the data has been reduced to a manageable size.
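If the number of groups does become large, newer servers have a window stage that avoids building the arrays at all. The following is a sketch of that different technique, not the approach used above, and it assumes MongoDB 5.0 or later, where $setWindowFields is available; `runningTotalPipeline` is an illustrative name.

```javascript
// Sketch only: assumes MongoDB 5.0+ ($setWindowFields) and the same
// normalised `time` values as in the pipeline above.
const runningTotalPipeline = [
    // Total per time, as before.
    { $group: { _id: '$time', total: { $sum: '$value' } } },
    // Cumulative sum over all documents from the first up to the current
    // one, in _id order.
    {
        $setWindowFields: {
            sortBy: { _id: 1 },
            output: {
                runningTotal: {
                    $sum: '$total',
                    window: { documents: ['unbounded', 'current'] }
                }
            }
        }
    }
];

// In the shell this would be run as:
// db.temp.aggregate(runningTotalPipeline);
```

The output documents keep `_id` and `total` and gain a `runningTotal` field; a final $project can rename `_id` to `time` if the exact shape from the question is needed.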

