Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

mongodb - is map/reduce appropriate for finding the median and mode of a set of values for many records?

I have a set of objects in MongoDB that each have a set of values embedded in them, e.g.:

[1.22, 12.87, 1.24, 1.24, 9.87, 1.24, 87.65] // ... up to about 150 values

Is map/reduce the best solution for finding the median (the middle value) and the mode (the most common value) of the embedded arrays? I ask because the map and reduce functions both have to return structurally identical values. In my case I want to take in a set of values (the array) and return a pair of values (median, mode).

If not, what's the best way to approach this? I want it to run in a rake task, if that's relevant. It'd be an overnight data crunching kind of thing.


1 Reply

There's a key question here regarding the expected output. It's not 100% clear from your question which one you want.

Do you want (A):

{ _id: "document1", value: { mode: 1.0, median: 10.0 } }
{ _id: "document2", value: { mode: 5.0, median: 150.0 } }
... one for each document

... or do you want (B), the mode and median across the combination of all the arrays?

  • If the answer is (A), then Map/Reduce will work.
  • If the answer is (B), then Map/Reduce will probably not work.
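The answer does not spell out why (B) is harder. One plausible reason, offered here as an assumption rather than the answerer's stated reasoning, is that a median cannot be rebuilt from partial medians, while a reduce step has to merge partial results. A minimal Python sketch with made-up arrays illustrates the mismatch:

```python
from statistics import median

def merge_medians(partials):
    # Naive "reduce" step (hypothetical): combine per-array medians
    # by taking the median of those medians.
    return median(partials)

# Illustrative data only, not taken from the question.
arrays = [[1, 2, 3], [4, 5, 6, 7, 100]]

# "Map" step: one median per array.
partials = [median(a) for a in arrays]
naive = merge_medians(partials)

# Exact answer: the median of every value pooled together.
combined = [v for a in arrays for v in a]
exact = median(combined)

print(naive, exact)
```

The two results differ for this data, so a reduce that only sees per-array medians would give the wrong global median. Getting the exact value would mean carrying all the values (or a sorted structure of them) through the reduce.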

If you plan to do (A), please read the M/R documentation carefully and understand the limitations. While option (A) can be a Map/Reduce, it can also just be a big for loop with an upsert on the "summary" collection or even back into the original collection. This may be even more efficient.
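As a rough sketch of the "big for loop" alternative for option (A), the per-document work could look like the Python below. The field name `values`, the helper names, and the in-memory `docs` list are assumptions made for illustration; in a real job the documents would come from a MongoDB cursor.

```python
from statistics import median, mode

def summarize(values):
    """Return the median and mode of one document's embedded array."""
    # On Python 3.8+, statistics.mode returns the first value encountered
    # when several values tie for most common.
    return {"mode": mode(values), "median": median(values)}

def build_summaries(docs, field="values"):
    # Plain loop: one summary per source document, in the shape of (A).
    for doc in docs:
        yield {"_id": doc["_id"], "value": summarize(doc[field])}

# Stand-in for documents read from the collection (hypothetical shape).
docs = [
    {"_id": "document1",
     "values": [1.22, 12.87, 1.24, 1.24, 9.87, 1.24, 87.65]},
]

summaries = list(build_summaries(docs))
print(summaries)

# With pymongo, each summary could then be upserted along these lines
# (collection name "summary" is an assumption):
#   db.summary.update_one({"_id": s["_id"]},
#                         {"$set": {"value": s["value"]}},
#                         upsert=True)
```

Because each document is handled independently, nothing has to be merged across documents, which is why a simple loop is enough here.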

