Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
854 views
in Technique[技术] by (71.8m points)

mongodb - Using stored JavaScript functions in the Aggregation pipeline, MapReduce or runCommand

Is there a way to use a user-defined function saved as db.system.js.save(...) in pipeline or mapreduce?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Any function you save to system.js is available for usage by "JavaScript" processing statements such as the $where operator and mapReduce and can be referenced by the _id value is was asssigned.

db.system.js.save({ 
   "_id": "squareThis", 
   "value": function(a) { return a*a } 
})

And some data inserted to "sample" collection:

{ "_id" : ObjectId("55aafd2bacbed38e06f9eccf"), "a" : 1 }
{ "_id" : ObjectId("55aafea6acbed38e06f9ecd0"), "a" : 2 }
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }

Then:

db.sample.mapReduce(
    function() {
       emit(null, squareThis(this.a));
    },
    function(key,values) {
        return Array.sum(values);
    },
    { "out": { "inline": 1 } }
 );

Gives:

   "results" : [
            {
                    "_id" : null,
                    "value" : 14
            }
    ],

Or with $where:

db.sample.find(function() { return squareThis(this.a) == 9 })
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }

But in "neither" case can you use globals such as the database db reference or other functions. Both $where and mapReduce documentation contain information of the limits of what you can do here. So if you thought you were going to do something like "look up data in another collection", then you can forget it because it is "Not Allowed".

Every MongoDB command action is actually a call to a "runCommand" action "under the hood" anyway. But unless what that command is actually doing is "calling a JavaScript processing engine" then the usage becomes irrelevant. There are only a few commands anyway that do this, being mapReduce, group or eval, and of course the find operations with $where.


The aggregation framework does not use JavaScript in any way at all. You might be mistaking just as others have done a statement like this, which does not do what you think it does:

db.sample.aggregate([
    { "$match": {
        "a": { "$in": db.sample.distinct("a") }
    }}
])

So that is "not running inside" the aggregation pipeline, but rather the "result" of that .distinct() call is "evaluated" before the pipeline is sent to the server. Much as with an external variable is done anyway:

var items = [1,2,3];
db.sample.aggregate([
    { "$match": {
        "a": { "$in": items }
    }}
])

Both essentially send to the server in the same way:

db.sample.aggregate([
    { "$match": {
        "a": { "$in": [1,2,3] }
    }}
])

So it is "not possible" to "call" any JavaScript function in the aggregation pipeline, nor is there really any point is "passing in" results in general from something saved in system.js. The "code" needs to be "loaded to the client" and only a JavaScript engine can actually do anything with it.

With the aggregation framework, all of the "operators" available are actually natively coded functions as opposed to the "free form" JavaScript interpretation provided for mapReduce. So instead of writing "JavaScript", you use the operators themselves:

db.sample.aggregate([
    { "$group": {
        "_id": null,
        "sqared": { "$sum": {
           "$multiply": [ "$a", "$a" ]
        }}
    }}
])

{ "_id" : null, "sqared" : 14 }

So there are limitations on what you can do with functions saved in system.js, and the chances are that what you want to do is either:

  • Not allowed, such as accessing data from another collection
  • Not really required as the logic is generally self contained anyway
  • Or probably better implemented in client logic or other different form anyway

Just about the only practical use I can really think of is that you have a number of "mapReduce" operations that cannot be done any other way and you have various "shared" functions that you would rather just store on the server than maintain within every mapReduce function call.

But then again, the 90% reason for mapReduce over the aggregation framework is usually that the "document structure" of the collections has been poorly chosen and the JavaScript functionality is "required" to traverse the document for search and analysis.

So you can use it under the allowed constraints, but in most cases you probably should not be using this at all, but fixing the other issues that caused you to believe you needed this feature in the first place.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...