Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

mongodb - Mongo Sort by Count of Matches in Array

Lets say my test data is

db.multiArr.insert({"ID" : "fruit1","Keys" : ["apple", "orange", "banana"]})
db.multiArr.insert({"ID" : "fruit2","Keys" : ["apple", "carrot", "banana"]})

to get individual fruit like carrot i do

db.multiArr.find({'Keys':{$in:['carrot']}})

when i do an or query for orange and banana, i see both the records fruit1 and then fruit2

db.multiArr.find({ $or: [{'Keys':{$in:['carrot']}}, {'Keys':{$in:['banana']}}]})

Result of the output should be fruit2 and then fruit1, because fruit2 has both carrot and banana

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

To actually answer this first, you need to "calculate" the number of matches to the given condition in order to "sort" the results to return with the preference to the most matches on top.

For this you need the aggregation framework, which is what you use for "calculation" and "manipulation" of data in MongoDB:

db.multiArr.aggregate([
  { "$match": { "Keys": { "$in": [ "carrot", "banana" ] } } },
  { "$project": {
    "ID": 1,
    "Keys": 1,
    "order": {
      "$size": {
        "$setIntersection": [ ["carrot", "banana"], "$Keys" ]
      }
    }
  }},
  { "$sort": { "order": -1 } }
])

On an MongoDB older than version 3, then you can do the longer form:

db.multiArr.aggregate([
  { "$match": { "Keys": { "$in": [ "carrot", "banana" ] } } },
  { "$unwind": "$Keys" },
  { "$group": {
    "_id": "$_id",
    "ID": { "$first": "$ID" },
    "Keys": { "$push": "$Keys" },
    "order": {
      "$sum": {
        { "$cond": [
          { "$or": [
           { "$eq": [ "$Keys", "carrot" ] },
           { "$eq": [ "$Keys", "banana" ] }
         ]},
         1,
         0
        ]}
      }
    }
  }},
  { "$sort": { "order": -1 } }
])

In either case the function here is to first match the possible documents to the conditions by providing a "list" of arguments with $in. Once the results are obtained you want to "count" the number of matching elements in the array to the "list" of possible values provided.

In the modern form the $setIntersection operator compares the two "lists" returning a new array that only contains the "unique" matching members. Since we want to know how many matches that was, we simply return the $size of that list.

In older versions, you pull apart the document array with $unwind in order to perform operations on it since older versions lacked the newer operators that worked with arrays without alteration. The process then looks at each value individually and if either expression in $or matches the possible values then the $cond ternary returns a value of 1 to the $sum accumulator, otherwise 0. The net result is the same "count of matches" as shown for the modern version.

The final thing is simply to $sort the results based on the "count of matches" that was returned so the most matches is on "top". This is is "descending order" and therefore you supply the -1 to indicate that.


Addendum concerning $in and arrays

You are misunderstanding a couple of things about MongoDB queries for starters. The $in operator is actually intended for a "list" of arguments like this:

{ "Keys": { "$in": [ "carrot", "banana" ] } }

Which is essentially the shorthand way of saying "Match either 'carrot' or 'banana' in the property 'Keys'". And could even be written in long form like this:

{ "$or": [{ "Keys": "carrot" }, { "Keys": "banana" }] }

Which really should lead you to if it were a "singular" match condition, then you simply supply the value to match to the property:

{ "Keys": "carrot" }

So that should cover the misconception that you use $in to match a property that is an array within a document. Rather the "reverse" case is the intended usage where instead you supply a "list of arguments" to match a given property, be that property an array or just a single value.

The MongoDB query engine makes no distinction between a single value or an array of values in an equality or similar operation.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...