Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
691 views
in Technique[技术] by (71.8m points)

search - Limit ElasticSearch aggregation to top n query results

I have a set of 2.8 million docs with sets of tags that I'm querying with ElasticSearch, but many of these docs can be grouped together by one ID. I want to query my data using the tags, and then aggregate them by the ID that repeats. Often my search results have tens of thousands of documents, but I only want to aggregate the top 100 results of the search. How can I constrain an aggregation to only the top 100 results from a query?


1 Reply

0 votes
by (71.8m points)

Sampler Aggregation:

A filtering aggregation used to limit any sub aggregations' processing to a sample of the top-scoring documents.

{
  "aggs": {
    "bestDocs": {
      "sampler": {
        "shard_size": 100
      },
      "aggs": {
        "bestBuckets": {
          "terms": {
            "field": "id"
          }
        }
      }
    }
  }
}

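This query limits the sub-aggregation to the 100 top-scoring documents on each shard and then buckets them by ID. Because shard_size applies per shard, the sample is exactly the top 100 overall only when the index has a single shard; with several shards, up to 100 documents per shard are sampled.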
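As a rough sketch of how such a request might be assembled and its result read back from Python: the field name "tags", the match query, and both helper function names are assumptions made for illustration (the original only shows the "aggs" part), and the response fragment is hand-written to show the shape, not real output.

```python
import json


def build_sampled_terms_request(tags, sample_size=100, id_field="id"):
    """Build a search body that buckets only the top-scoring sample by ID."""
    return {
        "size": 0,  # return aggregation results only, no hits
        # "tags" is an assumed field name; a scoring query is used so that
        # "top-scoring" is meaningful for the sampler.
        "query": {"match": {"tags": " ".join(tags)}},
        "aggs": {
            "bestDocs": {
                "sampler": {"shard_size": sample_size},  # applied per shard
                "aggs": {"bestBuckets": {"terms": {"field": id_field}}},
            }
        },
    }


def extract_id_counts(response):
    """Map each ID bucket key to its document count within the sample."""
    buckets = response["aggregations"]["bestDocs"]["bestBuckets"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


body = build_sampled_terms_request(["python", "search"])
print(json.dumps(body, indent=2))

# Hand-written response fragment, used only to illustrate the shape.
sample_response = {
    "aggregations": {
        "bestDocs": {
            "doc_count": 100,
            "bestBuckets": {
                "buckets": [
                    {"key": "a1", "doc_count": 60},
                    {"key": "b2", "doc_count": 40},
                ]
            },
        }
    }
}
print(extract_id_counts(sample_response))  # {'a1': 60, 'b2': 40}
```

The body can be sent with whatever HTTP or Elasticsearch client you already use; setting "size" to 0 is optional and simply skips returning the hits themselves.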

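Optionally, you can use the field (or script) setting together with max_docs_per_value to cap how many documents sharing a common value are collected on any one shard. In newer Elasticsearch releases, as far as I know, these diversity settings belong to the separate diversified_sampler aggregation rather than to sampler itself, so check the documentation for your version.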
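A minimal sketch of the diversity settings, again in Python. It assumes that your Elasticsearch version accepts "field" and "max_docs_per_value" on the aggregation type you pass in: to my knowledge that is "sampler" in older releases and "diversified_sampler" in newer ones. The helper name and the "author" field are hypothetical.

```python
def build_diverse_sample_agg(diversity_field, id_field="id", sample_size=100,
                             max_docs_per_value=1,
                             agg_type="diversified_sampler"):
    """Build an "aggs" section whose sample limits documents per shared value.

    agg_type is an assumption to verify against your Elasticsearch version.
    """
    return {
        "bestDocs": {
            agg_type: {
                "shard_size": sample_size,  # sample size, per shard
                "field": diversity_field,  # value used to de-duplicate
                "max_docs_per_value": max_docs_per_value,  # cap per value, per shard
            },
            "aggs": {"bestBuckets": {"terms": {"field": id_field}}},
        }
    }


# "author" is a hypothetical field, used only for illustration.
diverse_aggs = build_diverse_sample_agg("author", max_docs_per_value=3)
print(diverse_aggs)
```

With max_docs_per_value set to 3, at most three documents sharing the same "author" value would enter the sample on each shard before the terms aggregation buckets them by ID.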

