Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


solrj - Too many boolean clauses exception in solr

I am facing this problem when using the OR logical operator to build a query. I don't want to increase the maxBooleanClauses value. Is there any other option? My OR list can grow to around 2 million terms. I would rather that, when the maxBooleanClauses limit is exceeded, Solr split the query into subqueries and then merge their results. Is something like this possible? Alternatively, can anyone suggest a better technique?

I want to plot a graph where the user provides a date range, e.g. 2013-03-01 to 2013-06-01, and the graph shows all the visitors who used the app in that range. For this I want to build a query that is an OR of all the unique IDs, e.g.:

      uniqueId:(1001 OR 1003 OR 1009 OR ........ OR 102467)

Help is appreciated.



1 Reply


Solr imposes the maxBooleanClauses limit precisely because this kind of query is outside its sweet spot. Ultimately, if you need queries with millions of clauses, you will need to do your own distribution and aggregation outside of Solr.
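
To make "distribution and aggregation outside of Solr" concrete, here is a minimal client-side sketch in Java. It only splits a long ID list into batches that stay under the clause limit and builds one OR query string per batch; actually running each string (for example through SolrJ) and collecting the per-batch results is left to the caller. The class name ClauseBatcher, its method names, and the batch size of 1000 are assumptions for illustration, not part of Solr or SolrJ. The real limit is whatever maxBooleanClauses is set to in your solrconfig.xml.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of Solr or SolrJ.
final class ClauseBatcher {

    // Assumed batch size; keep it at or below your configured maxBooleanClauses.
    static final int DEFAULT_BATCH_SIZE = 1000;

    // Split a list into consecutive batches of at most batchSize items.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Build "field:(a OR b OR c)" for one batch of numeric IDs.
    static String toOrQuery(String field, List<Long> ids) {
        StringBuilder sb = new StringBuilder(field).append(":(");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                sb.append(" OR ");
            }
            sb.append(ids.get(i));
        }
        return sb.append(')').toString();
    }

    // One query string per batch; an empty ID list yields no queries.
    static List<String> buildQueries(String field, List<Long> ids, int batchSize) {
        List<String> queries = new ArrayList<>();
        for (List<Long> batch : partition(ids, batchSize)) {
            queries.add(toOrQuery(field, batch));
        }
        return queries;
    }

    // Aggregate per-batch result IDs into one de-duplicated, ordered set.
    static Set<String> merge(List<List<String>> perBatchResults) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> batch : perBatchResults) {
            merged.addAll(batch);
        }
        return merged;
    }
}
```

With 2 million IDs and a batch size of 1000 this produces 2,000 separate requests, so it works around the exception rather than making the workload a good fit for Solr.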

I am going to go out on a limb and guess that these clauses are graph related, which is the most common place I see these kinds of queries. In that case, it may be possible for you to stay somewhat inside Solr's strengths here.

Sometimes it makes sense to invert the logic of your filter, and instead of passing in a large set of values to filter by, index those values onto the documents you are searching so you can pass a single value later.

For example, say you have an index of people. And say you want to search for people who are friends with some specific person. You could generate the list of IDs of all their friends in order to filter your search. But then you'll have a similar problem to what you're seeing here: lots and lots of OR clauses.

Alternatively, you can index each person's list of friends into Solr. Now you'll have a field with thousands of values in it, but your query filter will have only one value: the ID of the person whose network you are filtering the search by.
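
The inversion can be sketched in plain Java without a running Solr instance. Each person document carries a multi-valued field listing their friends' IDs, and the filter shrinks to a single clause. The field name friend_ids, the class InvertedFilterSketch, and the in-memory matchingIds method (a stand-in for what Solr would do with that filter) are all assumptions for illustration. IDs are assumed to be plain alphanumeric strings that need no query escaping.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: not part of Solr or SolrJ.
final class InvertedFilterSketch {

    // Assumed name of a multi-valued field holding each person's friend IDs.
    static final String FRIENDS_FIELD = "friend_ids";

    // Build the document you would index: the friend list lives on the document.
    static Map<String, Object> personDoc(String id, String name, Collection<String> friendIds) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("name", name);
        doc.put(FRIENDS_FIELD, new ArrayList<>(friendIds));
        return doc;
    }

    // The whole filter is now one clause, however large the network is.
    static String friendsOfFilter(String personId) {
        return FRIENDS_FIELD + ":" + personId;
    }

    // In-memory stand-in for the filter: IDs of documents whose friend list
    // contains personId.
    static List<String> matchingIds(List<Map<String, Object>> docs, String personId) {
        List<String> ids = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Object friends = doc.get(FRIENDS_FIELD);
            if (friends instanceof Collection && ((Collection<?>) friends).contains(personId)) {
                ids.add((String) doc.get("id"));
            }
        }
        return ids;
    }
}
```

Here friendsOfFilter("2") yields the single clause friend_ids:2, replacing an OR over every friend of person 2.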

This plays more to Solr's strengths as far as the mechanics of searching are concerned. However, there is a cost: you'll need to manage the denormalization yourself, and you will probably either make a lot of updates to your documents or accept some latency before changes to your graph show up in the index.

If that proves too onerous, you may need to consider a different technology better optimized for graph traversal.

