Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
319 views
in Technique[技术] by (71.8m points)

lucene.net - Lucene query fails with mixed MUST/MUST_NOT

Given a document with this text, indexed in a field named Content:

The dish ran away with the spoon.

The following query fails to match that document:

+Content:dish +(-Content:xyz)   <-- no results!

I want the query to be treated as must include "dish", must not include "xyz". It's the "must not" part that is failing.

I know the +- combination looks funny but syntactically it should be correct, especially considering that the following variations all work:

+Content:dish +(-Content:xyz +Content:spoon)   <-- this works
+Content:dish -Content:xyz                     <-- this works

So why doesn't +(-Content:xyz) work? Is that by design, or a bug, or am I just missing something? I'm using Lucene.Net but I assume regular Lucene behaves the same.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Lucene doesn't start with a full view of everything, like a SQL database. Lucene starts with no documents matched, and finds things based on the clauses searched on. This is why:

-Content:xyz

On it's own doesn't really work. It knows not to bring in content:xyz, but hasn't been given any documents to match. The same is true of your query, because it's placed in a subquery.

-Content:xyz is evaluated first, which gets no docs on it's own. So then you have, effectively

+Content:dish +(no documents)

It's useful to think of - as an AND NOT rather than simply a NOT (though don't take that to imply the +/- and AND/OR/NOT syntax necessarily map to each other directly).

If you want to be able to execute a lonely negative query like that, you need to bring in all documents first. The MatchAllDocsQuery is the best way to accomplish that, something like:

BooleanQuery query = new BooleanQuery();
query.add(new BooleanClause(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD));
query.add(new BooleanClause(new TermQuery(new Term("Content","xyz")), BooleanClause.Occur.MUST_NOT));

Would be the equivalent of a SQL style query with only a negation for a WHERE clause.

Of course, this isn't really necessary in the case you've listed since:

+Content:dish -Content:xyz

Is perfectly adequate.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...