ZF-7738: Adding multi-token awareness to rewrite() in Query_Preprozessing_... and Query_Phrase (patch)

Description

class Zend_Search_Lucene_Search_Query_Preprocessing_Term had a couple of @todo lines (and thrown exceptions) regarding handling of term text when it was parsed as multiple tokens, either due to a stemming or synonym analyzer, or due to space interpreted characters in the term (like underscore)

The same with Zend_Search_Lucene_Search_Query_Preprocessing_Fuzzy

This patch adds awareness to both types of multi-tokens, and rewrites such queries (even when including wildcards) as Zend_Search_Lucene_Search_Query_Phrase

(instead of Query_MultiToken as before)

thereby handling $token->getPositionIncrement() to find out wether a token is to be handled as an alternative (0) or part of a "phrase" (>0) and setting the correct position within the phrase.

This patch works standalone for searching, however in order to have highlighting working correctly, it needs the Temp-Index highlighting patch ( ZF-7736 ).

regards

Corvus Corax

PS: this extends the patches attached to ZF-7688, supposedly fixing that bug

Comments

@todo

the test in Zend_Search_Lucene_Search_Query_Preprocessing_Term wether wildcard term alternatives have been found for all positions in the to be created Phrase query doesn't take on-purpose gaps into account and assumes there need to be tokens present for ALL positions.

Basically that means, the case "$token->getPositionIncrement() >= 2" is not taken care of and may need additional work to give usefull results.

It may not be very relevant in practice, but, as they say, never say "never" ;)

this issue should be much more easy to fix in Zend_Search_Lucene_Search_Query_Preprocessing_Fuzzy.

but I have no time doing that now, I am off for a holiday trip for the next two weeks.

Bulk change of all issues last updated before 1st January 2010 as "Won't Fix".

Feel free to re-open and provide a patch if you want to fix this issue.

Bug has been closed by default in "Bulk change". However patch is already attached to bug, what has been missing is a maintainer

I tried to apply the patch, but it failed, so I applied it manually. However it causes the unit tests to fail.

Are you able to fix the patch?