ZF-3743: ShortWords token filter not working with UTF-8 charset


When the ShortWords token filter is used with the UTF-8 Analyser, it fails to skip short tokens that contain multibyte UTF-8 characters.

For example, with a length of 2, the token "à" (common in French) is not skipped because strlen counts bytes, not characters, and so returns 2 for it.
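
The difference can be reproduced outside the filter. This is a minimal sketch, assuming the iconv extension is loaded and that the token is UTF-8 encoded; the bytes are written as escapes so the result does not depend on the encoding of the script file.

```php
<?php
// "à" encoded as UTF-8 is the two bytes 0xC3 0xA0, but a single character.
$token = "\xC3\xA0";

// strlen() counts bytes.
var_dump(strlen($token));                // int(2)

// iconv_strlen() counts characters in the given charset.
var_dump(iconv_strlen($token, 'UTF-8')); // int(1)
```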

The solution would be to make a ShortWordsUtf8 that uses iconv_strlen instead of strlen.
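
A rough sketch of what such a filter's length check could look like follows. It is not the attached file: the class name ShortWordsUtf8Sketch is hypothetical, it works on plain strings rather than token objects so that it runs stand-alone, and it assumes that tokens shorter than the configured length are the ones to skip (signalled here by returning null). A real version would presumably extend Zend_Search_Lucene_Analysis_TokenFilter.

```php
<?php
/**
 * Hypothetical stand-alone sketch of a UTF-8 aware short-words check.
 * Not the attached ShortWordsUtf8 class; it takes plain strings so it
 * can run without Zend Framework.
 */
class ShortWordsUtf8Sketch
{
    /** @var int Minimum length, in characters, for a token to be kept. */
    private $length;

    public function __construct($shortWordsLength = 2)
    {
        $this->length = $shortWordsLength;
    }

    /**
     * Returns null if the term should be skipped (assumption: terms
     * shorter than the limit are skipped), otherwise the term unchanged.
     */
    public function normalize($termText)
    {
        // Count characters, not bytes, so "à" has length 1.
        if (iconv_strlen($termText, 'UTF-8') < $this->length) {
            return null;
        }
        return $termText;
    }
}

$filter = new ShortWordsUtf8Sketch(2);
var_dump($filter->normalize("\xC3\xA0")); // NULL: "à" is one character
var_dump($filter->normalize("de"));       // string(2) "de"
```

Passing the charset explicitly keeps the check independent of the iconv.internal_encoding setting.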


Working ShortWordsUtf8 using iconv_strlen instead of strlen (based on ShortWords.php from release-1.5.2)

