Lucene search customization

Posted by Community Admin on 04-Aug-2018 20:05

Lucene search customization

All Replies

Posted by Community Admin on 02-Jun-2014 00:00

I have implemented a partial match solution from this older blog post:

http://www.sitefinity.com/blogs/atanas-valchev's-blog/2013/11/01/implementing-sitefinity-partial-match-search

The problem is that I can no longer get results for hyphenated words, because of how the wildcard search works.

Examples:

CompanyName - Searching for "company" will return results for CompanyName or Company Name.  The query looks like this (searchQuery=company*)

Company-Name - Searching for "company-name" returns no results. The query looks like this (searchQuery=company-name*)

I experimented and added a space between the search term and the wildcard. Now I can find the hyphenated spelling, but partial matches no longer work in some scenarios.

CompanyName - Searching for "company" yields no results.  (searchQuery=company  *)

Company-Name - Searching for "company-name" returns results. (searchQuery=company-name  *)

I know this is due to how Lucene handles wildcards, but I am trying to find out whether there is a way to combine multiple search parameters in Sitefinity's usage of the Lucene engine to cover both of these scenarios.

I am trying to find the closest way to simulate doing a SQL LIKE search (%Name%).  

 

Note: I also tried combining the two with OR, as in the following, but it only returns the hyphenated match and no longer finds the partial match.

var newQuery = string.Format("{0}* OR {0} *", searchQuery);
nameValues.Set("searchQuery", newQuery);

I have also tried fuzzy searching with "~", but I have not been able to come up with one query that hits all my requirements.

Posted by Community Admin on 03-Jun-2014 00:00

I may have a solution to my problem. It looks like I was overthinking it: simply searching both with and without the wildcard works.

var newQuery = string.Format("{0} OR {0}*", searchQuery);

I still need to save the user's original input and use it to override what shows up in the textbox, so users are not confused by seeing something like "CompanyName OR CompanyName" (the wildcard is being masked by a space, though I can comment that out). So far this looks promising.
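For anyone who wants to reuse this, here is a minimal sketch of the "exact term OR prefix wildcard" approach as a small helper. The class and method names (PartialMatchQuery, Build) are hypothetical and not part of Sitefinity or Lucene; the sketch only builds the query string and assumes the result is then passed on the same way as above (nameValues.Set("searchQuery", newQuery)).

```csharp
using System;

// Hypothetical helper, not a Sitefinity or Lucene API.
public static class PartialMatchQuery
{
    // Returns "<term> OR <term>*" so that both the exact (analyzed) term
    // and the prefix-wildcard form are searched.
    public static string Build(string userInput)
    {
        // Nothing to search for: return an empty query.
        if (string.IsNullOrWhiteSpace(userInput))
        {
            return string.Empty;
        }

        // Trim so no stray space ends up between the term and the wildcard.
        string term = userInput.Trim();
        return string.Format("{0} OR {0}*", term);
    }
}
```

Keeping the original searchQuery value in its own variable and only sending the result of PartialMatchQuery.Build(searchQuery) to the search engine would let the textbox keep showing what the user actually typed. Note that this sketch does not escape Lucene special characters, and for multi-word input the wildcard would presumably apply only to the last word.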

I would have expected a search on "Company-Name*" to work, but the more I read about Lucene, the more I wonder whether it has something to do with stemming.

 

This thread is closed