Solr wildcard query with whitespace

#1
I have a wildcard query that looks something like:

q=location:los a*

I'd like it to match "los angeles" and "los altos". A query like:

q=los*

works just fine, but as soon as I add whitespace I get no results. How can I use whitespace in my wildcard queries?
Reply

#2
Without seeing your config, I would say use a KeywordTokenizerFactory as you probably tokenize on whitespace now.
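
To illustrate why the tokenizer matters, here is a rough Python simulation, not Solr code. The helper names are made up for this sketch, and it assumes the field is also lowercased at index time. It shows that a prefix containing a space can only match when the whole field value is kept as a single term:

```python
def whitespace_tokens(value):
    # Rough stand-in for a whitespace tokenizer: one term per word.
    return value.lower().split()


def keyword_tokens(value):
    # Rough stand-in for KeywordTokenizerFactory plus lowercasing:
    # the whole field value is kept as a single term.
    return [value.lower()]


def prefix_matches(prefix, terms):
    # A trailing-wildcard query is compared against individual indexed terms.
    return any(term.startswith(prefix) for term in terms)


cities = ["Los Angeles", "Los Altos", "Las Vegas"]
prefix = "los a"

by_whitespace = [c for c in cities if prefix_matches(prefix, whitespace_tokens(c))]
by_keyword = [c for c in cities if prefix_matches(prefix, keyword_tokens(c))]

print(by_whitespace)  # no single word starts with "los a"
print(by_keyword)     # whole-value terms can start with "los a"
```

With word-level terms nothing matches, because no individual word contains a space.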
Reply

#3
Assuming you have a whitespace tokenizer, the query:
q=location:los a*
means that you are searching for documents containing the word "los" and a word that starts with "a".

As far as I know, Solr cannot determine whether one word (or term) appears before another.
Reply

#4
Might I suggest the Solr prefix query plugin, if you are only using wildcards on the suffix as we were:

[To see links please register here]


example usage

[To see links please register here]


It would match "Bob Smith" or "Bob Smit", and the query is not converted into a check of ("Bob" OR "Smi*"), as would happen with the first solution you might consider, something along the lines of `q=name:Bob%20Smi*`.

Hopefully this is of some help to you or someone else looking for a simple solution because I was banging my head against a wall for hours before I found this!
Reply

#5
I've recently come across this problem myself, and it seems that all you need to do is escape the space in your query. Your original query would be interpreted by Solr as something like this:

location:los id:a*
(assuming "id" is your default search field)

However, if you were to write your query as:

location:los\ a*
Then it would end up being parsed as:

location:los a*
And the above should yield the results that you desire (assuming your data is properly indexed).

**Tip:** Figuring all this out is simple. Just add `&debugQuery=on` to the end of the url you use when submitting your query to see how it was parsed by Solr.
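
If you build the query in code, the escaping can be done programmatically. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and it only builds the query string (you would append it to your own select URL):

```python
import re
from urllib.parse import urlencode, parse_qs


def escape_whitespace(term):
    # Put a backslash before each whitespace character so the
    # query parser keeps the term together.
    return re.sub(r"(\s)", r"\\\1", term)


def build_query_string(field, prefix):
    # debugQuery=on asks Solr to report how the query was parsed.
    params = {"q": f"{field}:{escape_whitespace(prefix)}*", "debugQuery": "on"}
    return urlencode(params)


query_string = build_query_string("location", "los a")
print(escape_whitespace("los a"))
print(query_string)
# Decoding the query string gives back the escaped query.
print(parse_qs(query_string)["q"][0])
```

URL-encoding is left to `urlencode`, so the backslash and space reach Solr intact.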
Reply

#6
I think you should use a config like this:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)" replacement="" replace="all" />
  </analyzer>
</fieldType>
You also have to remove the whitespace from your search keyword before querying.
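
A minimal sketch of that client-side step, assuming you want a prefix search and that the keyword should get the same treatment as the indexed value (lowercased, whitespace removed). The function name is made up for illustration:

```python
import re


def to_prefix_term(user_input):
    # Mirror the index-time chain: lowercase, then drop all whitespace.
    collapsed = re.sub(r"\s+", "", user_input.lower())
    # Append the wildcard so the collapsed keyword is used as a prefix.
    return collapsed + "*"


print(to_prefix_term("Los A"))
```

With this config "Los Angeles" would be indexed as the single term "losangeles", so the collapsed prefix can match it.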
Reply

#7
This worked for me:

<fieldtype name="text_like" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="1000"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldtype>

with the query `field:*some\ phrase*` (in a Java string literal the backslash must itself be escaped, as `\\`).
Reply

#8
A solution to your problem using the Complex Phrase Query Parser:

q={!complexphrase inOrder=true}location:"los a*"

To learn more about the Complex Phrase Query Parser, check out this link:

[To see links please register here]
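
Because this query contains braces, quotes and a space, it needs URL-encoding when sent over HTTP. A small sketch with the Python standard library; the helper name is hypothetical and only the `q` parameter is built:

```python
from urllib.parse import urlencode, parse_qs


def complexphrase_params(field, phrase, in_order=True):
    # Local params select the complexphrase parser for this query only.
    local_params = "{!complexphrase inOrder=%s}" % ("true" if in_order else "false")
    return {"q": f'{local_params}{field}:"{phrase}"'}


params = complexphrase_params("location", "los a*")
encoded = urlencode(params)

print(params["q"])
print(encoded)
```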

Reply

#9
I had the same problem in my project. Whenever I searched for a word together with whitespace, I did not get any results. So I replaced the whitespace with a hyphen "-" both while indexing and while querying. Below is the schema.xml snippet I used to do so:


<fieldType name="text_ci" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="250"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([/\s+])" replacement="-" replace="all" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.EdgeNGramTokenizerFactory" minGramSize="2" maxGramSize="250"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([/\s+])" replacement="-" replace="all" />
  </analyzer>
</fieldType>
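
To see what the replacement pattern does to a value, here is a rough Python equivalent of the lowercase, trim and pattern-replace steps only. It does not model the n-gram tokenizers, and the function name is made up for this sketch:

```python
import re

# Same character class as in the schema: a slash, any whitespace, or a plus sign.
SEPARATORS = re.compile(r"[/\s+]")


def hyphenate(value):
    # Lowercase and trim, then replace each separator character with a hyphen.
    return SEPARATORS.sub("-", value.strip().lower())


print(hyphenate("Los Angeles"))
print(hyphenate("a/b c"))
```

Note that each matched character is replaced individually, so two consecutive spaces become two hyphens.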
Reply

#10
I used this (escaping the space with a backslash):

q=location:los\ a*

instead of

q=location:los a*
Reply


