Manual Reference Pages  -  ELASTICSEARCH::QUERYPARSER (3)

NAME

ElasticSearch::QueryParser - Check or filter query strings

DESCRIPTION

If you pass an illegal query string to ElasticSearch, the request will fail. When using a query string from an external source, eg the keywords field from a web search form, it is important to filter it first to avoid these failures.

You may also want to allow or disallow certain query string features, eg the ability to search on a particular field.

The ElasticSearch::QueryParser takes care of this for you.

See <http://lucene.apache.org/java/3_0_3/queryparsersyntax.html> for more information about the Lucene Query String syntax, and <http://www.elasticsearch.org/guide/reference/query-dsl/query-string-query.html#Syntax_Extension> for custom ElasticSearch extensions to the query string syntax.

SYNOPSIS



    use ElasticSearch;
    my $es = ElasticSearch->new(servers => '127.0.0.1:9200');
    my $qp = $es->query_parser(%opts);

    my $filtered_query_string = $qp->filter($unchecked_query_string);

    my $results = $es->search( query=> {
                      query_string=>{ query => $filtered_query_string }
                  });



For example:



    my $qs = 'foo NOT AND -bar - baz * foo* secret_field:SIKRIT "quote';

    print $qp->filter($qs);
    # foo AND -bar baz foo* "quote"



METHODS

new()



    my $qp = ElasticSearch::QueryParser->new(%opts);
    my $qp = $es->query_parser(%opts);



Creates a new ElasticSearch::QueryParser object, and sets the passed in options (see OPTIONS).

filter()



    $filtered_query_string = $qp->filter($unchecked_query_string, %opts);



Checks a passed in query string and returns a filtered version which is suitable to pass to ElasticSearch.

Note: filter() can still return an empty string, which is not considered a valid query string, so you should still check for that before passing to ElasticSearch.
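
A minimal sketch of that guard, in plain Perl. The helper name build_query and the fall-back to a match_all query are assumptions made for illustration and are not part of ElasticSearch::QueryParser; depending on your application you may prefer to return no results at all when nothing is left after filtering.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an already-filtered query string into a
# query hashref, falling back to match_all when nothing is left.
sub build_query {
    my ($filtered) = @_;

    # filter() may hand back an empty (or whitespace-only) string
    return { match_all => {} }
        unless defined $filtered && $filtered =~ /\S/;

    return { query_string => { query => $filtered } };
}
```

The result could then be passed as the query argument to $es->search(), in place of the hand-built hashref shown in the SYNOPSIS.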

If any %opts are passed in to filter(), these are added to the default %opts as set by new(), and apply only for the current run.

filter() does not promise to parse the query string in exactly the same way as Lucene, only to clean it up so that it won’t throw an error when passed to ElasticSearch.

check()



    $filtered_query_string = $qp->check($unchecked_query_string, %opts);



Checks a passed in query string and throws an error if it is not valid. This is useful for debugging your own query strings.

If any %opts are passed in to check(), these are added to the default %opts as set by new(), and apply only for the current run.
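
Because check() throws, you may want to wrap it in eval when you need to handle the failure yourself. A minimal sketch follows; the helper name validate_query is hypothetical, and the code assumes only that $qp has a check() method that dies on invalid input, as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (1, undef) if check() accepts the query
# string, or (0, $error) if check() throws.
sub validate_query {
    my ($qp, $qs, %opts) = @_;

    # The trailing 1 makes eval return true only when check() did not die
    my $ok = eval { $qp->check($qs, %opts); 1 };

    return $ok ? (1, undef) : (0, $@ || 'unknown error');
}
```

Called as my ($ok, $err) = validate_query($qp, $qs), this lets you log $err or show it to whoever wrote the query string instead of letting the exception propagate.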

OPTIONS

You can set various options to control how your query strings are filtered.

The defaults (if no options are passed in) are:



    escape_reserved => 0
    fields          => 0
    boost           => 1
    allow_bool      => 1
    allow_boost     => 1
    allow_fuzzy     => 1
    allow_slop      => 1
    allow_ranges    => 0
    wildcard_prefix => 1



Any options passed in to new() are merged with these defaults. These options apply for the life of the QueryParser instance.

Any options passed in to filter() or check() are merged with the options set in new() and apply only for the current run.

For instance:



    $qp = ElasticSearch::QueryParser->new(allow_fuzzy => 0);

    $qs = "foo~0.5 bar^2 foo:baz";

    print $qp->filter($qs, allow_fuzzy => 1, allow_boost => 0);
    # foo~0.5 bar baz

    print $qp->filter($qs, fields => 1 );
    # foo bar^2 foo:baz
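
The merge order can be pictured as three hashes flattened into one list, where later keys override earlier ones. This is only a sketch of the precedence described above, not the module's internals; the helper name effective_opts is hypothetical.

```perl
use strict;
use warnings;

# Defaults as listed above
my %defaults = (
    escape_reserved => 0,
    fields          => 0,
    boost           => 1,
    allow_bool      => 1,
    allow_boost     => 1,
    allow_fuzzy     => 1,
    allow_slop      => 1,
    allow_ranges    => 0,
    wildcard_prefix => 1,
);

# Hypothetical helper: when a list is assigned to a hash, later pairs
# win, so per-call options override new() options, which override the
# defaults.
sub effective_opts {
    my ($new_opts, $call_opts) = @_;
    return ( %defaults, %{ $new_opts || {} }, %{ $call_opts || {} } );
}

my %opts = effective_opts(
    { allow_fuzzy => 0 },                      # as passed to new()
    { allow_fuzzy => 1, allow_boost => 0 },    # as passed to filter()
);
printf "allow_fuzzy=%d allow_boost=%d\n", @opts{qw(allow_fuzzy allow_boost)};
# allow_fuzzy=1 allow_boost=0
```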



    escape_reserved

Reserved characters must be escaped to be used in the query string. By default, filter() will remove these characters. Set escape_reserved to true if you want them to be escaped instead.

Reserved characters: + - && || ! ( ) { } [ ] ^ " ~ * ? : \
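
To illustrate what escaping means here, the sketch below prefixes each reserved character from the list above with a backslash. It is a stand-alone approximation written for this page, not the code ElasticSearch::QueryParser uses, and the helper name escape_reserved_chars is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix every reserved character (and the
# two-character operators && and ||) with a backslash.
sub escape_reserved_chars {
    my ($qs) = @_;
    $qs =~ s/(&&|\|\||[+\-!(){}\[\]^"~*?:\\])/\\$1/g;
    return $qs;
}

print escape_reserved_chars('(1+1):2'), "\n";
# \(1\+1\)\:2
```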

    fields

Normally, you don’t want to allow your users to specify which fields to search. By default, filter() removes any field prefixes, eg:



    $qp->filter('foo:bar secret_field:SIKRIT');
    # bar SIKRIT



You can set fields to 1 to allow all fields, or pass in a hashref with a list of approved fieldnames, eg:



    $qp->filter('foo:bar secret_field:SIKRIT', fields => 1);
    # foo:bar secret_field:SIKRIT

    $qp->filter('foo:bar secret_field:SIKRIT', fields => {foo => 1});
    # foo:bar SIKRIT
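
The three settings of fields (false, true, or a hashref of approved names) can be mimicked on simple input with a few lines of plain Perl. This is a rough sketch for illustration only: strip_fields is a hypothetical name, the regex ignores quoting and escaping, and it makes no attempt to reproduce how the module actually parses the query string.

```perl
use strict;
use warnings;

# Hypothetical helper mimicking the documented behaviour of the fields
# option on simple "field:term" input (no quoting or escaping handled).
sub strip_fields {
    my ($qs, $fields) = @_;

    # A true non-reference value allows every field
    return $qs if $fields && !ref($fields);

    # Otherwise keep only prefixes named in the hashref (none if false)
    my %allowed = ref($fields) eq 'HASH' ? %$fields : ();
    $qs =~ s/\b(\w+):/$allowed{$1} ? "$1:" : ''/ge;
    return $qs;
}

print strip_fields('foo:bar secret_field:SIKRIT', { foo => 1 }), "\n";
# foo:bar SIKRIT
```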



ElasticSearch extends the standard Lucene syntax to include:



    _exists_:fieldname
  and
    _missing_:fieldname



The fields option applies to these fieldnames as well.

    allow_bool

Query strings can use boolean operators like:



    foo AND bar NOT baz OR ! (foo && bar)



By default, boolean operators are allowed. Set allow_bool to false to disable them.

Note: This doesn’t affect the + or - operators, which are always allowed. eg:



    +apple -crab



    allow_boost

Boost allows you to give more importance to a particular word, group of words or phrase, eg:



    foo^2  (bar baz)^3  "this exact phrase"^5



By default, boost is enabled. Setting allow_boost to false would convert the above example to:



    foo (bar baz) "this exact phrase"



    allow_fuzzy

Lucene supports fuzzy searches based on the Levenshtein Distance, eg:



    supercalifragilisticexpialidocious~0.5



To disable these, set allow_fuzzy to false.

    allow_slop

While a phrase search (eg "this exact phrase") looks for the exact phrase, in the same order, you can use phrase slop to find all the words in the phrase, in any order, within a certain number of words, eg:



    For the phrase: "The quick brown fox jumped over the lazy dog."

    Query string:               Matches:
    "quick brown"               Yes
    "brown quick"               No
    "quick fox"                 No
    "brown quick"~2             Yes  # within 2 words of each other
    "fox dog"~6                 Yes  # within 6 words of each other



To disable this phrase slop, set allow_slop to false.

    allow_ranges

Lucene can accept ranges, eg:



    date:[2001 TO 2010]   name:[alan TO john]



To enable these, set allow_ranges to true.

    wildcard_prefix

Lucene can accept wildcard searches such as:



    jo*n  smith?



Lucene takes these wildcards and expands the search to include all matching terms, eg jo*n could be expanded to jon, john, jonathan etc.

This can result in a huge number of terms, so it is advisable to require that the first $min characters of the word are not wildcards.

By default, the wildcard_prefix requires that at least the first character is not a wildcard, ie * is not acceptable, but s* is.

You can change the minimum length of the non-wildcard prefix by setting wildcard_prefix, eg:



    $qp->filter("foo* foobar*", wildcard_prefix=>4);
    # "foo foobar*"
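
The rule can be sketched in plain Perl as follows: a term keeps its wildcards only if at least $min non-wildcard characters precede the first * or ?. The helper name enforce_wildcard_prefix is hypothetical, and splitting on whitespace is a simplification; the module's own handling of phrases and operators is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical helper: strip wildcards from any term whose leading
# non-wildcard prefix is shorter than $min characters.
sub enforce_wildcard_prefix {
    my ($qs, $min) = @_;

    my @terms = map {
        my $term = $_;
        my ($prefix) = $term =~ /^([^*?]*)/;    # text before first wildcard
        if ( $term =~ /[*?]/ && length($prefix) < $min ) {
            $term =~ tr/*?//d;                  # drop the wildcards
        }
        $term;
    } split ' ', $qs;

    # Terms that consisted only of wildcards are now empty; discard them
    return join ' ', grep { length } @terms;
}

print enforce_wildcard_prefix('foo* foobar*', 4), "\n";
# foo foobar*
```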



BUGS

This is a new module, so it is likely that there will be bugs, and the list of options and how filter() cleans up the query string may well change.

If you have any suggestions for improvements, or find any bugs, please report them to <http://github.com/clintongormley/ElasticSearch.pm/issues>.

Patches welcome!

LICENSE AND COPYRIGHT

Copyright 2010 Clinton Gormley.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.

perl v5.20.3 ELASTICSEARCH::QUERYPARSER (3) 2013-09-24