Plucene::Simple(3) User Contributed Perl Documentation Plucene::Simple(3)

Plucene::Simple - An interface to Plucene

        use Plucene::Simple;

        # create an index
        my $plucy = Plucene::Simple->open($index_path);

        # add to the index
        $plucy->add(
                $id1 => { $field => $term1 }, 
                $id2 => { $field => $term2 }, 
        );

        # or ...
        $plucy->index_document($id => $data);

        # search an existing index
        my $plucy = Plucene::Simple->open($index_path);
        my @results = $plucy->search($search_string);

        # optimize the index
        $plucy->optimize;

        # remove something from the index
        $plucy->delete_document($id);

        # is something in the index?
        if ($plucy->indexed($id)) { ... }

This provides a simple interface to Plucene. Plucene is large and multi-featured, and it is expected that users will subclass it and tie all the pieces together to suit their own needs. Plucene::Simple is, therefore, just one way to use Plucene. It is not expected to do exactly what *you* want, but you can always use it as an example of how to build your own interface.

You make a new Plucene::Simple object like so:

        my $plucy = Plucene::Simple->open($index_path);

If this index doesn't exist, it will be created for you; otherwise you will be adding to an existing one.

Then you can add your documents to the index.

Every document must be indexed with a unique key (which will be returned from searches).

A document can be made up of many fields, which can be added as a hashref:

        $plucy->add($key, \%data);

        $plucy->add(
                chap1  => { 
                        title => "Moby-Dick", 
                        author => "Herman Melville", 
                        text => "Call me Ishmael ..." 
                },
                chap2  => { 
                        title => "Boo-Hoo", 
                        author => "Lydia Lee", 
                        text => "...",
                }
        );

Alternatively, if you do not want to index lots of metadata, but rather just simple text, you can use the index_document() method.

        $plucy->index_document($key, $data);
        $plucy->index_document(chap1 => 'Call me Ishmael ...');

        $plucy->delete_document($id);

This removes the document with the given ID from the index.

        $plucy->optimize;

Plucene is set up to perform insertions quickly. After a batch of inserts, it is good to optimize() the index for better search speed.
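
The insert-then-optimize pattern described above might look like the following minimal sketch. The temporary index directory and the sample documents are assumptions made for illustration; only open(), add(), optimize() and search() come from this page.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Simple;

# Assumption: a throwaway directory stands in for a real index path.
my $index_path = tempdir(CLEANUP => 1);
my $plucy      = Plucene::Simple->open($index_path);

# Hypothetical sample data: document key => hashref of fields.
my %docs = (
    chap1 => {
        title  => "Moby-Dick",
        author => "Herman Melville",
        text   => "Call me Ishmael ...",
    },
    chap2 => {
        title  => "Boo-Hoo",
        author => "Lydia Lee",
        text   => "...",
    },
);

# Insert the whole batch first ...
$plucy->add(%docs);

# ... then optimize once, rather than after every single insert.
$plucy->optimize;

# Searching afterwards works exactly as before optimizing.
my @ids = $plucy->search('ishmael');
print "matched: @ids\n";
```

Calling optimize() once per batch keeps the inserts themselves cheap and pays the optimization cost a single time.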

        my @ids = $plucy->search('ishmael'); 
          # ("chap1", ...)

This will return the IDs of each document matching the search term.

If you have indexed your documents with fields, you can also search with the field name as a prefix:

        my @ids = $plucy->search("author:lee");
          # ("chap2", ...)

        my @results = $plucy->search($search_string);

This will search the index with the given query, and return a list of document ids.

Searches can be much more powerful than this - see Plucene for further details.

        my @results = $plucy->search_during($search_string, $date1, $date2);
        my @results = $plucy->search_during("to:Fred", "2001-01-01" => "2003-12-31");

If your documents were given an ISO 'date' field when indexing, search_during() will restrict the results to all documents between the specified dates. Any document without a 'date' field will be ignored.
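
As a rough sketch of how the 'date' field and search_during() fit together, the example below indexes three hypothetical messages and then restricts a search to a date range. The message keys, field values and temporary index directory are all made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Simple;

# Assumption: a throwaway directory stands in for a real index path.
my $plucy = Plucene::Simple->open(tempdir(CLEANUP => 1));

# Hypothetical messages. Two carry an ISO 'date' field, one does not.
$plucy->add(
    msg1 => { to => "Fred", date => "2002-06-15", text => "Lunch on Friday?" },
    msg2 => { to => "Fred", date => "2005-01-10", text => "Happy new year" },
    msg3 => { to => "Fred", text => "No date on this one" },
);
$plucy->optimize;

# Restrict the field search to documents dated within the range.
my @in_range =
    $plucy->search_during("to:Fred", "2001-01-01" => "2003-12-31");

print "in range: @in_range\n";
```

Going by the description above, msg2 lies outside the requested range and msg3 has no 'date' field, so neither should be returned.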

        if ($plucy->indexed($id)) { ... }

This returns true if there is a document with the given ID in the index.
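
One possible use of indexed() is to replace a document that is already in the index. This page does not say what happens when the same key is added twice, so the sketch below assumes you want exactly one document per key and deletes the old one first. The helper replace_document() is hypothetical and not part of Plucene::Simple.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Plucene::Simple;

# Hypothetical helper, not part of Plucene::Simple: drop any existing
# document stored under $id, then add the new fields under the same key.
sub replace_document {
    my ($index, $id, $fields) = @_;
    $index->delete_document($id) if $index->indexed($id);
    $index->add($id => $fields);
    return;
}

# Assumption: a throwaway directory stands in for a real index path.
my $plucy = Plucene::Simple->open(tempdir(CLEANUP => 1));

# Index an initial version, then swap in a revised one.
$plucy->add(chap1 => { text => "Call me Ishmael ..." });
replace_document($plucy, chap1 => { text => "Some years ago ..." });

print "chap1 is indexed\n" if $plucy->indexed('chap1');
```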

Copyright (C) 2003-2004 Kasei Limited
2022-04-09 perl v5.32.1
