Index(3) User Contributed Perl Documentation Index(3)

Search::OpenFTS::Index - Provides functions for indexing

my $fts=Search::OpenFTS::Index->new( DBI );

my $fts=Search::OpenFTS::Index->new( DBI, prefix );

my $fts=Search::OpenFTS::Index->init(
    dbi=>DBI,
    txttid=>NAME_TXT_ID,
    dict=>[DICT1, DICT2, ...],
    parser=>PARSER,
    map=>'{IDTYPELEXEM1=>[IDDICT1, ...], ...}',
    tsvector_field=>FIELD_NAME,
    ignore_id_index=>"IDTYPELEXEM1 [IDTYPELEXEM2 [...]]",
    ignore_headline=>"IDTYPELEXEM1 [IDTYPELEXEM2 [...]]",
    prefix=>PREFIX
);

This is the initialization function. It is called only once, at the creation of a new search index, to create the configuration and indexing tables.

txttid
The table in which the documents are stored, together with its primary key (e.g. messages.msg_id).
dict
List of available dictionaries. A dictionary may support four methods: lemms, is_stoplexem, drop and init. lemms returns an array of lexemes for a given word. is_stoplexem answers whether the given lexeme corresponds to a stop word. init is used to initialize the dictionary. drop is used to clear the dictionary's tables (if any) when an OpenFTS instance is dropped. Only lemms is required; is_stoplexem, drop and init are optional.
parser
The full name of the parser in use. The parser must provide the same interface as the Search::OpenFTS::Parser module.
map
A mapping from lexeme types to dictionaries. This helps to optimize the search engine and is also useful for indexing multilingual documents or documents containing unusual text.
tsvector_field
The name of the field that holds the text index of integers for each document. This field must be of type tsvector (from contrib/tsearch).
ignore_id_index
Type IDs of lexemes to ignore while indexing documents.
ignore_headline
Type IDs of lexemes to ignore while constructing headlines of the search results.
prefix
If more than one content table requires indexing and searching functionality, the user can pass a special parameter named prefix, which is a single character from a-z. The given prefix is used as a naming convention to create separate instances of the configuration and indexing tables.

To specify a dictionary that requires parameters (the Snowball stemmer, for example), use the following syntax:

    dict=>[
          # example of how to use the Snowball stemmer
          { mod=>'Search::OpenFTS::Dict::Snowball', param=>'{lang=>"english"}' },
          'Search::OpenFTS::Dict::UnknownDict',
          ]
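
The dictionary interface described under dict can be illustrated with a minimal sketch. The package name My::Dict::Lowercase, the constructor, and the stop word list are hypothetical and not part of OpenFTS. The sketch also assumes that lemms and is_stoplexem are invoked as object methods with a single word or lexeme as argument; the actual calling convention is defined by OpenFTS itself and should be checked against the shipped dictionaries.

```perl
package My::Dict::Lowercase;    # hypothetical name, not part of OpenFTS
use strict;
use warnings;

# Assumed constructor: builds a lookup table of stop words.
sub new {
    my ($class, %opt) = @_;
    my %stop = map { $_ => 1 } @{ $opt{stopwords} || [qw(a an the of)] };
    return bless { stop => \%stop }, $class;
}

# Required method: return the list of lexemes for a word.
# Here the only normalization is lowercasing.
sub lemms {
    my ($self, $word) = @_;
    return ( lc $word );
}

# Optional method: report whether a lexeme is a stop word.
sub is_stoplexem {
    my ($self, $lexeme) = @_;
    return exists $self->{stop}{$lexeme} ? 1 : 0;
}

package main;

my $dict    = My::Dict::Lowercase->new;
my @lexemes = $dict->lemms('Searching');
print join(',', @lexemes), "\n";
print $dict->is_stoplexem('the') ? "stop word\n" : "indexed\n";
```

Because drop and init are optional, a dictionary that keeps no tables of its own, like this one, can leave them out.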
    

index( $txt_id, [ $FH | $text | $reftext ] );
index( $txt_id, [ $FH | $text | $reftext ], $title );
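Indexes the text of the document identified by $txt_id. The text may be passed as a file handle, a string, or a reference to a string; a title may be given as an optional third argument.
delete( $txt_id )
Deletes all index records for the given document identifier.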
create_index
create_index(1);
Creates the indices used for fast searching. A non-zero argument enables verbose mode.
drop_index()
Removes all indices on the tables corresponding to the current instance of OpenFTS. Errors are ignored; they are only reported as warnings. This method is the opposite of create_index and is useful for bulk uploading.
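
The bulk-upload use of drop_index can be sketched as follows: drop the indices, index all documents, then rebuild the indices once with create_index. The helper bulk_load and the class Local::FakeIndex are hypothetical; the latter only records method calls so that the sketch runs without a database. With a real Search::OpenFTS::Index object the same three methods would be called.

```perl
use strict;
use warnings;

# bulk_load is a hypothetical helper, not part of OpenFTS.
# $idx is assumed to behave like a Search::OpenFTS::Index object;
# $docs is a hash reference mapping document ids to their text.
sub bulk_load {
    my ($idx, $docs) = @_;
    $idx->drop_index;                       # remove indices before the mass insert
    for my $id (sort keys %$docs) {
        $idx->index($id, $docs->{$id});     # index( $txt_id, $text )
    }
    $idx->create_index;                     # rebuild the indices once at the end
    return scalar keys %$docs;
}

# Stand-in object that only records which methods were called.
package Local::FakeIndex;
sub new          { return bless { calls => [] }, shift }
sub drop_index   { push @{ $_[0]{calls} }, 'drop_index';   return }
sub index        { push @{ $_[0]{calls} }, "index($_[1])"; return }
sub create_index { push @{ $_[0]{calls} }, 'create_index'; return }

package main;

my $fake  = Local::FakeIndex->new;
my $count = bulk_load($fake, { 1 => 'first document', 2 => 'second document' });
print "$count documents: @{ $fake->{calls} }\n";
```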
drop()
Removes all tables corresponding to the current instance of OpenFTS. Errors are ignored; they are only reported as warnings.
start_index( $tid )
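Opens an indexing session for the document $tid and returns the object on which index_chunk and flush are called.

Use:

    use IO::File;

    my $idx = Search::OpenFTS::Index->new( ... );

    # open an indexing session for document ID
    my $idx_chunk = $idx->start_index( ID );

    # add every HTML file in the current directory as one chunk
    foreach my $f ( glob '*.html' ) {
        $idx_chunk->index_chunk( IO::File->new( $f ) );
    }

    # dump the accumulated index into the database
    $idx_chunk->flush;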

fix_permissions($user)
Grants read-only access on the indexes and search tables to user $user, or to PUBLIC if $user is not specified.

Returns 1 on success or an error message on failure. An error message is also a true value in Perl, so check the return value explicitly for '1'.

Also calls fix_permissions on each dictionary that provides it.
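
The explicit check of the fix_permissions return value might look like the sketch below. The helper check_grant, the class Local::FakePerms, the user name www, and the error text are hypothetical; the stand-in class simply returns a preset value so that the sketch runs without a database.

```perl
use strict;
use warnings;

# check_grant is a hypothetical helper, not part of OpenFTS.
# It returns an empty string on success, or the error message on failure.
sub check_grant {
    my ($idx, $user) = @_;
    my $rc = $idx->fix_permissions($user);
    return '' if defined $rc && $rc eq '1';     # explicit comparison with '1'
    return defined $rc ? $rc : 'unknown error';
}

# Stand-in object whose fix_permissions returns a preset result.
package Local::FakePerms;
sub new             { my ($class, $result) = @_; return bless { result => $result }, $class }
sub fix_permissions { return $_[0]{result} }

package main;

my $ok  = Local::FakePerms->new(1);
my $bad = Local::FakePerms->new('ERROR: permission denied');

print "ok: access granted\n" if check_grant($ok, 'www') eq '';
print "bad: ", check_grant($bad, 'www'), "\n";

# A plain truth test cannot tell the two cases apart:
print "truth test would accept the error message\n" if $bad->fix_permissions('www');
```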

index_chunk( [FH|REFTXT|TXT], direction=>[1|-1] )
index_chunk( [FH|REFTXT|TXT], wclass=>[A|B|C|D] )
index_chunk( FH, direction=>[1|-1], offset=>$offset, length=>$length );
index_chunk( FH, wclass=>[A|B|C|D], offset=>$offset, length=>$length );
Adds a chunk of text to the index. The direction option is kept for compatibility with older versions of OpenFTS. The wclass option defaults to 'D'. The offset and length options are available when a file handle is passed.
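
As a sketch of how the wclass option could be combined with start_index and flush, the helper below indexes a title with class 'A' and a body with the default class. The helper index_title_and_body and the class Local::FakeChunk are hypothetical, and reading wclass as a per-chunk class label for title versus body text is an assumption, since the text above only lists its allowed values. The stand-in class records calls so that the sketch runs without a database.

```perl
use strict;
use warnings;

# index_title_and_body is a hypothetical helper, not part of OpenFTS.
sub index_title_and_body {
    my ($idx, $id, $title, $body) = @_;
    my $chunk = $idx->start_index($id);
    $chunk->index_chunk($title, wclass => 'A');   # TXT form with an explicit class
    $chunk->index_chunk(\$body);                  # REFTXT form; wclass defaults to 'D'
    $chunk->flush;
    return $chunk;
}

# Stand-in object that plays both the index and the chunk session.
package Local::FakeChunk;
sub new         { return bless { log => [] }, shift }
sub start_index { my ($self, $id) = @_; push @{ $self->{log} }, "start($id)"; return $self }
sub index_chunk {
    my ($self, $src, %opt) = @_;
    my $text   = ref($src) eq 'SCALAR' ? $$src : $src;
    my $wclass = defined $opt{wclass} ? $opt{wclass} : 'D';
    push @{ $self->{log} }, "chunk[$wclass]=$text";
    return;
}
sub flush { push @{ $_[0]{log} }, 'flush'; return }

package main;

my $fake = Local::FakeChunk->new;
index_title_and_body($fake, 42, 'Release notes', 'Full text of the notes');
print join("\n", @{ $fake->{log} }), "\n";
```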
flush
Dumps the accumulated index into the database.

See also:

    The OpenFTS Primer (see the doc/ subdirectory)
    The Crash-course to OpenFTS (in the examples/ subdirectory)
    perldoc Search::OpenFTS::Search
    perldoc Search::OpenFTS::Parser
    perldoc Search::OpenFTS::Dict::PorterEng
    perldoc Search::OpenFTS::Dict::Snowball
    perldoc Search::OpenFTS::Dict::UnknownDict
    perldoc Search::OpenFTS::Morph::ISpell

2004-01-26 perl v5.32.1
