Search::OpenFTS::Index - Provides functions for indexing




my $fts=Search::OpenFTS::Index->new( DBI );

my $fts=Search::OpenFTS::Index->new( DBI, prefix );

my $fts=Search::OpenFTS::Index->init(
dict=>[DICT1, DICT2, ...],
map=>'{IDTYPELEXEM1=>[IDDICT1, ...], ...}',
ignore_id_index=>IDTYPELEXEM1 [IDTYPELEXEM2 [...]],
ignore_headline=>IDTYPELEXEM1 [IDTYPELEXEM2 [...]],
prefix=>PREFIX );

This is the initialization function. It is called only once, at the creation of a new search index, to create the configuration and indexing tables.
txttid The table where the documents are stored, together with its primary key (e.g. messages.msg_id).
dict List of available dictionaries. A dictionary may provide four methods: lemms, is_stoplexem, drop and init. lemms returns an array of lexemes for a given word; is_stoplexem reports whether a given lexeme is a stop word; init initializes the dictionary; drop clears the dictionary's tables (if any) when the OpenFTS instance is dropped. Only lemms is required; is_stoplexem, drop and init are optional.
parser The full name of the parser in use. The parser should have the same interface as the Search::OpenFTS::Parser module.
map A mapping from lexeme types to dictionaries. This helps to optimize the search engine and to index multilingual or exotic-text documents.
tsvector_field The name of the field that holds the text index of integers for each document. This field must be of type tsvector (from contrib/tsearch).
ignore_id_index Type IDs of lexemes to ignore while indexing documents.
ignore_headline Type IDs of lexemes to ignore while constructing headlines of the search results.
prefix If more than one content table requires indexing and searching, pass the prefix parameter, a character from a-z. The prefix is used as a naming convention to create a separate instance of the configuration and indexing tables for each content table.

To specify a dictionary that requires parameters (the Snowball stemmer, for example), use the following syntax:

# example of how to use the Snowball stemmer
          { mod=>'Search::OpenFTS::Dict::Snowball', param=>{lang=>"english"} },
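
The dictionary interface described above (lemms, plus the optional is_stoplexem) can be illustrated with a small stand-alone class. This is only a sketch: the package name My::StopDict, the plain new constructor and the stopwords parameter are hypothetical, and the exact calling convention OpenFTS uses (argument order, list versus array reference, the init and drop methods) is an assumption that should be checked against Search::OpenFTS::Dict::PorterEng.

```perl
package My::StopDict;    # hypothetical name, not part of OpenFTS
use strict;
use warnings;

# Plain constructor for this sketch only (OpenFTS itself talks about init).
sub new {
    my ( $class, %param ) = @_;
    my %stop = map { lc($_) => 1 } @{ $param{stopwords} || [] };
    return bless { stop => \%stop }, $class;
}

# lemms: return the lexemes for a word (here simply the lowercased word).
sub lemms {
    my ( $self, $word ) = @_;
    return ( lc($word) );
}

# is_stoplexem: 1 if the lexeme is a stop word, 0 otherwise.
sub is_stoplexem {
    my ( $self, $lexeme ) = @_;
    return exists $self->{stop}{ lc($lexeme) } ? 1 : 0;
}

package main;

my $dict = My::StopDict->new( stopwords => [qw(the a of)] );

# Print every lexeme that is not a stop word.
for my $word (qw(The Indexing of Documents)) {
    for my $lexeme ( $dict->lemms($word) ) {
        next if $dict->is_stoplexem($lexeme);
        print "$lexeme\n";
    }
}
```

A real dictionary would normally do stemming or a lookup inside lemms; the lowercasing here only keeps the example short.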


index( $txt_id, [ $FH | $text | $reftext ] );
index( $txt_id, [ $FH | $text | $reftext ], $title ); Used for indexing text.
delete ( $txt_id ) Deletes all records of the given identifier.
create_index(1); Creates indices for fast searching. A non-zero argument enables verbose mode.
drop_index() Removes all indices on the tables corresponding to the current instance of OpenFTS. Errors are ignored; only a warning is issued. This method is the opposite of create_index and is useful for bulk uploading.
drop() Removes all tables corresponding to the current instance of OpenFTS. Errors are ignored; only a warning is issued.
start_index( $tid ) Opens a session for indexing.


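use IO::File;

my $idx = Search::OpenFTS::Index->new( ... );

# open an indexing session for document ID
my $idx_chunk = $idx->start_index( ID );

# add each HTML file as one chunk of that document
foreach my $f ( glob '*.html' ) {
        $idx_chunk->index_chunk( IO::File->new( $f ) );
}
$idx_chunk->flush;   # store the accumulated index in the database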

fix_permissions($user) Grants read-only access on the indexes and search tables to user $user, or to PUBLIC if $user is not specified.

Returns 1 on success or an error message on failure. Because an error message is also a true value, check explicitly that the return value equals '1'.

Also calls fix_permissions on each dictionary that provides this method.
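
The explicit comparison with '1' matters because a plain truth test would also accept an error message. A minimal sketch of such a check follows; the helper fts_ok is hypothetical and not part of OpenFTS, and the return values are simulated here (including the error text) instead of coming from a real $idx->fix_permissions($user) call.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for the literal success value '1'.
sub fts_ok {
    my ($result) = @_;
    return ( defined($result) && $result eq '1' ) ? 1 : 0;
}

# Simulated return values; real code would use
#   my $result = $idx->fix_permissions($user);
for my $result ( '1', 'ERROR: permission denied' ) {
    if ( fts_ok($result) ) {
        print "permissions granted\n";
    }
    else {
        print "fix_permissions failed: $result\n";
    }
}
```

Note that if ($result) alone would report success for both values in the loop.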

index_chunk( [FH|REFTXT|TXT], direction=>[1|-1] )
index_chunk( [FH|REFTXT|TXT], wclass=>[A|B|C|D] )
index_chunk( FH, direction=>[1|-1], offset=>$offset, length=>$length );
index_chunk( FH, wclass=>[A|B|C|D], offset=>$offset, length=>$length ); Adds a part of a document to the index. The direction option is kept for compatibility with older versions of OpenFTS. The wclass option defaults to 'D'.
flush Writes the accumulated index to the database.



    The OpenFTS Primer          (  see doc/ subdirectory )

    The Crash-course to OpenFTS ( in examples/ subdirectory )

    perldoc Search::OpenFTS::Search

    perldoc Search::OpenFTS::Parser

    perldoc Search::OpenFTS::Dict::PorterEng

    perldoc Search::OpenFTS::Dict::Snowball

    perldoc Search::OpenFTS::Dict::UnknownDict

    perldoc Search::OpenFTS::Morph::ISpell

