Manual Reference Pages - TEXT::CONTEXT::EITHERSIDE (3)
Text::Context::EitherSide - Get n words either side of search keywords
SYNOPSIS

    use Text::Context::EitherSide;

    my $text = "The quick brown fox jumped over the lazy dog";
    my $context = Text::Context::EitherSide->new($text);

    $context->as_string("fox");
        # "... quick brown fox jumped over ..."
    $context->as_string("fox", "jumped");
        # "... quick brown fox jumped over the ..."

    $context = Text::Context::EitherSide->new($text, context => 1);
        # 1 word on either side
    $context->as_string("fox", "jumped", "dog");
        # "... brown fox jumped over ... lazy dog"
Or, if you don't believe in all this OO rubbish:

    use Text::Context::EitherSide qw(get_context);
    get_context(1, $text, "fox", "jumped", "dog");
        # "... brown fox jumped over ... lazy dog"
DESCRIPTION

Suppose you have a large piece of text - typically, say, a web page or a
mail message. And now suppose you've done some kind of full-text search
on that text for a bunch of keywords, and you want to display the
context in which you found the keywords inside the body of the text.
A simple-minded way to do that would be just to get the two words either
side of each keyword. But hey, don't be too simple-minded, because
you've got to make sure that the extracted contexts don't overlap. If
you have

    the quick brown fox jumped over the lazy dog

and you extract two words either side of fox, jumped and dog, you
really don't want to end up with

    quick brown fox jumped over brown fox jumped over the the lazy dog

so you need a small amount of smarts. This module has a small amount of
smarts.
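
One way to avoid the overlap described above is to mark the word positions you want to keep, rather than gluing per-keyword slices together. The following is only a sketch of that idea, not the module's actual code: context_indices is a hypothetical helper, and it assumes whitespace-separated words and exact, case-sensitive keyword matching.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Context::EitherSide.
# Returns the sorted positions of the words to keep, each at most once.
sub context_indices {
    my ($n, $text, @keywords) = @_;
    my @words  = split ' ', $text;            # assumption: split on whitespace
    my %wanted = map { $_ => 1 } @keywords;   # assumption: exact match
    my %keep;
    for my $i (0 .. $#words) {
        next unless $wanted{ $words[$i] };
        # Clamp the window to the bounds of the text.
        my $lo = $i - $n < 0       ? 0       : $i - $n;
        my $hi = $i + $n > $#words ? $#words : $i + $n;
        # A hash records each position once, so overlapping windows merge.
        $keep{$_} = 1 for $lo .. $hi;
    }
    return sort { $a <=> $b } keys %keep;
}

my $text  = "the quick brown fox jumped over the lazy dog";
my @words = split ' ', $text;
my @kept  = context_indices(2, $text, "fox", "jumped", "dog");
print join(" ", @words[@kept]), "\n";
# prints "quick brown fox jumped over the lazy dog"
```

Because every position is recorded only once, the three overlapping windows come out as one run with no repeated words.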
This is primarily an object-oriented module. If you don't care about
that, just import the get_context subroutine, and call it like so:

    get_context($num_of_words, $text, @words_to_find)

and you'll get back a string with ellipses as in the synopsis. That's
all that most people need to know. But if you want to do clever stuff...
    my $c = Text::Context::EitherSide->new($text [, context => $n]);

Create a new object storing some text to be searched, plus optionally
the number of words you want on either side, if you don't like the
default of 2.
The context method allows you to get and set the number of words on
either side.
as_sparse_list returns the keywords, plus n words on either side, as a
sparse list; the original text is split into an array of words, and
non-contextual elements are replaced with undefs. (That's not actually
how it works, but conceptually, it's the same.)
as_list is the same as as_sparse_list, but each run of one or more
undefs is collapsed into a single ellipsis, so that

    (undef, "foo", undef, undef, undef, "bar")

becomes

    ("...", "foo", "...", "bar")
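
The collapsing step can be sketched in a few lines. This is an illustration only, not the module's own implementation; collapse_undefs is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Context::EitherSide.
# Replaces each run of one or more undefs with a single "..." element.
sub collapse_undefs {
    my @sparse = @_;
    my @out;
    my $in_gap = 0;    # true while we are inside a run of undefs
    for my $item (@sparse) {
        if (defined $item) {
            push @out, $item;
            $in_gap = 0;
        }
        elsif (!$in_gap) {
            # First undef of a run: emit one ellipsis, skip the rest.
            push @out, '...';
            $in_gap = 1;
        }
    }
    return @out;
}

my @list = collapse_undefs(undef, "foo", undef, undef, undef, "bar");
print join(" ", @list), "\n";    # prints "... foo ... bar"
```

Tracking the gap with a flag, rather than checking whether the last element is "...", keeps the sketch correct even if the text itself contains a literal "..." word.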
as_string takes the as_list output above and joins the elements
together into a string. This is what most people want from
Text::Context::EitherSide.
get_context is available as a shortcut for

    Text::Context::EitherSide->new($text, context => $n)->as_string(@words);

but needs to be explicitly imported. Nothing is exported by default.
Text::Context is an even smarter way of extracting a contextual string.
Current maintainer: Tony Bowden
Original author: Simon Cozens
BUGS and QUERIES
Please direct all correspondence regarding this module to:
COPYRIGHT AND LICENSE
Copyright 2002-2005 by Kasei Limited, http://www.kasei.com/
You may use and redistribute this module under the terms of the
Artistic License 2.0.
perl v5.20.3    TEXT::CONTEXT::EITHERSIDE (3)    2009-05-04