GSP
Quick Navigator

Search Site

Unix VPS
A - Starter
B - Basic
C - Preferred
D - Commercial
MPS - Dedicated
Previous VPSs
* Sign Up! *

Support
Contact Us
Online Help
Handbooks
Domain Status
Man Pages

FAQ
Virtual Servers
Pricing
Billing
Technical

Network
Facilities
Connectivity
Topology Map

Miscellaneous
Server Agreement
Year 2038
Credits
 

USA Flag

 

 

Man Pages


Manual Reference Pages  -  BLOOM::FILTER (3)

.ds Aq ’

NAME

Bloom::Filter - Sample Perl Bloom filter implementation

CONTENTS

DESCRIPTION

A Bloom filter is a probabilistic algorithm for doing existence tests in less memory than a full list of keys would require. The tradeoff to using Bloom filters is a certain configurable risk of false positives. This module implements a simple Bloom filter with configurable capacity and false positive rate. Bloom filters were first described in a 1970 paper by Burton Bloom, see <http://portal.acm.org/citation.cfm?id=362692&dl=ACM&coll=portal>.

SYNOPSIS



        use Bloom::Filter

        my $bf = Bloom::Filter->new( capacity => 10, error_rate => .001 );

        $bf->add( @keys );

        while ( <> ) {
                chomp;
                print "Found $_\n" if $bf->check( $_ );
        }



CONSTRUCTORS

new %PARAMS Create a brand new instance. Allowable params are error_rate, capacity.
init Calculates the best number of hash functions and optimum filter length, creates some random salts, and generates a blank bit vector. Called automatically by constructor.

ACCESSORS

capacity Returns the total capacity of the Bloom filter
error_rate Returns the configured maximum error rate
length Returns the length of the Bloom filter in bits
key_count Returns the number of items currently stored in the filter
on_bits Returns the number of ’on’ bits in the filter
salts Returns the list of salts used to create the hash functions

PUBLIC METHODS

add @KEYS Adds the list of keys to the filter. Will fail, return undef and complain if the number of keys in the filter exceeds the configured capacity.
check @KEYS Checks the provided key list against the Bloom filter, and returns a list of equivalent length, with true or false values depending on whether there was a match.

INTERNAL METHODS

_calculate_shortest_filter_length CAPACITY ERR_RATE Given a desired error rate and maximum capacity, returns the optimum combination of vector length (in bits) and number of hash functions to use in building the filter, where optimum means shortest vector length.
_get_cells KEY Given a key, hashes it using the list of salts and returns an array of cell indexes corresponding to the key.

AUTHOR

Maciej Ceglowski <maciej@ceglowski.com>

CHANGELOG

Feb 2007 big speedup by Dmitriy Ryaboy <dmitriy.ryaboy@ask.com> (thanks!)

COPYRIGHT AND LICENSE

(c) 2004 Maciej Ceglowski

This is free software, distributed under version 2 of the GNU Public License (GPL).

Search for    or go to Top of page |  Section 3 |  Main Index


perl v5.20.3 FILTER (3) 2007-02-10

Powered by GSP Visit the GSP FreeBSD Man Page Interface.
Output converted with manServer 1.07.