|
NAMESet::IntSpan - Manages sets of integersSYNOPSIS# BEGIN { $Set::IntSpan::integer = 1 } use Set::IntSpan qw(grep_set map_set grep_spans map_spans); # $Set::IntSpan::Empty_String = '-'; # or ''; $set = new Set::IntSpan $set_spec; $set = new Set::IntSpan @set_specs; $valid = valid Set::IntSpan $run_list; $set = copy $set $set_spec; $run_list = run_list $set; @elements = elements $set; @sets = sets $set; @spans = spans $set; $u_set = union $set $set_spec; $i_set = intersect $set $set_spec; $x_set = xor $set $set_spec; $d_set = diff $set $set_spec; $c_set = complement $set; $set->U($set_spec); # Union $set->I($set_spec); # Intersect $set->X($set_spec); # Xor $set->D($set_spec); # Diff $set->C; # Complement equal $set $set_spec equivalent $set $set_spec superset $set $set_spec subset $set $set_spec $n = cardinality $set; $n = size $set; empty $set finite $set neg_inf $set pos_inf $set infinite $set universal $set member $set $n; insert $set $n; remove $set $n; $min = min $set; $max = max $set; $holes = holes $set; $cover = cover $set; $inset = inset $set $n; $smaller = trim $set $n; $bigger = pad $set $n; $subset = grep_set { ... } $set; $mapset = map_set { ... } $set; $subset = grep_spans { ... } $set; $mapset = map_spans { ... } $set; for ($element=$set->first; defined $element; $element=$set->next) { ... } for ($element=$set->last ; defined $element; $element=$set->prev) { ... } $element = $set->start($n); $element = $set->current; $n = $set->at($i); $slice = $set->slice($from, $to); $i = $set->ord($n); $i = $set->span_ord($n); Operator overloads$u_set = $set + $set_spec; # union $i_set = $set * $set_spec; # intersect $x_set = $set ^ $set_spec; # xor $d_set = $set - $set_spec; # diff $c_set = ~$set; # complement $set += $set_spec; # union $set *= $set_spec; # intersect $set ^= $set_spec; # xor $set -= $set_spec; # diff $set eq $set_spec # equal $set ne $set_spec # not equal $set le $set_spec # subset $set lt $set_spec # proper subset $set ge $set_spec # superset $set gt $set_spec # proper superset # compare sets by cardinality $set1 == $set2 $set1 != $set2 $set1 <= $set2 $set1 < $set2 $set1 >= $set2 $set1 > $set2 $set1 <=> $set2 # compare cardinality of set to an integer $set1 == $n $set1 != $n $set1 <= $n $set1 < $n $set1 >= $n $set1 > $n $set1 <=> $n @sorted = sort @sets; # sort sets by cardinality if ($set) { ... } # true if $set is not empty print "$set\n"; # stringizes to the run list EXPORTS@EXPORTNothing@EXPORT_OK"grep_set", "map_set", "grep_spans", "map_spans"DESCRIPTION"Set::IntSpan" manages sets of integers. It is optimized for sets that have long runs of consecutive integers. These arise, for example, in .newsrc files, which maintain lists of articles:alt.foo: 1-21,28,31 alt.bar: 1-14192,14194,14196-14221 A run of consecutive integers is also called a span. Sets are stored internally in a run-length coded form. This provides for both compact storage and efficient computation. In particular, set operations can be performed directly on the encoded representation. "Set::IntSpan" is designed to manage finite sets. However, it can also represent some simple infinite sets, such as { x | x>n }. This allows operations involving complements to be carried out consistently, without having to worry about the actual value of INT_MAX on your machine. SPANSA span is a run of consecutive integers. A span may be represented by an array reference, in any of 5 forms:Finite formsSpan Set [ $n, $n ] { n } [ $a, $b ] { x | a<=x && x<=b} Infinite formsSpan Set [ undef, $b ] { x | x<=b } [ $a , undef ] { x | x>=a } [ undef, undef ] The set of all integers Some methods operate directly on spans. SET SPECIFICATIONSMany of the methods take a set specification. There are four kinds of set specifications.EmptyIf a set specification is omitted, then the empty set is assumed. Thus,$set = new Set::IntSpan; creates a new, empty set. Similarly, copy $set; removes all elements from $set. Object referenceIf an object reference is given, it is taken to be a "Set::IntSpan" object.Run listIf a string is given, it is taken to be a run list. A run list specifies a set using a syntax similar to that in newsrc files.A run list is a comma-separated list of runs. Each run specifies a set of consecutive integers. The set is the union of all the runs. Runs may be written in any of 5 forms. Finite forms
Infinite forms
Empty forms The empty set is consistently written as '' (the null string). It is also denoted by the special form '-' (a single dash). Restrictions The runs in a run list must be disjoint, and must be listed in increasing order. Valid characters in a run list are 0-9, '(', ')', '-' and ','. White space and underscore (_) are ignored. Other characters are not allowed. Examples Run list Set "-" { } "1" { 1 } "1-2" { 1, 2 } "-5--1" { -5, -4, -3, -2, -1 } "(-)" the integers "(--1" the negative integers "1-3, 4, 18-21" { 1, 2, 3, 4, 18, 19, 20, 21 } Array referenceIf an array reference is given, then the elements of the array specify the elements of the set. The array may contain
The set is the union of all the integers and spans in the array. The integers and spans need not be disjoint. The integers and spans may be in any order. Examples Array ref Set [ ] { } [ 1, 1 ] { 1 } [ 1, 3, 2 ] { 1, 2, 3 } [ 1, [ 5, 8 ], 5, [ 7, 9 ], 2 ] { 1, 2, 5, 6, 7, 8, 9 } [ undef, undef ] the integers [ undef, -1 ] the negative integers ITERATORSEach set has a single iterator, which is shared by all calls to "first", "last", "start", "next", "prev", and "current". At all times, the iterator is either an element of the set, or "undef"."first", "last", and "start" set the iterator; "next", and "prev" move it; and "current" returns it. Calls to these methods may be freely intermixed. Using "next" and "prev", a single loop can move both forwards and backwards through a set. Using "start", a loop can iterate over portions of an infinite set. METHODSCreation
Set operationsFor these operations, a new "Set::IntSpan" object is created and returned. The operands are not affected.
MutatorsBy popular demand, "Set::IntSpan" now has mutating forms of the binary set operations. These methods alter the object on which they are called.
Comparison
Cardinality
Membership
Extrema
Spans
Iterators
IndexingThe elements of a set are kept in numerical order. These methods index into the set based on this ordering.
OPERATOR OVERLOADSFor convenience, some operators are overloaded on "Set::IntSpan" objects.set operationsOne operand must be a "Set::IntSpan" object. The other operand may be a "Set::IntSpan" object or a set specification.$u_set = $set + $set_spec; # union $i_set = $set * $set_spec; # intersect $x_set = $set ^ $set_spec; # xor $d_set = $set - $set_spec; # diff $c_set = ~$set; # complement $set += $set_spec; # union $set *= $set_spec; # intersect $set ^= $set_spec; # xor $set -= $set_spec; # diff equalityThe string comparison operations are overloaded to compare sets for equality and containment. One operand must be a "Set::IntSpan" object. The other operand may be a "Set::IntSpan" object or a set specification.$set eq $set_spec # equal $set ne $set_spec # not equal $set le $set_spec # subset $set lt $set_spec # proper subset $set ge $set_spec # superset $set gt $set_spec # proper superset equivalenceThe numerical comparison operations are overloaded to compare sets by cardinality. One operand must be a "Set::IntSpan" object. The other operand may be a "Set::IntSpan" object or an integer.$set1 == $set2 $set1 != $set2 $set1 <= $set2 $set1 < $set2 $set1 >= $set2 $set1 > $set2 $set1 <=> $set2 $set1 cmp $set2 $set1 == $n $set1 != $n $set1 <= $n $set1 < $n $set1 >= $n $set1 > $n $set1 <=> $n $set1 cmp $n N.B. The "cmp" operator is overloaded to compare sets by cardinality, not containment. This is done so that sort @sets will sort a list of sets by cardinality. conversionIn boolean context, a "Set::IntSpan" object evaluates to true if it is not empty.A "Set::IntSpan" object stringizes to its run list. FUNCTIONS
CLASS VARIABLES
DIAGNOSTICSAny method (except "valid") will "die" if it is passed an invalid run list.
NOTESTrapsBeware of forms likeunion $set [1..5]; This passes an element of @set to union, which is probably not what you want. To force interpretation of $set and [1..5] as separate arguments, use forms like union $set +[1..5]; or $set->union([1..5]); grep_set and map_set"grep_set" and "map_set" make it easy to construct sets for which the internal representation used by "Set::IntSpan" is not small. Consider:$billion = new Set::IntSpan '0-1_000_000_000'; # OK $odd = grep_set { $_ & 1 } $billion; # trouble $even = map_set { $_ * 2 } $billion; # double trouble Error handlingThere are two common approaches to error handling: exceptions and return codes. There seems to be some religion on the topic, so "Set::IntSpan" provides support for both.To catch exceptions, protect method calls with an eval: $run_list = <STDIN>; eval { $set = new Set::IntSpan $run_list }; $@ and print "$@: try again\n"; To check return codes, use an appropriate method call to validate arguments: $run_list = <STDIN>; if (valid Set::IntSpan $run_list) { $set = new Set::IntSpan $run_list } else { print "$@ try again\n" } Similarly, use "finite" to protect calls to "elements": finite $set and @elements = elements $set; Calling "elements" on a large, finite set can generate an "Out of memory!" message, which cannot (easily) be trapped. Applications that must retain control after an error can use "intersect" to protect calls to "elements": @elements = elements { intersect $set "-1_000_000 - 1_000_000" }; or check the size of $set first: finite $set and cardinality $set < 2_000_000 and @elements = elements $set; LimitationsAlthough "Set::IntSpan" can represent some infinite sets, it does not perform infinite-precision arithmetic. Therefore, finite elements are restricted to the range of integers on your machine.ExtensionsUsers report that you can construct Set::IntSpan objects on anything that behaves like an integer. For example:$x = new Math::BigInt ...; $set = new Set::Intspan [ [ $x, $x+5 ] ]; I'm not documenting this as supported behavior, because I don't have the resources to test it, but I'll try not to break it. If anyone finds problems with it, let me know. RootsThe sets implemented here are based on a Macintosh data structure called a region. See Inside Macintosh for more information."Set::IntSpan" was originally written to manage run lists for the "News::Newsrc" module. AUTHORSteven McDougall <swmcd@world.std.com>ACKNOWLEDGMENTS
COPYRIGHTCopyright (c) 1996-2013 by Steven McDougall. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Visit the GSP FreeBSD Man Page Interface. |