NAME

Data::Walk - Traverse Perl data structures

SYNOPSIS

 use Data::Walk;    
 walk \&wanted, @items_to_walk;

 use Data::Walk;    
 walkdepth \&wanted, @items_to_walk;
    
 use Data::Walk;    
 walk { wanted => \&process, follow => 1 }, $self;

DESCRIPTION

The above synopsis bears an amazing similarity to File::Find(3pm) and this is not coincidental.

Data::Walk(3pm) is for data what File::Find(3pm) is for files. You can use it for rolling your own serialization class, for displaying Perl data structures, for deep copying or comparing, for recursive deletion of data, or ...

If you are impatient and already familiar with File::Find(3pm), you can skip the following documentation and proceed with "DIFFERENCES TO FILE::FIND".

FUNCTIONS

The module exports two functions by default:

walk

  walk \&wanted, @items;
  walk \%options, @items;

As the name suggests, the function traverses the items in the order they are given. For every object visited, it calls the &wanted subroutine. See "THE WANTED FUNCTION" for details.
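As a minimal sketch, assuming Data::Walk is installed from CPAN, the following collects every non-reference node in the order "walk()" reaches it. The sample data and the @leaves array are made up for illustration.

```perl
use strict;
use warnings;
use Data::Walk;

# Hypothetical sample data: an array containing a nested array.
my $data = [1, [2, 3], 4];

# Collect every node that is not a reference, in visiting order.
my @leaves;
walk(sub { push @leaves, $_ unless ref $_ }, $data);

print "@leaves\n";    # prints "1 2 3 4"
```

The containers themselves are visited too; the "unless ref $_" guard merely keeps them out of the result.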

walkdepth

  walkdepth \&wanted, @items;
  walkdepth \%options, @items;

Works exactly like "walk()" but it first descends deeper into the structure, before visiting the nodes on the current level. If you want to delete visited nodes, then "walkdepth()" is probably your friend.
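The difference in ordering can be made visible with a small trace. This is only a sketch with made-up data: containers are recorded by their reference type, so the position of "ARRAY" in each trace shows when the container itself was visited.

```perl
use strict;
use warnings;
use Data::Walk;

# Hypothetical sample data: an array containing a nested array.
my $data = [1, [2, 3], 4];

# Record containers by their reference type and scalars by their value.
my (@pre, @post);
walk(sub { push @pre, ref($_) || $_ }, $data);
walkdepth(sub { push @post, ref($_) || $_ }, $data);

print "walk:      @pre\n";
print "walkdepth: @post\n";
```

With "walk()" the outer array should appear first in the trace, with "walkdepth()" it should appear last, after everything it contains.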

OPTIONS

The first argument to "walk()" and "walkdepth()" is either a code reference to your &wanted function, or a hash reference describing the operations to be performed for each visited node.

Here are the possible keys for the hash.

wanted

The value should be a code reference. This code reference is described in "THE WANTED FUNCTION" below.

bydepth

Visits nodes on the current level of recursion only after descending into subnodes. The entry point "walkdepth()" is a shortcut for specifying "{ bydepth => 1 }".

preprocess

The value should be a code reference. This code reference is used to preprocess the current node $Data::Walk::container. Your preprocessing function is called before the loop that calls the "wanted()" function. It is called with a list of member nodes and is expected to return such a list. The list will contain all sub-nodes, regardless of the value of the option follow! The list is a shallow copy of the data contained in the original structure. You can therefore safely delete items in it, without affecting the original data.

The list has the same flat form for regular arrays and hashes. For a hash it consists of alternating keys and values, so you will probably want to coerce the list passed as an argument into a hash in that case. The variable $Data::Walk::type will contain the string "HASH" if the currently inspected node is a hash.

You can use the preprocessing function to sort the items contained or to filter out unwanted items. The order is also preserved for hashes!
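A preprocessing function that sorts hash keys might look like the following sketch. The handler name sorted_pairs and the sample data are hypothetical; the handler relies on $Data::Walk::type being "HASH" while a hash is preprocessed, as described above.

```perl
use strict;
use warnings;
use Data::Walk;

# Hypothetical sample data: a flat hash whose keys should be visited sorted.
my $data = { b => 2, a => 1, c => 3 };

# For a hash, the handler receives a flat key/value list. Rebuild the hash
# and return the pairs ordered by key. Other containers pass through as-is.
sub sorted_pairs {
    return @_ unless $Data::Walk::type eq 'HASH';
    my %pairs = @_;
    return map { ($_ => $pairs{$_}) } sort keys %pairs;
}

my @visited;
walk({
    wanted     => sub { push @visited, $_ unless ref $_ },
    preprocess => \&sorted_pairs,
}, $data);

print "@visited\n";    # prints "a 1 b 2 c 3"
```

Because the handler works on a copy, the original hash is left untouched; only the visiting order changes.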

preprocess_hash

The value should be a code reference. The code is executed right after an eventual preprocess handler, but only if the current container is a hash. It is skipped for regular arrays.

You will usually prefer a preprocess_hash handler over a preprocess handler if you only want to sort hash keys.

postprocess

The value should be a code reference. It is invoked just before leaving the currently visited node. It is called in void context with no arguments. The variable $Data::Walk::container points to the currently visited node.

follow

Causes cyclic references to be followed. Normally, the traversal will not descend into nodes that have already been visited. If you set the option follow to a true value, you can change this behavior. If the data contains cyclic references, this results in an infinite loop unless you take additional measures!

Please note that the &wanted function is also called for nodes that have already been visited! Without follow, only the descent into the subnodes of such nodes is suppressed.
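The following sketch contrasts the default behavior with follow on data that references the same array twice but contains no cycle, so it terminates in both cases. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use Data::Walk;

# Hypothetical sample data: the same array is referenced twice, no cycle.
my $shared = [1, 2];
my $data   = [$shared, $shared];

# Default: $shared is reported twice, but its elements only once.
my (@default_leaves, @seen_counts);
walk(sub {
    push @default_leaves, $_ unless ref $_;
    push @seen_counts, $Data::Walk::seen if ref($_) && $_ == $shared;
}, $data);

# With follow, the second occurrence is descended into as well.
my @follow_leaves;
walk({
    wanted => sub { push @follow_leaves, $_ unless ref $_ },
    follow => 1,
}, $data);

print "default: @default_leaves\n";    # prints "default: 1 2"
print "follow:  @follow_leaves\n";     # prints "follow:  1 2 1 2"
print "seen:    @seen_counts\n";       # prints "seen:    0 1"
```

A check on $Data::Walk::seen inside a preprocess handler is one possible "additional measure" for cyclic data, although that goes beyond this sketch.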

All other options are silently ignored.

THE WANTED FUNCTION

The &wanted function does whatever verifications you want on each item in the data structure. Note that despite its name, the &wanted function is a generic callback and does not tell Data::Walk(3pm) if an item is "wanted" or not. In fact, its return value is ignored.

The wanted function takes no arguments but rather does its work through a collection of variables:

$_

The currently visited node. Think "file" in terms of File::Find(3pm)!

$Data::Walk::container

The node containing the currently visited node, either a reference to a hash or an array. Think "directory" in terms of File::Find(3pm)!

$Data::Walk::type

The base type of the object that $Data::Walk::container references. This is either "ARRAY" or "HASH" or the empty string for everything else.

$Data::Walk::seen

For references, this will hold the number of times the currently visited node has been visited before. The value is consequently 0, not 1, on the first visit. For non-references, the value is undefined.

$Data::Walk::address

For references, this will hold the memory address it points to. It can be used as a unique identifier for the current node. For non-references, the value is undefined.

$Data::Walk::depth

The depth of the current recursion.

$Data::Walk::index

Holds the index of the current item in the container. Note that hashes and arrays are treated the same. Therefore, if the current container is a hash and $Data::Walk::index is even then $_ is a hash key. If it is odd, then $_ is a hash value.

Note that the root container is the array of items to search that you passed to "walk()" or "walkdepth()"!

This variable has been added in Data::Walk version 1.01.

These variables should not be modified.
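As an illustration of how these variables combine, the following sketch collects all hash keys of a nested structure together with the depth at which they were found. The sample data and the names @keys and %depth_of are hypothetical, and no particular starting value for $Data::Walk::depth is assumed.

```perl
use strict;
use warnings;
use Data::Walk;

# Hypothetical sample data: a hash holding a nested hash.
my $data = { name => 'example', tags => { perl => 1 } };

my (@keys, %depth_of);
walk(sub {
    return if ref $_;    # hash keys are never references

    # Inside a hash, even indexes are keys and odd indexes are values.
    if ($Data::Walk::type eq 'HASH' && $Data::Walk::index % 2 == 0) {
        push @keys, $_;
        $depth_of{$_} = $Data::Walk::depth;
    }
}, $data);

printf "%s (depth %d)\n", $_, $depth_of{$_} for sort @keys;
```

The key "perl" lives one level below "name" and "tags", so its recorded depth should be larger by one.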

DIFFERENCES TO FILE::FIND

The API of Data::Walk(3pm) tries to mimic the API of File::Find(3pm) to a certain extent. If you are already familiar with File::Find(3pm) you will find it very easy to use Data::Walk(3pm). Even the documentation for Data::Walk(3pm) is in parts similar or identical to that of File::Find(3pm).

Analogies

The equivalent of directories in File::Find(3pm) are the container data types in Data::Walk(3pm). Container data types are arrays (aka lists) and associative arrays (aka hashes). Files are equivalent to scalars. Wherever File::Find(3pm) passes lists of strings to functions, Data::Walk(3pm) passes lists of variables.

Function Names

Instead of "find()" and "finddepth()", Data::Walk(3pm) uses "walk()" and "walkdepth()", as the attentive reader has already guessed after reading the "SYNOPSIS".

Variables

The variable $Data::Walk::container is vaguely equivalent to $File::Find::dir. All other variables are specific to the corresponding module.

Wanted Function

Like its archetype from File::Find(3pm), the wanted function of Data::Walk(3pm) is called with $_ set to the currently inspected item.

Options

The option follow has the effect that Data::Walk(3pm) also descends into nodes it has already visited. If the data contains cyclic references, this leads to an infinite loop unless you take extra measures!

A number of options are not applicable to data traversal and are ignored by Data::Walk(3pm). Examples are follow_fast, follow_skip, no_chdir, untaint, untaint_pattern, and untaint_skip. To be honest, all unrecognized options are skipped.

EXAMPLES

Following are some recipes for common tasks.

Recurse To Maximum Depth

If you want to stop the recursion at a certain level, do it as follows:

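    my $max_depth = 20;

    # Return no children once the maximum depth is exceeded,
    # which stops the descent at that level.
    sub not_too_deep {
        if ($Data::Walk::depth > $max_depth) {
            return ();
        } else {
            return @_;
        }
    }

    sub do_something {
        # Your code goes here.
    }

    walk { wanted => \&do_something, preprocess => \&not_too_deep }, @items_to_walk;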

BUGS

If you think you have spotted a bug, you can share it with others in the bug tracking system at http://rt.cpan.org/NoAuth/Bugs.html?Dist=Data-Walk.

COPYING

This program is free software; you can redistribute it and/or modify it under the terms of the GNU Library General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details.

You should have received a copy of the GNU Library General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.