GSP
Quick Navigator

Search Site

Unix VPS
A - Starter
B - Basic
C - Preferred
D - Commercial
MPS - Dedicated
Previous VPSs
* Sign Up! *

Support
Contact Us
Online Help
Handbooks
Domain Status
Man Pages

FAQ
Virtual Servers
Pricing
Billing
Technical

Network
Facilities
Connectivity
Topology Map

Miscellaneous
Server Agreement
Year 2038
Credits
 

USA Flag

 

 

Man Pages


Manual Reference Pages  -  HTML::ELEMENT::TRAVERSE (3)

.ds Aq ’

NAME

HTML::Element::traverse - discussion of HTML::Element’s traverse method

CONTENTS

VERSION

This document describes version 5.03 of HTML::Element::traverse, released September 22, 2012 as part of HTML-Tree.

SYNOPSIS



  # $element->traverse is unnecessary and obscure.
  #   Dont use it in new code.



DESCRIPTION

HTML::Element provides a method traverse that traverses the tree and calls user-specified callbacks for each node, in pre- or post-order. However, use of the method is quite superfluous: if you want to recursively visit every node in the tree, it’s almost always simpler to write a subroutine does just that, than it is to bundle up the pre- and/or post-order code in callbacks for the traverse method.

EXAMPLES

Suppose you want to traverse at/under a node $tree and give elements an ’id’ attribute unless they already have one.

You can use the traverse method:



  {
    my $counter = x0000;
    $start_node->traverse(
      [ # Callbacks;
        # pre-order callback:
        sub {
          my $x = $_[0];
          $x->attr(id, $counter++) unless defined $x->attr(id);
          return HTML::Element::OK; # keep traversing
        },
        # post-order callback:
        undef
      ],
      1, # dont call the callbacks for text nodes
    );
  }



or you can just be simple and clear (and not have to understand the calling format for traverse) by writing a sub that traverses the tree by just calling itself:



  {
    my $counter = x0000;
    sub give_id {
      my $x = $_[0];
      $x->attr(id, $counter++) unless defined $x->attr(id);
      foreach my $c ($x->content_list) {
        give_id($c) if ref $c; # ignore text nodes
      }
    };
    give_id($start_node);
  }



See, isn’t that nice and clear?

But, if you really need to know:

THE TRAVERSE METHOD

The traverse() method is a general object-method for traversing a tree or subtree and calling user-specified callbacks. It accepts the following syntaxes:
$h->traverse(\&callback)
or $h->traverse(\&callback, $ignore_text)
or $h->traverse( [\&pre_callback,\&post_callback] , $ignore_text)
These all mean to traverse the element and all of its children. That is, this method starts at node $h, pre-order visits $h, traverses its children, and then will post-order visit $h. Visiting means that the callback routine is called, with these arguments:



    $_[0] : the node (element or text segment),
    $_[1] : a startflag, and
    $_[2] : the depth



If the $ignore_text parameter is given and true, then the pre-order call will not be happen for text content.

The startflag is 1 when we enter a node (i.e., in pre-order calls) and 0 when we leave the node (in post-order calls).

Note, however, that post-order calls don’t happen for nodes that are text segments or are elements that are prototypically empty (like br, hr, etc.).

If we visit text nodes (i.e., unless $ignore_text is given and true), then when text nodes are visited, we will also pass two extra arguments to the callback:



    $_[3] : the element thats the parent
             of this text node
    $_[4] : the index of this text node
             in its parents content list



Note that you can specify that the pre-order routine can be a different routine from the post-order one:



    $h->traverse( [\&pre_callback,\&post_callback], ...);



You can also specify that no post-order calls are to be made, by providing a false value as the post-order routine:



    $h->traverse([ \&pre_callback,0 ], ...);



And similarly for suppressing pre-order callbacks:



    $h->traverse([ 0,\&post_callback ], ...);



Note that these two syntaxes specify the same operation:



    $h->traverse([\&foo,\&foo], ...);
    $h->traverse( \&foo       , ...);



The return values from calls to your pre- or post-order routines are significant, and are used to control recursion into the tree.

These are the values you can return, listed in descending order of my estimation of their usefulness:
HTML::Element::OK, 1, or any other true value ...to keep on traversing.

Note that HTML::Element::OK et al are constants. So if you’re running under use strict (as I hope you are), and you say: return HTML::Element::PRUEN the compiler will flag this as an error (an unallowable bareword, specifically), whereas if you spell PRUNE correctly, the compiler will not complain.

undef, 0, ’0’, ’’, or HTML::Element::PRUNE ...to block traversing under the current element’s content. (This is ignored if received from a post-order callback, since by then the recursion has already happened.) If this is returned by a pre-order callback, no post-order callback for the current node will happen. (Recall that if your callback exits with just return;, it is returning undef — at least in scalar context, and traverse always calls your callbacks in scalar context.)
HTML::Element::ABORT ...to abort the whole traversal immediately. This is often useful when you’re looking for just the first node in the tree that meets some criterion of yours.
HTML::Element::PRUNE_UP ...to abort continued traversal into this node and its parent node. No post-order callback for the current or parent node will happen.
HTML::Element::PRUNE_SOFTLY Like PRUNE, except that the post-order call for the current node is not blocked.
Almost every task to do with extracting information from a tree can be expressed in terms of traverse operations (usually in only one pass, and usually paying attention to only pre-order, or to only post-order), or operations based on traversing. (In fact, many of the other methods in this class are basically calls to traverse() with particular arguments.)

The source code for HTML::Element and HTML::TreeBuilder contain several examples of the use of the traverse method to gather information about the content of trees and subtrees.

(Note: you should not change the structure of a tree while you are traversing it.)

[End of documentation for the traverse() method]

    Traversing with Recursive Anonymous Routines

Now, if you’ve been reading Structure and Interpretation of Computer Programs too much, maybe you even want a recursive lambda. Go ahead:



  {
    my $counter = x0000;
    my $give_id;
    $give_id = sub {
      my $x = $_[0];
      $x->attr(id, $counter++) unless defined $x->attr(id);
      foreach my $c ($x->content_list) {
        $give_id->($c) if ref $c; # ignore text nodes
      }
    };
    $give_id->($start_node);
    undef $give_id;
  }



It’s a bit nutty, and it’s still more concise than a call to the traverse method!

It is left as an exercise to the reader to figure out how to do the same thing without using a $give_id symbol at all.

It is also left as an exercise to the reader to figure out why I undefine $give_id, above; and why I could achieved the same effect with any of:



    $give_id = I like pie!;
   # or...
    $give_id = [];
   # or even;
    $give_id = sub { print "Mmmm pie!\n" };



But not:



    $give_id = sub { print "Im $give_id and I like pie!\n" };
   # nor...
    $give_id = \$give_id;
   # nor...
    $give_id = { pie => \$give_id, mode => a la };



    Doing Recursive Things Iteratively

Note that you may at times see an iterative implementation of pre-order traversal, like so:



   {
     my @to_do = ($tree); # start-node
     while(@to_do) {
       my $this = shift @to_do;

       # "Visit" the node:
       $this->attr(id, $counter++)
        unless defined $this->attr(id);

       unshift @to_do, grep ref $_, $this->content_list;
        # Put children on the stack -- theyll be visited next
     }
   }



This can under certain circumstances be more efficient than just a normal recursive routine, but at the cost of being rather obscure. It gains efficiency by avoiding the overhead of function-calling, but since there are several method dispatches however you do it (to attr and content_list), the overhead for a simple function call is insignificant.

    Pruning and Whatnot

The traverse method does have the fairly neat features of the ABORT, PRUNE_UP and PRUNE_SOFTLY signals. None of these can be implemented totally straightforwardly with recursive routines, but it is quite possible. ABORT-like behavior can be implemented either with using non-local returning with eval/die:



  my $died_on; # if you need to know where...
  sub thing {
    ... visits $_[0]...
    ... maybe set $died_on to $_[0] and die "ABORT_TRAV" ...
    ... else call thing($child) for each child...
    ...any post-order visiting $_[0]...
  }
  eval { thing($node) };
  if($@) {
    if($@ =~ m<^ABORT_TRAV>) {
      ...it died (aborted) on $died_on...
    } else {
      die $@; # some REAL error happened
    }
  }



or you can just do it with flags:



  my($abort_flag, $died_on);
  sub thing {
    ... visits $_[0]...
    ... maybe set $abort_flag = 1; $died_on = $_[0]; return;
    foreach my $c ($_[0]->content_list) {
      thing($c);
      return if $abort_flag;
    }
    ...any post-order visiting $_[0]...
    return;
  }

  $abort_flag = $died_on = undef;
  thing($node);
  ...if defined $abort_flag, it died on $died_on



SEE ALSO

HTML::Element

AUTHOR

Current maintainers:
o Christopher J. Madsen <perl AT cjmweb.net>
o Jeff Fearn <jfearn AT cpan.org>
Original HTML-Tree author:
o Gisle Aas
Former maintainers:
o Sean M. Burke
o Andy Lester
o Pete Krawczyk <petek AT cpan.org>
You can follow or contribute to HTML-Tree’s development at <http://github.com/madsen/HTML-Tree>.

COPYRIGHT

Copyright 2000,2001 Sean M. Burke
Search for    or go to Top of page |  Section 3 |  Main Index


perl v5.20.3 HTML::ELEMENT::TRAVERSE (3) 2016-04-03

Powered by GSP Visit the GSP FreeBSD Man Page Interface.
Output converted with manServer 1.07.