LibXML(3) User Contributed Perl Documentation LibXML(3)

XML::Filter::DOMFilter::LibXML - SAX Filter allowing DOM processing of selected subtrees

  use XML::LibXML;
  use XML::Filter::DOMFilter::LibXML;

  my $filter = XML::Filter::DOMFilter::LibXML->new(
        Handler => $handler,
        XPathContext => XML::LibXML::XPathContext->new(),
        Process => [
                    '/foo[@A="aaa"]/*/bar'    => \&process_bar,
                    'baz[parent::*/@B="bbb"]' => \&process_baz
                   ]
      );

  my $parser = XML::SAX::YourFavoriteDriver->new( Handler => $filter );

  # Some DOM processing

  sub process_bar {
    my ($node)=@_;
    my $doc=$node->ownerDocument;
    $node->appendTextChild("note","hallo world!");
    $node->parentNode->insertAfter($doc->createElement("foo"),$node);
  }

  sub process_baz {
    my ($node)=@_;
    $node->unbindNode;
  }

This module provides a compromise between SAX and DOM processing: it lets you use the DOM API on reasonably small parts of an XML document while the rest is streamed. It works as a SAX filter that temporarily builds small DOM trees around the parts selected by the given XPath expressions (with some limitations, see "LIMITATIONS").

The filter has two states, referred to here as A and B. The initial state of the filter is A.

In state A, only a limited vertical portion of the DOM tree is built. All SAX events other than start_element are immediately passed to Handler. On a start_element event, a new element node is created in the DOM tree, and any existing siblings of the new node are removed. Thus, while in state A, there is exactly one node on every level of the tree. All the XPath expressions are then checked in the context of the newly created node. If none of them matches, the filter remains in state A and passes the start_element event to Handler. Otherwise, the callback associated with the first matching expression is remembered and the filter switches to state B.
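
To make the state A picture concrete, the following sketch builds by hand the kind of single-spine tree the filter holds when a start_element event arrives, then checks which XPath expressions can select the new node. The element names, the attribute values, and the helper selects_node are made up for illustration. Testing a match by evaluating absolute expressions from the document is an assumption of this sketch, not a description of the filter's internals.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hand-built stand-in for the state A tree: one node per level,
# no children under the newest element, no siblings beside it.
my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $doc->createElement('catalog');
$root->setAttribute(lang => 'en');
$doc->setDocumentElement($root);
my $new = $doc->createElement('item');
$new->setAttribute(type => 'book');
$root->appendChild($new);

# Hypothetical helper: does $xpath, evaluated from the document, select $node?
sub selects_node {
    my ($xpath, $node) = @_;
    my @hits = grep { $_->isSameNode($node) }
               $node->ownerDocument->findnodes($xpath);
    return scalar @hits;
}

my @checks = (
    '/catalog[@lang="en"]/item',  # ancestor names and attributes are known
    '//item[@type="book"]',       # the element's own attributes are known
    '//item[title]',              # children have not arrived yet
    '//item[2]',                  # earlier siblings were already removed
);

for my $xpath (@checks) {
    printf "%-28s %s\n", $xpath,
        selects_node($xpath, $new) ? 'matches' : 'no match';
}
```

The first two expressions select the new node and the last two do not, because the information they test is not in the tree at this point.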

In state B, the filter builds a complete DOM subtree of the new element from the incoming events. No events are passed to Handler at this stage. When the subtree is complete (i.e. the corresponding end-tag is encountered), the callback associated with the XPath expression that matched is executed. The root element of the subtree is passed to the callback subroutine as its only argument.

The callback may perform any DOM operations on the subtree, including replacing it with one or more new subtrees. The callback must, however, leave the element's parent node and all of its ancestor nodes intact. Failing to do so can result in an error or unpredictable results.
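
As an illustration of replacing a subtree while leaving its parent alone, here is a minimal sketch of a callback that swaps one element for two new ones. The callback name split_name and the element names are hypothetical. The callback is called directly on a hand-parsed document so that the example runs without the filter; with the filter it would be registered in Process instead.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical callback: replace <name>First Last</name> with
# <first>First</first><last>Last</last>. Only the matched element
# is touched; its parent and ancestors stay as they are.
sub split_name {
    my ($node) = @_;
    my $doc    = $node->ownerDocument;
    my $parent = $node->parentNode;

    my ($first, $last) = split ' ', $node->textContent, 2;

    my $f = $doc->createElement('first');
    $f->appendText($first // '');
    my $l = $doc->createElement('last');
    $l->appendText($last // '');

    # Put the new subtrees where the old one was, then detach the old one.
    $parent->insertBefore($f, $node);
    $parent->insertBefore($l, $node);
    $node->unbindNode;
}

# Direct call on a small document, standing in for the filter.
my $doc = XML::LibXML->load_xml(
    string => '<person><name>Ada Lovelace</name></person>');
split_name($_) for $doc->findnodes('/person/name');
print $doc->documentElement->toString, "\n";
# prints <person><first>Ada</first><last>Lovelace</last></person>
```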

When the callback returns, all subtrees that now appear in the DOM tree under the original element's parent are serialized to SAX events and passed to Handler. They are then deleted from the DOM tree and the filter returns to state A.

Note that this type of processing severely limits the information available to the XPath engine. Most notably, elements cannot be selected by their content. The only information present in the tree when the XPath expressions are evaluated is the element's name and attributes, and the same information for all of its ancestors. Nothing is known at that time about the element's child nodes or about its position among its siblings.
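
One way to work within this limitation is to select elements by name or attributes in the XPath expression and to test their content inside the callback, where the subtree is complete. The sketch below assumes a hypothetical callback drop_drafts and made-up element names, and calls the callback directly on a hand-parsed document so that it runs without the filter.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical callback: the expression in Process would select every
# <post> by name only; the content test happens here, once the whole
# subtree is available.
sub drop_drafts {
    my ($node) = @_;
    $node->unbindNode if $node->findvalue('status') eq 'draft';
}

# Direct call on a small document, standing in for the filter.
my $doc = XML::LibXML->load_xml(string =>
    '<blog><post><status>draft</status></post>'
  . '<post><status>final</status></post></blog>');
drop_drafts($_) for $doc->findnodes('/blog/post');
print $doc->documentElement->toString, "\n";
# prints <blog><post><status>final</status></post></blog>
```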

This filter is built upon XML::LibXML::SAX::Builder module.
new
This is the constructor for this object. It takes several parameters, some of which are optional.

    XML::Filter::DOMFilter::LibXML->new(
         Handler => $handler,
         XPathContext => $xpath_context,
         Process => [ XPath => Code, XPath => Code, ... ]
       );
    

Handler - Optional output SAX handler.

XPathContext - Optional XML::LibXML::XPathContext object to be used for XPath queries. In some cases it might be useful as it allows registering namespace prefixes etc.

Process - Required. An array reference of the form "[ XPath => Code, XPath => Code, ...]" where XPath is a string containing an XPath expression and Code is a callback CODE reference.
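
The following sketch shows how these parameters might be put together for a namespaced document. The prefix atom, the hidden attribute, and the callback names are chosen for illustration only. The argument hash is built but not passed to the constructor, so the example runs with XML::LibXML alone; the final lines are a stand-alone check that the registered prefix resolves.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Bind a prefix of our choosing to the Atom namespace URI.
my $xpc = XML::LibXML::XPathContext->new();
$xpc->registerNs(atom => 'http://www.w3.org/2005/Atom');

# Hypothetical callbacks, named only for this sketch.
sub drop_entry { my ($node) = @_; $node->unbindNode }
sub mark_entry { my ($node) = @_; $node->setAttribute(seen => 'yes') }

# Arguments as they would be passed to XML::Filter::DOMFilter::LibXML->new.
# The more specific expression comes first, because the callback of the
# first matching expression is the one that runs.
my %filter_args = (
    XPathContext => $xpc,
    Process      => [
        '//atom:entry[@hidden="yes"]' => \&drop_entry,
        '//atom:entry'                => \&mark_entry,
    ],
);

# Stand-alone check that the registered prefix resolves.
my $doc = XML::LibXML->load_xml(string =>
    '<feed xmlns="http://www.w3.org/2005/Atom">'
  . '<entry hidden="yes"/><entry/></feed>');
my @all    = $xpc->findnodes('//atom:entry', $doc);
my @hidden = $xpc->findnodes('//atom:entry[@hidden="yes"]', $doc);
printf "%d entries, %d hidden\n", scalar @all, scalar @hidden;
# prints 2 entries, 1 hidden
```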

Nothing is exported.

Petr Pajas, <pajas@ufal.ms.mff.cuni.cz>

XML::LibXML, XML::LibXML::SAX, XML::LibXML::XPathContext.

2015-11-05 perl v5.32.1
