XML::SAX::ByRecord(3) User Contributed Perl Documentation XML::SAX::ByRecord(3)

XML::SAX::ByRecord - Record oriented processing of (data) documents

version 0.46

    use XML::SAX::Machines qw( ByRecord ) ;
    my $m = ByRecord(
        "My::RecordFilter",    ## per-record filter(s); placeholder name
        {
            Handler => $h, ## optional
        }
    );
    $m->parse_uri( "foo.xml" );

XML::SAX::ByRecord is a SAX machine that treats a document as a series of records. Everything before and after the records is emitted as-is, while each record is excerpted into its own mini-document and run, one at a time, through the filter pipeline contained in ByRecord.
The output is a document that has exactly the same content before, after, and between the records as the input document, but in which each record has been run through the filter pipeline. So if a document has 10 records in it, the per-record filter pipeline will see 10 sets of ( start_document, body of record, end_document ) events. An example is below.
This has several use cases:
Big, record oriented documents
Big documents can be treated a record at a time with various DOM oriented processors like XML::Filter::XSLT.
Streaming XML
Small sections of an XML stream can be run through a document processor without holding up the stream.
Record oriented style sheets / processors
Sometimes it's just plain easier to write a style sheet or SAX filter that applies to a single record at a time, rather than one that has to run through a series of records.

Here's how the innards look:
   +-----------------------------------------------------------+
   |                  An XML::SAX::ByRecord                    |
   |    Intake                                                 |
   |   +----------+    +---------+         +--------+  Exhaust |
 --+-->| Splitter |--->| Stage_1 |-->...-->| Merger |----------+----->
   |   +----------+    +---------+         +--------+          |
   |               \                            ^              |
   |                \                           |              |
   |                 +---------->---------------+              |
   |                   Events not in any records               |
   |                                                           |
   +-----------------------------------------------------------+
The "Splitter" is an XML::Filter::DocSplitter by default, and the "Merger" is an XML::Filter::Merger by default. The line that bypasses the "Stage_1 ..." filter pipeline is used for all events that do not occur in a record. All events that occur in a record pass through the filter pipeline.

Here's a quick little filter to uppercase text content:
    package My::Filter::Uc;
    use vars qw( @ISA );
    @ISA = qw( XML::SAX::Base );
    use XML::SAX::Base;
    ## Uppercase the text of each characters event, then pass it on
    sub characters {
        my $self = shift;
        my ( $data ) = @_;
        $data->{Data} = uc $data->{Data};
        $self->SUPER::characters( @_ );
    }
And here's a little machine that uses it:
    $m = Pipeline(
        ByRecord( "My::Filter::Uc" ),
        \*STDOUT,    ## send the result to STDOUT
    );
When fed a document like:
    <root> a
        <rec>b</rec> c
        <rec>d</rec> e
        <rec>f</rec> g
    </root>
the output looks like:
    <root> a
        <rec>B</rec> c
        <rec>D</rec> e
        <rec>F</rec> g
    </root>
and My::Filter::Uc received three sets of events, one per record:
    start_document
    start_element: <rec>
    characters:    'b'
    end_element:   </rec>
    end_document

    start_document
    start_element: <rec>
    characters:    'd'
    end_element:   </rec>
    end_document

    start_document
    start_element: <rec>
    characters:    'f'
    end_element:   </rec>
    end_document
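
The filter and machine above can be combined into one script. The following is a minimal sketch, not part of the original distribution. It assumes that XML::SAX::Machines, XML::SAX::Writer, and a working XML::SAX parser are installed. Collecting the result in $output through an XML::SAX::Writer, and passing a filter instance rather than a class name, are choices made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::Filter::Uc;
use parent qw( XML::SAX::Base );

## Uppercase the text of each characters event, then pass it downstream.
sub characters {
    my ( $self, $data ) = @_;
    ## Forward a modified copy so the incoming event hash is untouched.
    $self->SUPER::characters( { %$data, Data => uc $data->{Data} } );
}

package main;
use XML::SAX::Machines qw( Pipeline ByRecord );
use XML::SAX::Writer;

## Assumption for this sketch: serialize into a string instead of STDOUT.
my $output = "";
my $writer = XML::SAX::Writer->new( Output => \$output );

my $m = Pipeline(
    ByRecord( My::Filter::Uc->new ),   ## per-record filter pipeline
    $writer,                           ## final handler for the whole document
);

$m->parse_string( "<root> a <rec>b</rec> c <rec>d</rec> e </root>" );
print $output, "\n";
```

Only the text inside each <rec> element passes through My::Filter::Uc; the text between records bypasses it and should reach the writer unchanged.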


    my $d = XML::SAX::ByRecord->new( @channels, \%options );
Longhand for calling the ByRecord function exported by XML::SAX::Machines.
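
As an illustration of the longhand form, here is a minimal sketch that builds the machine directly with new(). It assumes XML::SAX::Machines and XML::SAX::Writer are installed. A plain XML::SAX::Base object stands in for a real per-record filter (it passes events through unchanged), and the $out variable is hypothetical.

```perl
use strict;
use warnings;
use XML::SAX::Base;
use XML::SAX::ByRecord;
use XML::SAX::Writer;

## Hypothetical output buffer for this sketch.
my $out = "";

my $d = XML::SAX::ByRecord->new(
    XML::SAX::Base->new,    ## stand-in per-record filter (pass-through)
    { Handler => XML::SAX::Writer->new( Output => \$out ) },
);

$d->parse_string( "<root><rec>b</rec><rec>d</rec></root>" );
print $out, "\n";
```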

Proposed by Matt Sergeant, with advice from Kip Hampton and Robin Berjon.

To be written. In short, "start_manifold_processing" and "end_manifold_processing" need to be provided. See XML::Filter::Merger and its source code for a starting point.

Barry Slaymaker
Chris Prather

This software is copyright (c) 2013 by Barry Slaymaker.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
2013-08-19 perl v5.28.1
