Manual Reference Pages  -  XML::SAX::BYRECORD (3)

NAME

XML::SAX::ByRecord - Record oriented processing of (data) documents

VERSION

version 0.46

SYNOPSIS



    use XML::SAX::Machines qw( ByRecord ) ;

    my $m = ByRecord(
        "My::RecordFilter1",
        "My::RecordFilter2",
        ...
        {
            Handler => $h, ## optional
        }
    );

    $m->parse_uri( "foo.xml" );



DESCRIPTION

XML::SAX::ByRecord is a SAX machine that treats a document as a series of records. Everything before and after the records is emitted as-is, while the records are excerpted into mini-documents and run one at a time through the filter pipeline contained in ByRecord.

The output is a document that has exactly the same content before, after, and between the records as the input document, but with each record run through the filter pipeline. So if a document has 10 records in it, the per-record filter pipeline will see 10 sets of ( start_document, body of record, end_document ) events. An example is below.

This has several use cases:
o Big, record oriented documents

Big documents can be treated a record at a time with various DOM oriented processors like XML::Filter::XSLT.

o Streaming XML

Small sections of an XML stream can be run through a document processor without holding up the stream.

o Record oriented style sheets / processors

Sometimes it’s just plain easier to write a style sheet or SAX filter that applies to a single record at a time, rather than having to run through a series of records.

    Topology

Here’s how the innards look:



   +-----------------------------------------------------------+
   |                  An XML::SAX::ByRecord                    |
   |    Intake                                                 |
   |   +----------+    +---------+         +--------+  Exhaust |
 --+-->| Splitter |--->| Stage_1 |-->...-->| Merger |----------+----->
   |   +----------+    +---------+         +--------+          |
   |               \                            ^              |
   |                \                           |              |
   |                 +---------->---------------+              |
   |                   Events not in any records               |
   |                                                           |
   +-----------------------------------------------------------+



The Splitter is an XML::Filter::DocSplitter by default, and the Merger is an XML::Filter::Merger by default. The line that bypasses the Stage_1 ... filter pipeline is used for all events that do not occur in a record. All events that occur in a record pass through the filter pipeline.
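
The routing in the diagram can be modelled in a few lines of plain Perl. This is only a toy sketch, not the module's implementation: events are hypothetical array refs, records are assumed to be the elements directly below the root (as in the example that follows), route_events is a made-up name, and the per-record start_document / end_document events are not modelled.

```perl
use strict;
use warnings;

# Toy model only: each event is [ type, payload ]. Records are assumed
# to be the elements directly below the root element (depth 2).
sub route_events {
    my ( $events, $stage ) = @_;
    my $depth = 0;
    my @out;
    for my $ev ( @$events ) {
        my $type = $ev->[0];
        $depth++ if $type eq 'start_element';
        # Events at depth 2 or deeper belong to a record and go through
        # the stage; everything else takes the bypass line unchanged.
        push @out, $depth >= 2 ? $stage->( $ev ) : $ev;
        $depth-- if $type eq 'end_element';
    }
    return \@out;
}

# Hypothetical stage: uppercase character data, pass other events through.
my $uc_stage = sub {
    my ( $type, $payload ) = @{ $_[0] };
    return [ $type, $type eq 'characters' ? uc $payload : $payload ];
};

my @events = (
    [ start_element => 'root' ],
    [ characters    => ' a '  ],
    [ start_element => 'rec'  ],
    [ characters    => 'b'    ],
    [ end_element   => 'rec'  ],
    [ characters    => ' c '  ],
    [ end_element   => 'root' ],
);

my $merged = route_events( \@events, $uc_stage );
printf "%-14s %s\n", @$_ for @$merged;
```

Only the character data inside the record reaches the stage; the text before and after it arrives at the output untouched, which is the job of the bypass line in the diagram.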

    Example

Here’s a quick little filter to uppercase text content:



    package My::Filter::Uc;

    use vars qw( @ISA );
    @ISA = qw( XML::SAX::Base );

    use XML::SAX::Base;

    sub characters {
        my $self = shift;
        my ( $data ) = @_;
        $data->{Data} = uc $data->{Data};
        $self->SUPER::characters( @_ );
    }



And here’s a little machine that uses it:



    $m = Pipeline(
        ByRecord( "My::Filter::Uc" ),
        \$out,
    );



When fed a document like:



    <root> a
        <rec>b</rec> c
        <rec>d</rec> e
        <rec>f</rec> g
    </root>



the output looks like:



    <root> a
        <rec>B</rec> c
        <rec>D</rec> e
        <rec>F</rec> g
    </root>



and the My::Filter::Uc got three sets of events like:



    start_document
    start_element: <rec>
    characters:    b
    end_element:   </rec>
    end_document

    start_document
    start_element: <rec>
    characters:    d
    end_element:   </rec>
    end_document

    start_document
    start_element: <rec>
    characters:    f
    end_element:   </rec>
    end_document
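
For reference, the pieces of this example can be assembled into a single script. This is a sketch, assuming XML::SAX::Machines, XML::SAX::Writer and a SAX parser are installed. The $docs_seen counter, the start_document override and the use of parse_string on an inline document are additions for illustration and are not part of the original example. The filter is passed as an instance rather than a class name so that the sketch does not depend on how the machine loads classes.

```perl
use strict;
use warnings;

package My::Filter::Uc;
use XML::SAX::Base;
our @ISA = qw( XML::SAX::Base );

# Illustration only: counts how many per-record documents the filter sees.
our $docs_seen = 0;

sub start_document {
    my $self = shift;
    $docs_seen++;
    $self->SUPER::start_document( @_ );
}

# Uppercase the text content, then pass the event downstream.
sub characters {
    my $self = shift;
    my ( $data ) = @_;
    $data->{Data} = uc $data->{Data};
    $self->SUPER::characters( @_ );
}

package main;
use XML::SAX::Machines qw( Pipeline ByRecord );

my $out = "";
my $m = Pipeline(
    ByRecord( My::Filter::Uc->new ),
    \$out,                         # serialized output is collected here
);

$m->parse_string(
    "<root> a <rec>b</rec> c <rec>d</rec> e <rec>f</rec> g </root>"
);

print $out, "\n";
print STDERR "record documents seen: $My::Filter::Uc::docs_seen\n";
```

The counter is a cheap way to check that the filter runs once per record instead of once for the whole document.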



METHODS

new


    my $d = XML::SAX::ByRecord->new( @channels, \%options );



Longhand for calling the ByRecord function exported by XML::SAX::Machines.
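
A minimal sketch of the longhand form, assuming XML::SAX::Writer is installed and used as the downstream handler. The choice of a plain XML::SAX::Base as a pass-through stage and the inline test document are assumptions made for illustration only.

```perl
use strict;
use warnings;
use XML::SAX::Base;
use XML::SAX::ByRecord;
use XML::SAX::Writer;

my $out = "";
my $h   = XML::SAX::Writer->new( Output => \$out );

# One per-record stage followed by the options hash, as in the
# signature above. XML::SAX::Base passes every event through unchanged.
my $m = XML::SAX::ByRecord->new(
    "XML::SAX::Base",
    { Handler => $h },
);

$m->parse_string( "<root><rec>b</rec></root>" );
print $out, "\n";
```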

CREDIT

Proposed by Matt Sergeant, with advice from Kip Hampton and Robin Berjon.

Writing an aggregator.

To be written. In short, start_manifold_processing and end_manifold_processing need to be provided. See XML::Filter::Merger and its source code for a starter.

AUTHORS

o Barry Slaymaker
o Chris Prather <chris@prather.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Barry Slaymaker.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.


perl v5.20.3 XML::SAX::BYRECORD (3) 2013-08-19
