XML::Tiny(3) User Contributed Perl Documentation XML::Tiny(3)

XML::Tiny - simple lightweight parser for a subset of XML

XML::Tiny is a simple lightweight parser for a subset of XML

    use XML::Tiny qw(parsefile);
    open(my $xmlfile, '<', 'something.xml') or die "Can't open something.xml: $!";
    my $document = parsefile($xmlfile);

This will leave $document looking something like this:

    [
        {
            type   => 'e',
            attrib => { ... },
            name   => 'rootelementname',
            content => [
                ...
                more elements and text content
                ...
            ]
        }
    ]

The "parsefile" function is optionally exported. By default nothing is exported. There is no objecty interface.

The "parsefile" function takes at least one parameter, optionally more. The compulsory parameter may be:
a filename
in which case the file is read and parsed;
a string of XML
in which case it is read and parsed. How do we tell if we've got a string or a filename? If it begins with "_TINY_XML_STRING_" then it's a string. That prefix is, of course, ignored when it comes to actually parsing the data. This is intended primarily for use by wrappers which want to retain compatibility with Ye Aunciente Perl. Normal users who want to pass in a string would be expected to use IO::Scalar.
a glob-ref or IO::Handle object
in which case again, the file is read and parsed.

The filename case is for compatibility with older perls, but makes no attempt to properly deal with character sets. If you open a file in a character-set-friendly way and then pass in a handle / object, then the function should Do The Right Thing as it only ever works with character data.
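
As a rough sketch of the input forms described above, the snippet below feeds the same document to "parsefile" once as a prefixed string and once as a file handle. The in-memory handle (opening a scalar reference, which needs perl 5.8 or later) is an assumption standing in for IO::Scalar, and the sample XML is made up for illustration.

```perl
use strict;
use warnings;
use XML::Tiny qw(parsefile);

# Illustrative document, not taken from the XML::Tiny distribution.
my $xml = '<greeting lang="en">hello</greeting>';

# 1. A string, marked with the documented prefix.
my $from_string = parsefile('_TINY_XML_STRING_' . $xml);

# 2. A handle. An in-memory handle is used here instead of IO::Scalar
#    (assumption: perl 5.8 or later).
open(my $fh, '<', \$xml) or die "Can't open in-memory handle: $!";
my $from_handle = parsefile($fh);
close($fh);

# Both calls return an arrayref holding the root element node.
my $root_name_string = $from_string->[0]{name};
my $root_name_handle = $from_handle->[0]{name};
```

The prefix form also works on very old perls; the handle form leaves the caller in control of how the data is read.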

The remaining parameters are a list of key/value pairs to make a hash of options:

fatal_declarations
If set to true, <!ENTITY...> and <!DOCTYPE...> declarations in the document are fatal errors - otherwise they are *ignored*.
no_entity_parsing
If set to true, the five built-in entities are passed through unparsed. Note that special characters in CDATA and attributes may have been turned into "&amp;", "&lt;" and friends.
strict_entity_parsing
If set to true, any unrecognised entities (ie, those outside the core five plus numeric entities) cause a fatal error. If you set both this and "no_entity_parsing" (but why would you do that?) then the latter takes precedence.

Obviously, if you want to maximise compliance with the XML spec, you should turn on fatal_declarations and strict_entity_parsing.
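
To illustrate the difference between the lenient defaults and the stricter options, here is a minimal sketch. The helper "parse_error" and both sample documents are hypothetical; the only XML::Tiny features used are "parsefile", the string prefix and the option names listed above. Fatal errors are caught with eval.

```perl
use strict;
use warnings;
use XML::Tiny qw(parsefile);

# Hypothetical helper: parse a string and return the error message,
# or the empty string if parsing succeeded.
sub parse_error {
    my ($xml, %options) = @_;
    my $ok = eval { parsefile('_TINY_XML_STRING_' . $xml, %options); 1 };
    return $ok ? '' : ($@ || 'unknown error');
}

my $with_doctype = '<!DOCTYPE note SYSTEM "note.dtd"><note>hi</note>';
my $with_nbsp    = '<note>fish&nbsp;chips</note>';

# Lenient defaults: declarations and unknown entities are ignored.
my $lenient_doctype = parse_error($with_doctype);
my $lenient_nbsp    = parse_error($with_nbsp);

# Strict settings: each document should now be rejected with a fatal error.
my $strict_doctype = parse_error($with_doctype, fatal_declarations    => 1);
my $strict_nbsp    = parse_error($with_nbsp,    strict_entity_parsing => 1);
```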

The function returns a structure describing the document. This contains one or more nodes, each being either an 'element' node or a 'text' node. The structure is an arrayref which contains a single 'element' node which represents the document entity. The arrayref is redundant, but exists for compatibility with XML::Parser::EasyTree.

Element nodes are hashrefs with the following keys:

type
The node's type, represented by the letter 'e'.
name
The element's name.
attrib
A hashref containing the element's attributes, as key/value pairs where the key is the attribute name.
content
An arrayref of the element's contents. The array is a list of nodes, in the order they were encountered in the document.

Text nodes are hashrefs with the following keys:

type
The node's type, represented by the letter 't'.
content
A scalar piece of text.
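
Because both node types are plain hashrefs, a short recursive function is enough to walk the tree. The sketch below needs no modules: it uses a hand-built structure in the documented shape as a stand-in for "parsefile" output, and the helper name "text_of" is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: concatenate all text beneath a node, depth first.
sub text_of {
    my ($node) = @_;
    return $node->{content} if $node->{type} eq 't';   # text node: a scalar
    return join '', map { text_of($_) } @{ $node->{content} };  # element node
}

# Hand-built tree in the documented shape, standing in for what parsing
# something like <p class="intro">Hello, <b>world</b>!</p> might give.
my $tree = [
    {
        type    => 'e',
        name    => 'p',
        attrib  => { class => 'intro' },
        content => [
            { type => 't', content => 'Hello, ' },
            {
                type    => 'e',
                name    => 'b',
                attrib  => {},
                content => [ { type => 't', content => 'world' } ],
            },
            { type => 't', content => '!' },
        ],
    },
];

my $root = $tree->[0];          # the arrayref wraps a single root element
my $text = text_of($root);
print "$root->{name}: $text\n"; # prints "p: Hello, world!"
```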

If you prefer a DOMmish interface, then look at XML::Tiny::DOM on the CPAN.

The "parsefile" function is so named because it is intended to work in a similar fashion to XML::Parser with the XML::Parser::EasyTree style. Instead of saying this:

  use XML::Parser;
  use XML::Parser::EasyTree;
  $XML::Parser::EasyTree::Noempty=1;
  my $p=new XML::Parser(Style=>'EasyTree');
  my $tree=$p->parsefile('something.xml');

you would say:

  use XML::Tiny;
  my $tree = XML::Tiny::parsefile('something.xml');

Any valid document that can be parsed like that using XML::Tiny should produce identical results if you use the above example of how to use XML::Parser::EasyTree.

If you find a document where that is not the case, please report it as a bug.

The module is intended to be fully compatible with every version of perl back to and including 5.004, and may be compatible with even older versions of perl 5.

The lack of Unicode and friends in older perls means that XML::Tiny does nothing with character sets. If you have a document with a funny character set, then you will need to open the file in an appropriate mode using a character-set-friendly perl and pass the resulting file handle to the module. BOMs are ignored.
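
A minimal sketch of that approach, assuming perl 5.8 or later (for encoding layers) and a UTF-8 document. The temporary file and its contents exist only to keep the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Tiny qw(parsefile);

# Write a small UTF-8 encoded document to a temporary file.
# The bytes \xc3\xa9 are the UTF-8 encoding of e-acute (U+00E9).
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} "<word>caf\xc3\xa9</word>";
close($out) or die "Can't close $path: $!";

# Reopen with a decoding layer so XML::Tiny only ever sees characters.
open(my $in, '<:encoding(UTF-8)', $path) or die "Can't open $path: $!";
my $tree = parsefile($in);
close($in);

# The text node now holds four characters rather than five raw bytes.
my $text = $tree->[0]{content}[0]{content};
```

Passing the filename instead of the handle would read the file without any decoding layer, leaving the raw bytes in the text node.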

The following parts of the XML standard are handled:

Element tags and attributes
Including "self-closing" tags like <pie type = 'steak n kidney' />;
Comments
Which are ignored;
The five "core" entities
ie "&amp;", "&lt;", "&gt;", "&apos;" and "&quot;";
Numeric entities
eg "&#65;" and "&#x41;";
CDATA
This is simply turned into PCDATA before parsing. Note how this may interact with the various entity-handling options.
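
The following sketch runs one made-up document through "parsefile" to exercise each item in the list: a self-closing tag with an attribute, a comment, core and numeric entities, and a CDATA section. The document and variable names are illustrative only.

```perl
use strict;
use warnings;
use XML::Tiny qw(parsefile);

# Made-up document touching each supported feature.
my $xml = join '',
    '<menu>',
    '<!-- comments are dropped -->',
    "<pie type = 'steak n kidney' />",
    '<note>&#65;&#x41; &amp; more</note>',
    '<raw><![CDATA[a < b]]></raw>',
    '</menu>';

my $menu = parsefile('_TINY_XML_STRING_' . $xml)->[0];
my ($pie, $note, $raw) = @{ $menu->{content} };

my $pie_type  = $pie->{attrib}{type};          # attribute of a self-closing tag
my $note_text = $note->{content}[0]{content};  # entities decoded
my $raw_text  = $raw->{content}[0]{content};   # CDATA turned into plain text
```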

The following parts of the XML standard are handled incorrectly or not at all - this is not an exhaustive list:

Namespaces
While documents that use namespaces will be parsed just fine, there's no special treatment of them. Their names are preserved in element and attribute names like 'rdf:RDF'.
DTDs and Schemas
This is not a validating parser. <!DOCTYPE...> declarations are ignored if you've not made them fatal.
Entities and references
<!ENTITY...> declarations are ignored if you've not made them fatal. Unrecognised entities are ignored by default, as are naked & characters. This means that if entity parsing is enabled you won't be able to tell the difference between "&amp;nbsp;" and "&nbsp;". If your document might use any non-core entities then please consider using the "no_entity_parsing" option, and then use something like HTML::Entities.
Processing instructions
These are ignored.
Whitespace
We do not guarantee to correctly handle leading and trailing whitespace.
Character sets
This is not practical with older versions of perl.
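
For documents that use non-core entities, the "no_entity_parsing" suggestion above could look something like the sketch below. It assumes HTML::Entities is installed and that the entities in question are ones it knows about (it covers HTML entity names, not arbitrary entities declared in a DTD). The sample document is made up.

```perl
use strict;
use warnings;
use XML::Tiny qw(parsefile);
use HTML::Entities qw(decode_entities);

# Made-up document mixing a core entity with a non-core one.
my $xml = '<p>fish &amp; chips&nbsp;now</p>';

# Leave entities in text content untouched ...
my $tree = parsefile('_TINY_XML_STRING_' . $xml, no_entity_parsing => 1);
my $raw  = $tree->[0]{content}[0]{content};

# ... and decode them afterwards, including the HTML-only &nbsp;.
my $decoded = decode_entities($raw);
```

Decoding in a separate step keeps "&amp;nbsp;" and "&nbsp;" distinguishable in the parsed tree.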

While feedback from real users about this module has been uniformly positive and helpful, some people seem to take issue with this module because it doesn't implement every last jot and tittle of the XML standard and merely implements a useful subset. A very useful subset, as it happens, which can cope with common light-weight XML-ish tasks such as parsing the results of queries to the Amazon Web Services. Many, perhaps most, users of XML do not in fact need a full implementation of the standard, and are understandably reluctant to install large complex pieces of software which have many dependencies. In fact, when they realise what installing and using a full implementation entails, they quite often don't *want* it. Another class of users, people distributing applications, often can not rely on users being able to install modules from the CPAN, or even having tools like make or a shell available. XML::Tiny exists for those people.

I welcome feedback about my code, including constructive criticism. Bug reports should be made using <http://rt.cpan.org/> or by email, and should include the smallest possible chunk of code, along with any necessary XML data, which demonstrates the bug. Ideally, this will be in the form of a file which I can drop in to the module's test suite. Please note that such files must work in perl 5.004.

For more capable XML parsers:
XML::Parser

XML::Parser::EasyTree

XML::Tiny::DOM

The requirements for a module to be Tiny
<http://beta.nntp.perl.org/group/perl.datetime/2007/01/msg6584.html>

David Cantrell <david@cantrell.org.uk>

Thanks to David Romano for some compatibility patches for Ye Aunciente Perl;

to Matt Knecht and David Romano for prodding me to support attributes, and to Matt for providing code to implement it in a quick n dirty minimal kind of way;

to the people on <http://use.perl.org/> and elsewhere who have been kind enough to point out ways it could be improved;

to Sergio Fanchiotti for pointing out a bug in handling self-closing tags, for reporting another bug that I introduced when fixing the first one, and for providing a patch to improve error reporting;

to 'Corion' for finding a bug with localised filehandles and providing a fix;

to Diab Jerius for spotting that element and attribute names can begin with an underscore;

to Nick Dumas for finding a bug when attribs have their quoting character in CDATA, and providing a patch;

to Mathieu Longtin for pointing out that BOMs exist.

Copyright 2007-2010 David Cantrell <david@cantrell.org.uk>

This software is free-as-in-speech software, and may be used, distributed, and modified under the terms of either the GNU General Public Licence version 2 or the Artistic Licence. It's up to you which one you use. The full text of the licences can be found in the files GPL2.txt and ARTISTIC.txt, respectively.

This module is also free-as-in-mason software.
2017-08-17 perl v5.32.1
