|
NAME
HTML::DOM - A Perl implementation of the HTML Document Object Model

VERSION
Version 0.058 (alpha)

WARNING: This module is still at an experimental stage. The API is subject to change without notice.

SYNOPSIS
use HTML::DOM;
my $dom_tree = new HTML::DOM; # empty tree
$dom_tree->write($source_code);
$dom_tree->close;
my $other_dom_tree = new HTML::DOM;
$other_dom_tree->parse_file($filename);
$dom_tree->getElementsByTagName('body')->[0]->appendChild(
$dom_tree->createElement('input')
);
print $dom_tree->innerHTML, "\n";
my $text = $dom_tree->createTextNode('text');
$text->data; # get attribute
$text->data('new value'); # set attribute

DESCRIPTION
This module implements the HTML Document Object Model by extending the HTML::Tree modules. The HTML::DOM class serves both as an HTML parser and as the document class.

The following DOM modules are currently supported:

Feature         Version (aka level)
-------         -------------------
HTML            2.0
Core            2.0
Events          2.0
UIEvents        2.0
MouseEvents     2.0
MutationEvents  2.0
HTMLEvents      2.0
StyleSheets     2.0
CSS             2.0 (partially)
CSS2            2.0
Views           2.0

StyleSheets, CSS and CSS2 are actually provided by CSS::DOM. This list corresponds to CSS::DOM versions 0.02 to 0.14.

METHODS
Construction and Parsing
If "referrer" and "url" are omitted, they can be inferred from "response".
Other DOM Methods
Other (Non-DOM) Methods
(See also "EVENT HANDLING", below.)

HASH ACCESS
You can use an HTML::DOM object as a hash ref to access its form elements by name. So "$doc->{yayaya}" is short for "$doc->forms->{yayaya}".

EVENT HANDLING
HTML::DOM supports both the DOM Level 2 event model and the HTML 4 event model.

Throughout this documentation, we make use of HTML 5's distinction between handlers and listeners: An event handler is the result of an HTML attribute beginning with 'on', e.g. onsubmit. These are also accessible via the DOM. (We also use the word 'handler' in other contexts, such as the 'default event handler'.) Event listeners are registered solely with the "addEventListener" method and can be removed with "removeEventListener".

HTML::DOM accepts as an event handler a coderef, an object with a "call_with" method, or an object with "&{}" overloading. If the "call_with" method is present, it is called with the current event target as the first argument and the event object as the second. This is to allow for objects that wrap JavaScript functions (which must be called with the event target as the this value).

An event listener is a coderef, an object with a "handleEvent" method or an object with "&{}" overloading. HTML::DOM does not implement any classes that provide a "handleEvent" method, but will support any object that has one.

Listeners and handlers differ in one important aspect. A listener has to call "preventDefault" on the event object to cancel the default action. A handler simply returns a defined false value (except for mouseover events, which must return a true value to cancel the default).

Default Actions
Default actions that HTML::DOM is capable of handling internally (such as triggering a DOMActivate event when an element is clicked, and triggering a form's submit event when the submit button is activated) are dealt with automatically. You don't have to worry about those. For others, read on....
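
As a minimal sketch of the listener side of the distinction drawn under "EVENT HANDLING": a listener registered with "addEventListener" cancels the default action by calling "preventDefault", and "dispatchEvent" then reports a false value. The markup, the form name "f" and the variable names are illustrative assumptions; "createEvent" and "initEvent" are the standard DOM Level 2 calls, which this sketch assumes HTML::DOM provides as part of its Events support.

```perl
use strict;
use warnings;
use HTML::DOM;

# Build a tiny document (the markup is purely illustrative).
my $doc = HTML::DOM->new;
$doc->write('<form name="f"><input type="text" name="q"></form>');
$doc->close;

my $form = $doc->getElementsByTagName('form')->[0];

# A listener must call preventDefault itself to cancel the default action.
$form->addEventListener(submit => sub {
    my $event = shift;
    $event->preventDefault;
}, 0);

# Create a cancelable submit event: type, bubbles, cancelable.
my $event = $doc->createEvent('HTMLEvents');
$event->initEvent('submit', 1, 1);

# dispatchEvent returns a boolean: should the default action be taken?
my $proceed = $form->dispatchEvent($event);
print $proceed ? "default action would run\n"
               : "default action cancelled\n";
```

An event handler (as opposed to a listener) would instead signal cancellation through its return value, as described above.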

To specify the default actions associated with an event, provide a subroutine via the "default_event_handler_for" and "default_event_handler" methods. (Since this is not part of the DOM, you can't use an object with a "handleEvent" method here.)

With the former, you can specify the default action to be taken when a particular type of event occurs. The currently supported types are:

submit  when a form is submitted
link    when a link is activated (DOMActivate event)

Pass the type of event as the first argument and a code ref as the second argument. When the code ref is called, its sole argument will be the event object. For instance:

$dom_tree->default_event_handler_for( link => sub {
my $event = shift;
go_to( $event->target->href );
});
sub go_to { ... }

"default_event_handler_for" with just one argument returns the currently assigned coderef. With two arguments it returns the old one after assigning the new one.

Use "default_event_handler" (without the "_for") to specify a fallback subroutine that will be used for events not in the list above, and for events in the list above that do not have subroutines assigned to them. Without any arguments it will return the currently assigned coderef. With an argument it will return the old one after assigning the new one.

Dispatching Events
HTML::DOM::Node's "dispatchEvent" method triggers the appropriate event listeners, but does not call any default actions associated with the event. The return value is a boolean that indicates whether the default action should be taken.

HTML::DOM::Node's "trigger_event" method will trigger the event for real. It will call "dispatchEvent" and, provided it returns true, will call the default event handler.

HTML Event Attributes
The "event_attr_handler" method can be used to assign a coderef that will turn text assigned to an event attribute (e.g., "onclick") into an event handler. The arguments to the routine will be (0) the element, (1) the name (aka type) of the event (without the initial 'on'), (2) the value of the attribute and (3) the offset within the source of the attribute's value. (Actually, if the value is within quotes, it is the offset of the first quotation mark. Also, it will be "undef" for generated HTML [source code passed to the "write" method by an element handler].)

As with "default_event_handler", you can replace an existing handler with a new one, in which case the old handler is returned. If you call this method without arguments, it returns the current handler. Here is an example of its use, which assumes that handlers are Perl code:

$dom_tree->event_attr_handler(sub {
my($elem, $name, $code, $offset) = @_;
my $sub = eval "sub { $code }";
return sub {
local *_ = \$elem;
&$sub;
};
});

The event attribute handler will be called whenever an element attribute whose name begins with 'on' (case-tolerant) is modified. (For efficiency's sake, I may change it to call the event attribute handler only when the event is triggered, so it is not called unnecessarily.)

When an Event Handler Dies
Use "error_handler" to assign a coderef that will be called whenever an event listener (or handler) raises an error. The error will be contained in $@.

Other Event-Related Methods
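
A minimal sketch of the "error_handler" hook described under "When an Event Handler Dies", combined with "trigger_event". The element id "target", the 'change' event type and the @errors array are illustrative assumptions, not part of the module; "createEvent", "initEvent" and "getElementById" are the standard DOM Level 2 calls, which this sketch assumes HTML::DOM provides.

```perl
use strict;
use warnings;
use HTML::DOM;

# Build a tiny document (the markup is purely illustrative).
my $doc = HTML::DOM->new;
$doc->write('<p id="target">text</p>');
$doc->close;

# Collect errors raised by listeners; the message is found in $@.
my @errors;
$doc->error_handler(sub { push @errors, $@ });

my $para = $doc->getElementById('target');

# A deliberately failing listener.
$para->addEventListener(change => sub { die "listener failed\n" });

# Create the event object: type, bubbles, cancelable.
my $event = $doc->createEvent('HTMLEvents');
$event->initEvent('change', 1, 0);

# trigger_event calls dispatchEvent and then any default event handler.
$para->trigger_event($event);

print "caught: $_" for @errors;
```

Collecting the messages in an array, rather than printing from inside the error handler, keeps the reporting policy in the calling code.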

CLASSES AND DOM INTERFACES
Here is the inheritance hierarchy of HTML::DOM's various classes and the DOM interfaces those classes implement. The classes in the left column all begin with 'HTML::DOM::', which is omitted for brevity, except for HTML::DOM itself, which is listed with its full name.

Items in brackets have not yet been implemented.

(See also HTML::DOM::Interface for a machine-readable list of standard methods.)

Class Inheritance Hierarchy         Interfaces
---------------------------         ----------
Exception                           DOMException, EventException
Implementation                      DOMImplementation,
                                      [DOMImplementationCSS]
Node                                Node, EventTarget
    DocumentFragment                DocumentFragment
    HTML::DOM                       Document, HTMLDocument,
                                      DocumentEvent, DocumentView,
                                      DocumentStyle, [DocumentCSS]
    CharacterData                   CharacterData
        Text                        Text
        Comment                     Comment
    Element                         Element, HTMLElement,
                                      ElementCSSInlineStyle
        Element::HTML               HTMLHtmlElement
        Element::Head               HTMLHeadElement
        Element::Link               HTMLLinkElement, LinkStyle
        Element::Title              HTMLTitleElement
        Element::Meta               HTMLMetaElement
        Element::Base               HTMLBaseElement
        Element::IsIndex            HTMLIsIndexElement
        Element::Style              HTMLStyleElement, LinkStyle
        Element::Body               HTMLBodyElement
        Element::Form               HTMLFormElement
        Element::Select             HTMLSelectElement
        Element::OptGroup           HTMLOptGroupElement
        Element::Option             HTMLOptionElement
        Element::Input              HTMLInputElement
        Element::TextArea           HTMLTextAreaElement
        Element::Button             HTMLButtonElement
        Element::Label              HTMLLabelElement
        Element::FieldSet           HTMLFieldSetElement
        Element::Legend             HTMLLegendElement
        Element::UL                 HTMLUListElement
        Element::OL                 HTMLOListElement
        Element::DL                 HTMLDListElement
        Element::Dir                HTMLDirectoryElement
        Element::Menu               HTMLMenuElement
        Element::LI                 HTMLLIElement
        Element::Div                HTMLDivElement
        Element::P                  HTMLParagraphElement
        Element::Heading            HTMLHeadingElement
        Element::Quote              HTMLQuoteElement
        Element::Pre                HTMLPreElement
        Element::Br                 HTMLBRElement
        Element::BaseFont           HTMLBaseFontElement
        Element::Font               HTMLFontElement
        Element::HR                 HTMLHRElement
        Element::Mod                HTMLModElement
        Element::A                  HTMLAnchorElement
        Element::Img                HTMLImageElement
        Element::Object             HTMLObjectElement
        Element::Param              HTMLParamElement
        Element::Applet             HTMLAppletElement
        Element::Map                HTMLMapElement
        Element::Area               HTMLAreaElement
        Element::Script             HTMLScriptElement
        Element::Table              HTMLTableElement
        Element::Caption            HTMLTableCaptionElement
        Element::TableColumn        HTMLTableColElement
        Element::TableSection       HTMLTableSectionElement
        Element::TR                 HTMLTableRowElement
        Element::TableCell          HTMLTableCellElement
        Element::FrameSet           HTMLFrameSetElement
        Element::Frame              HTMLFrameElement
        Element::IFrame             HTMLIFrameElement
NodeList                            NodeList
    NodeList::Radio
NodeList::Magic                     NodeList
NamedNodeMap                        NamedNodeMap
Attr                                Node, Attr, EventTarget
Collection                          HTMLCollection
    Collection::Elements
    Collection::Options
Event                               Event
    Event::UI                       UIEvent
        Event::Mouse                MouseEvent
    Event::Mutation                 MutationEvent
View                                AbstractView, ViewCSS

The EventListener interface is not implemented by HTML::DOM, but is supported. See "EVENT HANDLING", above.

Not listed above is HTML::DOM::EventTarget, which is a base class for both HTML::DOM::Node and HTML::DOM::Attr. The format I'm using above doesn't allow for multiple inheritance, so I probably need to redo it.

HTML::DOM::Node also implements the HTML::Element interface, but with a few differences. In particular:
IMPLEMENTATION NOTES

ACKNOWLEDGEMENTS
Much of the code was stolen from HTML::Tree. In fact, HTML::DOM used to extend HTML::Tree, but the two were merged to allow a whole pile of hacks to be removed.

PREREQUISITES
perl 5.8.3 or later
Exporter 5.57 or later
URI.pm
LWP 5.13 or later
CSS::DOM 0.06 or later
Scalar::Util 1.14 or later
HTML::Tagset 3.02 or later
HTML::Parser 3.46 or later
HTML::Encoding is required if a file name is passed to "parse_file".
Tie::RefHash::Weak 0.08 or higher, if you are using perl 5.8.x

BUGS
To report bugs, please e-mail the author.

AUTHOR, COPYRIGHT & LICENSE
Copyright (C) 2007-16 Father Chrysostomos

$text = new HTML::DOM ->createTextNode('sprout');
$text->appendData('@');
$text->appendData('cpan.org');
print $text->data, "\n";

This program is free software; you may redistribute it and/or modify it under the same terms as perl.

SEE ALSO
Each of the classes listed above under "CLASSES AND DOM INTERFACES"

HTML::DOM::Exception, HTML::DOM::Node, HTML::DOM::Event, HTML::DOM::Interface

HTML::Tree, HTML::TreeBuilder, HTML::Element, HTML::Parser, LWP, WWW::Mechanize, HTTP::Cookies, WWW::Mechanize::Plugin::JavaScript, HTML::Form, HTML::Encoding

The DOM Level 1 specification at <http://www.w3.org/TR/REC-DOM-Level-1>

The DOM Level 2 Core specification at <http://www.w3.org/TR/DOM-Level-2-Core>

The DOM Level 2 Events specification at <http://www.w3.org/TR/DOM-Level-2-Events>

etc.
|