Manual Reference Pages  -  HTML::TABLECONTENTPARSER (3)

NAME

HTML::TableContentParser - Do interesting things with the contents of tables.

SYNOPSIS



  use HTML::TableContentParser;
  $p = HTML::TableContentParser->new();
  $tables = $p->parse($html);



DESCRIPTION

This package extracts the contents of tables from a string containing HTML. Each table encountered is stored as a hash in an array. The hash holds whatever was discovered about the table (id, name, border, cellspacing and so on) and, of course, the data contained within the table.

The format of each hash will look something like



  attributes            keys from the attributes of the <table> tag
  @{$table_headers}     array of table headers, in order found
  @{$table_rows}        rows discovered, in order



If the table has a caption, this will be provided as



  caption               keys from the caption tag's attributes
    data                the text of the <caption>..</caption> element



then for each table row,

  @{$table_data}        td’s found, in order
  other attributes      the ... in <tr ...>

then for each data cell,

  data                  what comes between <td> and </td>
  other attributes      the ... in <td ...>
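The layout described above can be pictured as a nested Perl structure. The following is a minimal sketch built by hand, not actual module output. The key names rows, cells and data are taken from the EXAMPLE section; headers and caption as hash keys, and all of the sample attribute values, are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hand-built illustration of one parsed table; not actual module output.
# Key names 'rows', 'cells' and 'data' follow the EXAMPLE section;
# 'headers' and 'caption' are assumed from the layout described above.
my $table = {
    id      => 'prices',                       # attribute of <table>
    border  => 1,                              # attribute of <table>
    caption => { data => 'Price list' },
    headers => [ { data => 'Item' }, { data => 'Cost' } ],
    rows    => [
        {
            class => 'odd',                    # attribute of <tr>
            cells => [
                { data => 'Apple', align => 'left' },   # align: attribute of <td>
                { data => '0.50' },
            ],
        },
    ],
};

# Walk the structure the same way the documented example does.
my @first_row = map { $_->{data} } @{ $table->{rows}[0]{cells} };
print "Caption: $table->{caption}{data}\n";
print "First row: @first_row\n";
```

Attributes sit alongside the structural keys at each level, so a row hash carries both its own tag attributes and its cells.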

    EXAMPLE



  use HTML::TableContentParser;
  my $p = HTML::TableContentParser->new();
  my $html = read_html_from_somewhere();   # placeholder: supply an HTML string
  my $tables = $p->parse($html);
  for my $t (@$tables) {                   # each table found
    for my $r (@{$t->{rows}}) {            # each row in the table
      print "Row: ";
      for my $c (@{$r->{cells}}) {         # each cell in the row
        print "[$c->{data}] ";
      }
      print "\n";
    }
  }
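The same traversal can be wrapped in a small helper that returns cell text instead of printing it. This is only a sketch: table_rows_as_text is a hypothetical function, not part of the module, and the $tables value is a hand-built stand-in for the result of parse() so that the code runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): turn the arrayref returned
# by parse() into a list of rows, each an arrayref of cell text.
sub table_rows_as_text {
    my ($tables) = @_;
    my @out;
    for my $t (@$tables) {
        for my $r (@{ $t->{rows} || [] }) {
            # A cell with no content may lack 'data'; default to ''.
            push @out, [ map { defined $_->{data} ? $_->{data} : '' }
                         @{ $r->{cells} || [] } ];
        }
    }
    return @out;
}

# Stand-in for $p->parse($html), built by hand so the sketch runs alone.
my $tables = [
    { rows => [
        { cells => [ { data => 'a' }, { data => 'b' } ] },
        { cells => [ { data => 'c' }, {} ] },
    ] },
];

print join(',', @$_), "\n" for table_rows_as_text($tables);
```

Defaulting missing rows, cells and data to empty values keeps the loop from failing on tables that contain only headers or on empty cells.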



METHODS

start($parser, $tag, $attr, $attrseq, $origtext);
    Called whenever a particular start tag has been recognised. This is called automatically by the parser and should not be called from the application.

text($parser, $content);
    Called whenever a piece of content is encountered. This is called automatically by the parser and should not be called from the application.

end($parser, $tag, $origtext);
    Called whenever a particular end tag is encountered. This is called automatically by the parser and should not be called from the application.

$tables_ref = $p->parse($html);
    Called with the HTML to parse. This is all the application needs to do. The return value will be an arrayref containing each table encountered, in the format detailed above.

DEBUG
    Not a method, but a class variable. Set to 1 to cause debugging output (basically the structure and content of the table) to be sent to STDERR via warn().
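As a rough sketch of how parse() and DEBUG might be used together: the code below assumes the class variable is reachable as $HTML::TableContentParser::DEBUG, which this page does not spell out, and it skips the parse entirely if the module is not installed. The one-cell HTML string is made up for illustration.

```perl
use strict;
use warnings;

# Load the module only if it is available, so the sketch runs either way.
my $have_module = eval { require HTML::TableContentParser; 1 };

my $count;
if ($have_module) {
    no warnings 'once';
    # Assumption: DEBUG is a package variable with this fully qualified name.
    local $HTML::TableContentParser::DEBUG = 1;

    my $p      = HTML::TableContentParser->new();
    my $tables = $p->parse('<table><tr><td>a</td></tr></table>');
    $count = scalar @$tables;    # number of tables found
    print "Parsed $count table(s)\n";
}
else {
    print "HTML::TableContentParser is not installed; nothing parsed\n";
}
```

Using local confines the debugging setting to the enclosing block, so other code that uses the parser is unaffected.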

    EXPORTS

Nothing.

    CAVEATS, BUGS, and TODO

AUTHOR



  Simon Drabble  <sdrabble@cpan.org>

  (C) 2002  Simon Drabble



This software is released under the same terms as Perl.


perl v5.20.3 TABLECONTENTPARSER (3) 2002-07-14
