Manual Reference Pages  -  HTML_FMT (1)

NAME

html_fmt - Reformat HTML, indented according to structure

SYNOPSIS



    html_fmt [uri|file]



EXAMPLE



    html_fmt http://perl.org



DESCRIPTION

Given the URI or the name of a file, writes it to STDOUT reformatted and indented according to the HTML structure. Missing start and end tags are supplied and comments added to indicate this. Text inside <pre> elements is not altered.

html_fmt tries to parse everything that is actually out there on the Web. In fact, html_fmt will assume any file fed to it was intended as HTML, and will produce its best guess of the author’s intent.

html_fmt supplies missing start and end tags, and its parser is extremely liberal in what it accepts. When even this liberal reading of the standards is not sufficient to make a document into valid HTML, html_fmt picks characters to treat as noise, or cruft. The parser ignores cruft when determining the structure of the document.

When html_fmt adds a missing start tag, it precedes the new start tag with a comment. When html_fmt adds a missing end tag, it follows the new end tag with a comment. When html_fmt classifies characters as cruft, it adds a comment to that effect before the cruft.

pre elements receive special treatment. The contents of pre elements are not reformatted. When missing tags or cruft occur inside a pre element, the comments to that effect are placed before the <pre> start tag.
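The indentation and pre handling described above can be illustrated with a minimal sketch. This is not html_fmt's implementation: it swaps in Python's standard html.parser in place of Marpa, assumes the input is already well formed, and supplies no missing tags and flags no cruft. The names Indenter and format_html are hypothetical, and the doctype is dropped for brevity.

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they must not deepen the indent.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class Indenter(HTMLParser):
    """Hypothetical helper: indent well-formed HTML by nesting depth."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.depth = 0       # current nesting depth
        self.in_pre = False  # True between <pre> and </pre>
        self.text = []       # pending text (or raw <pre> content)
        self.out = []        # finished output lines

    def _emit(self, line):
        self.out.append("  " * self.depth + line)

    def _flush(self):
        # Outside <pre>, runs of whitespace collapse to single spaces.
        joined = " ".join("".join(self.text).split())
        self.text = []
        if joined:
            self._emit(joined)

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()
        if self.in_pre:
            self.text.append(raw)      # keep markup inside <pre> verbatim
        elif tag == "pre":
            self._flush()
            self.in_pre = True
            self.text = [raw]          # start collecting the raw block
        else:
            self._flush()
            self._emit(raw)
            if tag not in VOID:
                self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID:
            return                     # e.g. the end half of <br/>
        if self.in_pre:
            self.text.append(f"</{tag}>")
            if tag == "pre":
                self._emit("".join(self.text))  # contents untouched
                self.text = []
                self.in_pre = False
            return
        self._flush()
        self.depth = max(0, self.depth - 1)
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        self.text.append(data)

    def handle_entityref(self, name):
        self.text.append(f"&{name};")

    def handle_charref(self, name):
        self.text.append(f"&#{name};")

    def handle_comment(self, data):
        if self.in_pre:
            self.text.append(f"<!--{data}-->")
        else:
            self._flush()
            self._emit(f"<!--{data}-->")


def format_html(source: str) -> str:
    parser = Indenter()
    parser.feed(source)
    parser.close()
    parser._flush()  # emit any trailing text
    return "\n".join(parser.out)
```

Only the pre start tag is indented in this sketch. Everything between it and the matching end tag is copied through unchanged, which mirrors the rule that the contents of pre elements are not reformatted.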

The argument to html_fmt can be either a URI or a file name. If it starts with alphanumerics followed by a colon, it is treated as a URI. Otherwise it is treated as a file name.
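The argument rule can be sketched in a few lines of Python. This is one reading of the rule, not html_fmt's actual code: it assumes "alphanumerics" means one or more ASCII letters or digits at the very start of the argument, and the function name classify_argument is hypothetical.

```python
import re

# Assumption: one or more ASCII letters or digits at the very start of the
# argument, immediately followed by a colon, marks the argument as a URI.
_URI_PREFIX = re.compile(r"[A-Za-z0-9]+:")


def classify_argument(arg: str) -> str:
    """Return 'uri' or 'file' according to the rule stated in the text."""
    return "uri" if _URI_PREFIX.match(arg) else "file"


if __name__ == "__main__":
    for arg in ("http://perl.org", "index.html", "./notes:old.html"):
        print(arg, "->", classify_argument(arg))
```

Under this reading, a file whose name begins with alphanumerics and a colon would be taken for a URI. Prefixing such a name with ./ avoids that, because the name then no longer starts with an alphanumeric.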

SAMPLE OUTPUT

Given this input:



    <title>Test page<tr>x<head attr="I am cruft"><p>Final graf



html_fmt returns:



    <!-- Following start tag is replacement for a missing one -->
    <html>
      <!-- Following start tag is replacement for a missing one -->
      <head>
        <title>
          Test page
        </title>
        <!-- Preceding end tag is replacement for a missing one -->
      </head>
      <!-- Preceding end tag is replacement for a missing one -->
      <!-- Following start tag is replacement for a missing one -->
      <body>
        <!-- Following start tag is replacement for a missing one -->
        <table>
          <!-- Following start tag is replacement for a missing one -->
          <tbody>
            <tr>
              <!-- Following start tag is replacement for a missing one -->
              <td>
                x
                <!-- Next line is cruft -->
                <head attr="I am cruft">
                <p>
                  Final graf
                </p>
                <!-- Preceding end tag is replacement for a missing one -->
              </td>
              <!-- Preceding end tag is replacement for a missing one -->
            </tr>
            <!-- Preceding end tag is replacement for a missing one -->
          </tbody>
          <!-- Preceding end tag is replacement for a missing one -->
        </table>
        <!-- Preceding end tag is replacement for a missing one -->
      </body>
      <!-- Preceding end tag is replacement for a missing one -->
    </html>
    <!-- Preceding end tag is replacement for a missing one -->
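Because every repair is announced by a comment, the output can be scanned to summarize how much repair a page needed. The sketch below is in Python and assumes the comment wording is exactly as it appears in the sample output above; other versions of html_fmt may word the comments differently. The names MARKERS and tally_repairs are hypothetical.

```python
from collections import Counter

# Comment strings copied from the sample output; treat them as assumptions
# about the exact wording html_fmt emits.
MARKERS = {
    "start_tag_added": "<!-- Following start tag is replacement for a missing one -->",
    "end_tag_added": "<!-- Preceding end tag is replacement for a missing one -->",
    "cruft": "<!-- Next line is cruft -->",
}


def tally_repairs(formatted: str) -> Counter:
    """Count the repair comments found in html_fmt-style output."""
    counts = Counter({name: 0 for name in MARKERS})
    for line in formatted.splitlines():
        stripped = line.strip()
        for name, marker in MARKERS.items():
            if stripped == marker:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    excerpt = """\
<!-- Following start tag is replacement for a missing one -->
<td>
  x
  <!-- Next line is cruft -->
  <head attr="I am cruft">
</td>
<!-- Preceding end tag is replacement for a missing one -->
"""
    print(dict(tally_repairs(excerpt)))
```

Applied to the full sample output above, this tally comes to 6 added start tags, 9 added end tags, and 1 line of cruft. A page that already contains comments with the same wording would be miscounted.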



PURPOSE

This program is a demo of a demo. Its purpose is to show how easy it is to write applications that look at the structure of web pages using Marpa::HTML. And the purpose of Marpa::HTML is to demonstrate the power of its parse engine, Marpa. Marpa::HTML was written in a few days, and its logic is a straightforward, natural expression of the structure of HTML.

ACKNOWLEDGMENTS

The starting template for this code was HTML::TokeParser, by Gisle Aas. See also the acknowledgments for Marpa as a whole.

LICENSE AND COPYRIGHT

Copyright 2007-2010 Jeffrey Kegler, all rights reserved. Marpa is free software under the Perl license. For details see the LICENSE file in the Marpa distribution.


perl v5.20.3 HTML_FMT (1) 2016-04-05
