Manual Reference Pages  -  HTML::FORMATTEXT::HTML2TEXT (3)

NAME

HTML::FormatText::Html2text - format HTML as plain text using html2text

SYNOPSIS



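 use HTML::FormatText::Html2text;
 use HTML::TreeBuilder;   # needed only for the tree-based form below

 # class methods: format a file or a string directly
 $text = HTML::FormatText::Html2text->format_file ($filename);
 $text = HTML::FormatText::Html2text->format_string ($html_string);

 # or create a formatter object and format a parsed tree
 $formatter = HTML::FormatText::Html2text->new;
 $tree = HTML::TreeBuilder->new_from_file ($filename);
 $text = $formatter->format ($tree);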



DESCRIPTION

HTML::FormatText::Html2text turns HTML into plain text using the html2text program.

<http://www.mbayer.de/html2text/>

The module interface is compatible with formatters like HTML::FormatText, but all parsing etc is done by html2text.

See HTML::FormatExternal for the formatting functions and options, with the following caveats:

input_charset

Currently this option has no effect. Input generally has to be latin-1 only, though the Debian extended html2text interprets a <meta> charset directive in the HTML header.

Various &-style named or numbered entities (of the form &name; or &#NNN;) are recognised and result in suitable output. For maximum portability among html2text versions, the suggestion is to write non-ASCII characters in the input as entities.
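As a minimal sketch of the entitized-input suggestion, the following replaces each non-ASCII character with a numeric entity before formatting. The helper name entitize is hypothetical and not part of HTML::FormatExternal; the formatting call is attempted only if the module and the html2text program are both found.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::FormatExternal): replace every
# non-ASCII character with a numeric entity, leaving the markup untouched.
sub entitize {
    my ($html) = @_;
    $html =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $html;
}

my $html = "<p>caf\x{E9} 5\x{20AC}</p>";
my $safe = entitize($html);
print "$safe\n";    # prints <p>caf&#233; 5&#8364;</p>

# Format only when the module loads and html2text reports a version.
my $available = eval {
    require HTML::FormatText::Html2text;
    defined(HTML::FormatText::Html2text->program_version);
};
print HTML::FormatText::Html2text->format_string($safe) if $available;
```

The result is pure ASCII, so it should not depend on how a particular html2text build treats the input charset.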

output_charset

If set to ascii or ANSI_X3.4-1968 (both case-insensitive), the html2text -ascii option is used when available (html2text 1.3.2 from Jan 2004).

If set to UTF-8 then the Debian extension -utf8 option is used (circa 2009).

Apart from this there’s no control over the output charset.
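The output_charset rules above can be sketched as follows. The helper html2text_charset_option is hypothetical: it only restates the mapping described in the text and is not the module's own code. Whether UTF-8 is matched case-insensitively is not stated above, so an exact match is assumed. The formatting call passes output_charset as a key/value option in the HTML::FormatExternal style and is wrapped in eval, since the installed html2text may lack the requested option.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: maps an output_charset value
# to the html2text option the text above says is used for it.
sub html2text_charset_option {
    my ($charset) = @_;
    return '-ascii' if $charset =~ /\A(?:ascii|ANSI_X3\.4-1968)\z/i;
    return '-utf8'  if $charset eq 'UTF-8';   # exact match is an assumption
    return undef;                             # no control over other charsets
}

my $html = '<html><body><p>caf&eacute; au lait</p></body></html>';

# True only when the module loads and html2text reports a version.
my $available = eval {
    require HTML::FormatText::Html2text;
    defined(HTML::FormatText::Html2text->program_version);
};

for my $charset ('ascii', 'UTF-8') {
    my $option = html2text_charset_option($charset);
    print "output_charset => $charset (expected option: $option)\n";
    next unless $available;
    my $text = eval {
        HTML::FormatText::Html2text->format_string($html,
                                                   output_charset => $charset);
    };
    print defined $text ? $text : "  formatting failed: $@";
}
```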

SEE ALSO

HTML::FormatExternal, html2text(1)

HOME PAGE

<http://user42.tuxfamily.org/html-formatexternal/index.html>

LICENSE

Copyright 2008, 2009, 2010, 2013, 2015 Kevin Ryde

HTML-FormatExternal is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.

HTML-FormatExternal is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with HTML-FormatExternal. If not, see <http://www.gnu.org/licenses/>.



perl v5.20.3 HTML::FORMATTEXT::HTML2TEXT (3) 2015-08-06
