HTML::Copy(3) User Contributed Perl Documentation HTML::Copy(3)

HTML::Copy - copy an HTML file without breaking links.

Version 1.31

  use HTML::Copy;

  HTML::Copy->htmlcopy($source_path, $destination_path);

  # or

  $p = HTML::Copy->new($source_path);
  $p->copy_to($destination_path);

  # or

  open my $in, "<", $source_path or die "can't open $source_path: $!";
  $p = HTML::Copy->new($in);
  $p->source_path($source_path);    # can be omitted
                                    # when $source_path is in cwd.

  $p->destination_path($destination_path); # can be omitted
                                           # when $destination_path is in cwd.
  open my $out, ">", $destination_path or die "can't open $destination_path: $!";
  $p->copy_to($out);

This module copies an HTML file without breaking the links in the file. It is a subclass of HTML::Parser.

HTML::Parser

    HTML::Copy->htmlcopy($source_path, $destination_path);

Parse the contents of $source_path, change the links, and write the result to $destination_path.

    $html_text = HTML::Copy->parse_file($source_path, 
                                        $destination_path);

Parse the contents of $source_path and change the links as if the file were copied to $destination_path, but do not create $destination_path. Only the modified HTML is returned. The encoding of the returned string is converted to utf8.
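As a rough illustration of the call above, the following sketch builds a small temporary tree and asks for the HTML a page would contain if it were moved one directory up. The helper name preview_moved_page, the directory layout, and the file names are all assumptions made for the example; only HTML::Copy->parse_file comes from this page. The module is loaded lazily with require so that the sketch still compiles where HTML::Copy is not installed.

```perl
use strict;
use warnings;
use Cwd qw(realpath);
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of HTML::Copy): create docs/index.html that
# links to ../img/logo.png, then return the HTML as it would read if the
# page were copied to the top of the tree.
sub preview_moved_page {
    require HTML::Copy;    # loaded lazily; assumed to be installed

    # realpath gives a cleaned-up absolute path, as the module expects.
    my $root = realpath(tempdir(CLEANUP => 1));
    make_path(File::Spec->catdir($root, 'docs'),
              File::Spec->catdir($root, 'img'));

    my $source      = File::Spec->catfile($root, 'docs', 'index.html');
    my $image       = File::Spec->catfile($root, 'img', 'logo.png');
    my $destination = File::Spec->catfile($root, 'index.html');

    # Create the link target so the relative link points at a real file.
    open my $img, '>', $image or die "can't create $image: $!";
    close $img;

    open my $out, '>', $source or die "can't create $source: $!";
    print {$out} <<'HTML';
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body><img src="../img/logo.png"></body>
</html>
HTML
    close $out;

    # Nothing is written to $destination; the modified HTML is returned.
    return HTML::Copy->parse_file($source, $destination);
}
```

Calling preview_moved_page() should return the page text with the image link rewritten relative to the new location, while the destination file itself is never created.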

    $p = HTML::Copy->new($source);

Make an instance of this module, specifying the source of the HTML.

The argument $source can be a file path or a file handle. When a file handle is passed, you may need to indicate the path of the underlying file with the "source_path" method. If "source_path" is not called, the file behind the handle is assumed to be located in the current working directory.

    $p->copy_to($destination)

Parse the contents of the $source given to the "new" method, change the links, and write the result to $destination.

The argument $destination can be a file path or a file handle. When $destination is a file handle, you may need to indicate the location of the underlying file with the "destination_path" method, which must be called before "copy_to". If "destination_path" is not called, the file behind the handle is assumed to be located in the current working directory.

    $p->parse_to($destination_path)

Parse the contents of the source given to the "new" method, change the links, and return the HTML that would be written to $destination_path. Unlike "copy_to", $destination_path is not created; only the modified HTML is returned. The encoding of the returned string is converted to utf8.

    $p->source_path
    $p->source_path($path)

Get or set the source location. The source location is usually specified with the "new" method. When a file handle is passed to "new" and the file behind the handle is not in the current working directory, you need to use this method.

    $p->destination_path
    $p->destination_path($path)

Get or set the destination location. The destination location is usually specified with the "copy_to" method. When a file handle is passed to "copy_to" and the file behind the handle is not in the current working directory, you need to call this method before "copy_to".

    $p->encoding;

Get the encoding of the source HTML.

    $p->io_layer;
    $p->io_layer(':utf8');

Get or set the PerlIO layer used to read the source path and to write the destination path. It is usually determined automatically from the charset tag of $source_path. If no charset is specified, the Encode::Guess module is used.

    @suspects = $p->encode_suspects;
    $p->encode_suspects(qw/shiftjis euc-jp/);

Add suspects for guessing the text encoding of the source HTML. If the source HTML has a charset tag, adding suspects is not required.

    $p->source_html;

Obtain the contents of the source HTML.

Cleaned-up paths should be given to HTML::Copy and its instances. For example, a verbose path like '/aa/bb/../cc' may cause links to be converted incorrectly. This is a limitation of the URI module's rel method. Cwd::realpath is useful for cleaning up paths.
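The clean-up step can be sketched with the core Cwd module alone. The helper name clean_path and the temporary directory layout below are assumptions made for the example; it mirrors the '/aa/bb/../cc' case from the note above inside a scratch directory.

```perl
use strict;
use warnings;
use Cwd qw(realpath);
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: resolve "..", "." and symbolic links in an existing
# path so it is safe to hand to HTML::Copy.
sub clean_path {
    my ($path) = @_;
    my $clean = realpath($path);
    die "cannot resolve $path\n" unless defined $clean;
    return $clean;
}

# Build aa/bb and aa/cc inside a scratch directory.
my $root = realpath(tempdir(CLEANUP => 1));
make_path(File::Spec->catdir($root, 'aa', 'bb'),
          File::Spec->catdir($root, 'aa', 'cc'));

# A verbose path of the kind the note warns about.
my $verbose = "$root/aa/bb/../cc";
my $clean   = clean_path($verbose);

print "verbose: $verbose\n";
print "clean:   $clean\n";
```

The cleaned path no longer contains the ".." component. Cwd::realpath generally expects the path to exist, so for a destination file that has not been created yet, one option is to resolve its parent directory and append the file name afterwards.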

Tetsuro KURITA <tkurita@mac.com>
2013-06-21 perl v5.32.1
