Manual Reference Pages  -  WWW::ROBOTRULES::PARSER (3)

NAME

WWW::RobotRules::Parser - Just Parse robots.txt

SYNOPSIS



  use WWW::RobotRules::Parser;
  my $p = WWW::RobotRules::Parser->new;
  $p->parse($robots_txt_uri, $text);

  $p->parse_uri($robots_txt_uri);



DESCRIPTION

WWW::RobotRules::Parser parses robots.txt files as described in http://www.robotstxt.org/wc/norobots.html. Unlike WWW::RobotRules (which is very cool), this module does not take your user agent name into account when parsing. It just parses the structure and returns a hash containing the whole set of rules, which you can then use however you like.

I mainly wrote this to store the parsed data structure elsewhere for later use, without having to specify a user agent.
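As a rough illustration of that use case, the sketch below serializes a rule set to JSON and restores it. It assumes the parsed result is a plain hash of user agent names to path lists, as described under METHODS; the sample data is hypothetical, and JSON::PP (a core module) is just one possible storage format, not something this module requires.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical sample data, shaped like the structure described under METHODS.
# In real use this would come from $p->parse(...) or $p->parse_uri(...).
my $rules = {
    '*'                 => [ '/private', '/also_private' ],
    'Another UserAgent' => [ '/dont_look' ],
};

# canonical(1) sorts hash keys so the serialized form is stable across runs.
my $json = JSON::PP->new->canonical(1);

# Serialize for storage (a file, a database column, a cache, ...).
my $serialized = $json->encode($rules);

# Later: restore the structure without re-fetching or re-parsing robots.txt.
my $restored = $json->decode($serialized);

print "$serialized\n";
```

Because no user agent is baked into the stored data, the same record can be reused later for any crawler name.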

METHODS

    new

Creates a new instance of WWW::RobotRules::Parser.

    parse($uri, $text)

Given the URI of the robots.txt file and its contents, parses the content and returns a data structure that looks like the following:



  {
     '*' => [ '/private', '/also_private' ],
     'Another UserAgent' => [ '/dont_look' ]
  }



Where the key is the user agent name, and the value is an arrayref of all paths that are prohibited for that user agent.
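Since the module leaves interpretation to the caller, a lookup over this structure might be sketched as follows. This assumes the result is a hash reference shaped like the example above. `is_disallowed` is a hypothetical helper, not part of WWW::RobotRules::Parser, and both the fallback to `*` and the plain prefix match are assumptions about how you might choose to apply the rules.

```perl
use strict;
use warnings;

# Hypothetical sample shaped like the structure shown above.
my $rules = {
    '*'                 => [ '/private', '/also_private' ],
    'Another UserAgent' => [ '/dont_look' ],
};

# Illustrative helper, not provided by the module.
# Uses the agent's own rule list if one exists, otherwise the '*' list,
# and treats each rule as a plain path prefix.
sub is_disallowed {
    my ($rules, $agent, $path) = @_;
    my $paths = exists $rules->{$agent} ? $rules->{$agent} : $rules->{'*'};
    return 0 unless $paths;
    for my $prefix (@$paths) {
        next unless length $prefix;    # skip empty rules, which block nothing
        return 1 if index($path, $prefix) == 0;
    }
    return 0;
}

for my $check ( [ 'MyBot', '/private/data.html' ],
                [ 'Another UserAgent', '/private/data.html' ] ) {
    my ($agent, $path) = @$check;
    printf "%s %s: %s\n", $agent, $path,
        is_disallowed($rules, $agent, $path) ? 'disallowed' : 'allowed';
}
```

Here an agent with its own entry is checked only against that entry, so "Another UserAgent" is not bound by the `*` rules; adjust the policy if you want the two lists merged.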

    parse_uri($uri)

Given the URI of the robots.txt file, retrieves and parses the file.

SEE ALSO

WWW::RobotRules

AUTHOR

Copyright (c) 2006-2007 Daisuke Maki <daisuke@endeworks.jp>

LICENSE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

See http://www.perl.com/perl/misc/Artistic.html

perl v5.20.3 WWW::ROBOTRULES::PARSER (3) 2007-12-01
