Manual Reference Pages  -  W3C::LOGVALIDATOR (3)


W3C::LogValidator - The W3C Log Validator - Quality-focused Web Server log processing engine

Checks quality/validity of most popular content on a Web server



W3C::LogValidator is the main module for the W3C Log Validator, a combination of a Web server log analysis and statistics tool and a Web content quality checker.

The W3C::LogValidator can batch-process a number of documents through a number of quality-focused checks, such as HTML or CSS validation, or checking for broken links. It accepts several kinds of input, ranging from a simple list of URIs to log files from various Web servers. Because it orders the results by the number of times a document appears in the file or logs, it is, in practice, a useful way to spot the most popular documents that need work.

The Perl script bundled in the W3C::LogValidator distribution is a simple way to use the features of W3C::LogValidator. Developers can also use W3C::LogValidator as a Perl module to build applications.

The homepage for the Log Validator is at:


The simplest way to use the Log Validator is to edit the sample configuration file (samples/logprocess.conf) and run the bundled script with the -f option pointing at this configuration file: -f /path/to/logprocess.conf

The basic task of the W3C::LogValidator module is to parse a configuration file and process the relevant logs. The configuration file is passed as an argument to the constructor:

    use W3C::LogValidator;
    my $logprocessor = W3C::LogValidator->new("sample.conf");

Alternatively, it will use a default configuration and try to process Web server logs in well-known locations:

    my $logprocessor = W3C::LogValidator->new;



$processor = W3C::LogValidator->new
    Constructs a new W3C::LogValidator processor. You may pass a configuration file name, as well as a reference to a hash of attribute-value pairs, as parameters to the constructor.

e.g. for mail output:

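  my %conf = (
    "UseOutputModule" => "W3C::LogValidator::Output::Mail",
    # placeholder address: the original value is missing from this page
    "ServerAdmin" => 'webmaster@example.com',
    "verbose" => "3"
  );
  my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);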

Or e.g. for HTML output:

  my %conf = (
    "UseOutputModule" => "W3C::LogValidator::Output::HTML",
    "OutputTo" => "path/to/file.html",
    "verbose" => "0"
  );
  my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);

If given the path to a configuration file, new() will call the W3C::LogValidator::Config module to get its configuration variables. Otherwise, a default set of values is used.

    Main processing method

$processor->process
    Do-it-all method: reads the configuration file (if any), parses the log files, runs them through the processing modules, and sends the result to the output module.
$processor->find_remote_addr
    Given a log record and the type of the log (common log format, flat list of URIs, etc.), extracts the remote host or IP address.

    Modules methods

$processor->config_module
    Creates a configuration hash for a specific module, adding module-specific configuration variables and overriding existing ones if necessary.
$processor->use_modules
    Runs the data parsed off the log files through the various processing (validation) modules specified by UseValidationModule in the configuration.
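
The "adding and overriding" behaviour described for config_module can be pictured as a plain hash merge. The following is only a minimal sketch of that idea, not the module's actual code: merge_config, %global, %module_specific and the ExampleOption key are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical global settings, as they might come from a configuration file.
my %global = (
    verbose         => 1,
    UseOutputModule => 'W3C::LogValidator::Output::HTML',
);

# Hypothetical module-specific settings; "verbose" overrides the global value.
my %module_specific = (
    verbose       => 3,
    ExampleOption => 10,    # illustrative key, not a documented variable
);

# When two hashes are flattened into one list, later keys win,
# so module-specific values override the global ones.
sub merge_config {
    my ($base, $extra) = @_;
    return { %$base, %$extra };
}

my $module_conf = merge_config(\%global, \%module_specific);
print "$_ = $module_conf->{$_}\n" for sort keys %$module_conf;
```

Building a new hash reference leaves the global settings untouched, so each module can get its own view of the configuration.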

    Log parsing and URI methods

$processor->read_logfiles
    Loops through and parses all log files specified in the configuration.
$processor->read_logfile('path/to.file')
    Extracts URIs and number of hits from a given log file, and feeds them to the processor’s URI/Hits table.
$processor->find_uri
    Given a log record and the type of the log (common log format, flat list of URIs, etc.), extracts the URI.
$processor->remove_duplicates
    Given a URI, removes directory index suffixes such as index.html, so that http://foobar/ and http://foobar/index.html are counted as one resource.
$processor->add_uri
    Adds a URI to the processor’s URI/Hits table.
$processor->sorted_uris
    Returns the list of URIs in the processor’s table, sorted by popularity (hits).
$processor->no_cgi
    Tests whether a given URI contains a CGI query string.
$processor->hit
    Returns the number of hits for a given URI. Basically a public method accessing $hits{$uri}.
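
The way these methods fit together can be illustrated with a small standalone script. This is a sketch under stated assumptions rather than the module's implementation: it handles only Common Log Format records, it chooses to skip URIs that carry a query string, and find_uri_sketch, strip_index, has_query and add_uri_sketch are hypothetical helper names.

```perl
use strict;
use warnings;

my %hits;    # URI => number of hits

# Extract the request URI from a Common Log Format record (undef if no match).
sub find_uri_sketch {
    my ($record) = @_;
    return $record =~ /"[A-Z]+ (\S+)[^"]*"/ ? $1 : undef;
}

# Strip a directory index suffix so /dir/ and /dir/index.html count as one.
sub strip_index {
    my ($uri) = @_;
    $uri =~ s{/index\.html?\z}{/};
    return $uri;
}

# True if the URI carries a query string.
sub has_query {
    my ($uri) = @_;
    return $uri =~ /\?/ ? 1 : 0;
}

# Record one hit for the normalized URI.
sub add_uri_sketch {
    my ($uri) = @_;
    $hits{ strip_index($uri) }++;
}

# Made-up sample records using documentation-only IP addresses.
my @records = (
    '192.0.2.1 - - [18/Nov/2008:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '192.0.2.2 - - [18/Nov/2008:10:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '192.0.2.3 - - [18/Nov/2008:10:00:02 +0000] "GET /about.html HTTP/1.1" 200 256',
    '192.0.2.1 - - [18/Nov/2008:10:00:03 +0000] "GET /search?q=css HTTP/1.1" 200 128',
);

for my $record (@records) {
    my $uri = find_uri_sketch($record);
    next unless defined $uri;
    next if has_query($uri);    # assumption of this sketch: skip query strings
    add_uri_sketch($uri);
}

# Most popular first; ties broken alphabetically for a stable order.
my @sorted = sort { $hits{$b} <=> $hits{$a} or $a cmp $b } keys %hits;
printf "%d %s\n", $hits{$_}, $_ for @sorted;
```

With the sample records above, / ends up with two hits (one of them recorded as /index.html in the log) and /about.html with one, so / is listed first.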


Public bug-tracking interface at


Olivier Thereaux, for The World Wide Web Consortium


Up-to-date information on the Log Validator at:

    Articles and Tutorials

Several articles have been written within the W3C Quality Assurance Interest Group on the topic of improving the quality of Web sites, notably by using a step-by-step approach and relying upon the Log Validator to help find the areas to fix in priority.
My Web site is standard! And yours? Available at
Web Standards Switch or how to improve your Web site easily.

Available in several languages at:

Making your website valid: a step by step guide. Available at
perl v5.20.3 W3C::LOGVALIDATOR (3) 2008-11-18