Data::Sah(3) User Contributed Perl Documentation Data::Sah(3)

Data::Sah - Fast and featureful data structure validation

This document describes version 0.917 of Data::Sah (from Perl distribution Data-Sah), released on 2024-02-16.

Non-OO interface:

 use Data::Sah qw(
     normalize_schema
     gen_validator
 );
 my $v;
 # generate a validator for schema
 $v = gen_validator(["int*", min=>1, max=>10]);
 # validate your data using the generated validator
 say "valid" if $v->(5);     # valid
 say "valid" if $v->(11);    # invalid
 say "valid" if $v->(undef); # invalid
 say "valid" if $v->("x");   # invalid
 # generate validator which reports error message string
 $v = gen_validator(["int*", min=>1, max=>10],
                    {return_type=>'str_errmsg'});
 # ditto but the error message will be in Indonesian
 $v = gen_validator(["int*", min=>1, max=>10],
                    {return_type=>'str_errmsg', lang=>'id_ID'});
 say $v->(5);  # ''
 say $v->(12); # 'Data tidak boleh lebih besar dari 10'
               # (in English: 'Data must not be larger than 10')
 # normalize a schema
 my $nschema = normalize_schema("int*"); # => ["int", {req=>1}, {}]
 normalize_schema(["int*", min=>0]); # => ["int", {min=>0, req=>1}, {}]

OO interface (more advanced usage):

 use Data::Sah;
 my $sah = Data::Sah->new;
 # get perl compiler
 my $pl = $sah->get_compiler("perl");
 # compile schema into Perl code
 my $cd = $pl->compile(schema => ["int*", min=>0]);
 say $cd->{result};

will print something like:

 # req #0
 (defined($data))
 &&
 # check type 'int'
 (Scalar::Util::Numeric::isint($data))
 &&
 (# clause: min
 ($data >= 0))

To see the full validator code (with "sub {}" and all), you can do something like:

 % LOG_SAH_VALIDATOR_CODE=1 TRACE=1 perl -MLog::ger::LevelFromEnv -MLog::ger::Output=Screen -MData::Sah=gen_validator -E'gen_validator(["int*", min=>0])'

which will print log messages like:

 normalized schema=['int',{min => 0,req => 1},{}]
 validator code:
    1|do {
    2|    require Scalar::Util::Numeric;
    3|    sub {
    4|        my ($data) = @_;
    5|        my $_sahv_res =
     |
    7|            # req #0
    8|            (defined($data))
     |
   10|            &&
     |
   12|            # check type 'int'
   13|            (Scalar::Util::Numeric::isint($data))
     |
   15|            &&
     |
   17|            (# clause: min
   18|            ($data >= 0));
     |
   20|        return($_sahv_res);
   21|    }}

This distribution, "Data-Sah", implements compilers for producing Perl and JavaScript validators, as well as translatable human description text, from Sah schemas. A compiler approach is used instead of an interpreter so that validation runs faster.

The generated validator code can run without the "Data::Sah::*" modules.

Some features are not implemented yet:

  • def/subschema
  • obj: meths, attrs properties
  • .prio, .err_msg, .ok_err_msg attributes
  • .result_var attribute
  • BaseType: more forms of if clause

    Only the basic form of the "if" clause is implemented.

  • BaseType: postfilters
  • BaseType: prefilters.temp
  • BaseType: check, prop, check_prop clauses
  • HasElems: each_index, check_each_elem, check_each_index, exists clauses
  • HasElems: len, elems, indices properties
  • hash: check_each_key, check_each_value, allowed_keys_re, forbidden_keys_re clauses
  • array: uniq clauses
  • human compiler: markdown output

Data::Sah::Type::* roles specify Sah types, e.g. "Data::Sah::Type::bool" specifies the bool type. It can also be used to name distributions that introduce new types, e.g. "Data-Sah-Type-complex" which introduces complex number type.

Data::Sah::FuncSet::* roles specify bundles of functions, e.g. Data::Sah::FuncSet::Core specifies the core/standard functions.

Data::Sah::Compiler::$LANG:: namespace is for compilers. Each compiler might further contain "::TH::*" (type handler) and "::FSH::*" (function handler) subnamespaces to implement appropriate functionalities, e.g. Data::Sah::Compiler::perl::TH::bool is the bool type handler for the Perl compiler, Data::Sah::Compiler::perl::FSH::Core is the Core funcset handler for Perl compiler.

Data::Sah::Coerce::$LANG::To_$TARGET_TYPE::From_$SOURCE_TYPE::$DESCRIPTION contains coercion rules.

Data::Sah::Filter::$LANG::$TOPIC::$DESCRIPTION contains filtering rules.

Data::Sah::Value::$LANG::$TOPIC::$DESCRIPTION contains value codes.

Data::Sah::TypeX::$TYPENAME::$CLAUSENAME namespace can be used to name distributions that extend an existing Sah type by introducing a new clause for it. See Data::Sah::Manual::Extending for an example.

Data::Sah::Lang::$LANGCODE namespaces are for modules that contain translations. They are further organized according to the organization of other Data::Sah modules, e.g. Data::Sah::Lang::en_US::Type::int or "Data::Sah::Lang::en_US::TypeX::str::is_palindrome".

Sah::Schema:: namespace is reserved for modules that contain schemas in their $schema package variables. For example, Sah::Schema::posint.
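As a rough illustration of this convention, the sketch below defines a schema module inline and compiles the schema stored in its $schema package variable. The package name Sah::Schema::my_percent is hypothetical and made up for this example; real schema modules normally live in their own files.

```perl
use strict;
use warnings;
use Data::Sah qw(gen_validator);

# Hypothetical schema module, defined inline for illustration only.
package Sah::Schema::my_percent {
    our $schema = [int => {req=>1, min=>0, max=>100}];
}

# Compile the schema found in the module's $schema package variable.
my $is_percent = gen_validator($Sah::Schema::my_percent::schema);

print $is_percent->(42)  ? "valid\n" : "invalid\n";   # valid
print $is_percent->(101) ? "valid\n" : "invalid\n";   # invalid
```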

Sah::Schemas::* are module names for distributions that bundle several "Sah::Schema::*" modules. For example Sah::Schemas::Int contains various schemas for integers such as Sah::Schema::uint, Sah::Schema::int8, and so on.

Sah::SchemaR:: namespace is reserved to store resolved result of schema. For example, Sah::Schema::unix::local_username contains the definition for the schema "unix::local_username" which is "unix::username" with some additional coerce rules. "unix::username" in turn is defined in Sah::Schema::unix::username which is base type "str" with some clauses like minimum and maximum length as well as regular expression for valid pattern. To find out the base type of a schema (which might be defined based on another schema), one has to perform one to several lookups to "Sah::Schema::*" modules. A "Sah::SchemaR::*" module, however, contains the "resolved" result of the definition, so by looking at Sah::SchemaR::unix::local_username one can know that the schema eventually is based on the base type "str". See Dist::Zilla::Plugin::Sah::Schemas.

Sah::SchemaV:: namespace is reserved to store generated schema validator code. See Dist::Zilla::Plugin::Rinci::GenValidator.

None exported by default.

Usage:

 # as function
 my $nclset = Data::Sah::normalize_clset($clset[, \%opts]); # => hash

Convert a clause set to its normalized form, e.g. change "{"!match"=>"abc"}" into "{"match"=>"abc", "match.op"=>"not"}". Produce a shallow copy of the input clause set hash.
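The conversion described above can be tried with a short sketch like the following, which calls the function by its fully qualified name as in the usage line and prints the resulting keys. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Data::Sah;

my $clset  = {"!match" => "abc"};
my $nclset = Data::Sah::normalize_clset($clset);

# Print the normalized clause set in sorted key order.
print "$_ => $nclset->{$_}\n" for sort keys %$nclset;

# The input hash keeps its original key; the result is a separate copy.
print "original keys: ", join(", ", sort keys %$clset), "\n";
```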

Usage:

 # as function
 my $nschema = normalize_schema($schema); # => ARRAY

Convert $schema to its normalized form, i.e. the two-element array form. See Sah for more information about schema forms. Produces a new copy of arrayref as well as clause set hashref, even if the input $schema is already in array form. Implemented by Data::Sah::Normalize.

Can also be used as a method, see "normalize_schema (method)".

Usage:

 $code_or_str = gen_validator($schema, \%opts); # => CODE (or STR)

Generate validator code for $schema (or, if "source" option is set to true, a source code string).

Can also be used as a method, see "gen_validator (method)".

Known options (unknown options will be passed to Perl schema compiler, see Data::Sah::Compiler::perl):

  • accept_ref => BOOL (default: 0)

    Normally the generated validator accepts data, as in:

     $res = $vdr->($data);
     $res = $vdr->(42);
        

    If this option is set to true, validator accepts reference to data instead, as in:

     $res = $vdr->(\$data);
        

    This allows $data to be modified by the validator (mainly, to set default value specified in schema). For example:

     my $data;
     my $vdr = gen_validator([int => {min=>0, max=>10, default=>5}],
                             {accept_ref=>1});
     my $res = $vdr->(\$data);
     say $res;  # => 1 (success)
     say $data; # => 5
        
  • source => BOOL (default: 0)

    If set to 1, return source code string instead of compiled subroutine. Usually only needed for debugging (but see also $Log_Validator_Code and "LOG_SAH_VALIDATOR_CODE" if you want to log validator source code).
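A minimal sketch of the "source" option follows. It assumes, based on the logged validator code shown elsewhere in this page (a "do { ... sub { ... } }" expression), that the returned string evaluates to the validator subroutine; treat the string eval step as an assumption rather than documented behavior.

```perl
use strict;
use warnings;
use Data::Sah qw(gen_validator);

my $schema = ["int*", min=>1, max=>10];

# With source => 1, a string of Perl code is returned instead of a coderef.
my $src = gen_validator($schema, {source => 1});
print $src, "\n";

# Assumption: the string is a single expression yielding the validator sub,
# so a string eval should turn it back into a coderef.
my $v = eval $src or die "cannot compile validator source: $@";
print $v->(5) ? "valid\n" : "invalid\n";
```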

A mapping of compiler name and compiler ("Data::Sah::Compiler::*") objects.

Usage:

 my $sah = Data::Sah->new;

Create a new Data::Sah instance.

Usage:

 my $comp = $sah->get_compiler($name);

Get compiler object. "Data::Sah::Compiler::$name" will be loaded first and instantiated if not already so. After that, the compiler object is cached.

Example:

 my $plc = $sah->get_compiler("perl"); # loads Data::Sah::Compiler::perl
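The caching behavior can be observed with a small sketch such as the one below, which requests the same compiler twice from one Data::Sah object and compares the addresses of the returned objects. Variable names are illustrative.

```perl
use strict;
use warnings;
use Data::Sah;
use Scalar::Util qw(refaddr);

my $sah = Data::Sah->new;

# Both calls ask the same Data::Sah object for the "perl" compiler.
my $first  = $sah->get_compiler("perl");
my $second = $sah->get_compiler("perl");

print ref($first), "\n";
print refaddr($first) == refaddr($second)
    ? "same object\n" : "different objects\n";
```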

Usage:

 # as method
 my $nschema = $sah->normalize_schema($schema);

Can also be used as a function, see "normalize_schema (function)" for more details on arguments.

Usage:

 # as method
 my $nclset = $sah->normalize_clset($clset[, \%opts]); # => hash

Can also be used as function, see "normalize_clset (function)" for more details on arguments.

 my $nvarname = $sah->normalize_var($var);

Normalize a variable name in expression into its fully qualified/absolute form.

Not yet implemented (pending specification).

For example:

 [int => {min => 10, 'max=' => '2*$min'}]

$min in the above expression will be normalized as "schema:clauses.min".

 # as method
 my $vdr = $sah->gen_validator($schema [ , \%opts ]); # => coderef
 # as function
 my $vdr = gen_validator($schema [ , \%opts ]); # => coderef

Can also be used as a function, see "gen_validator (function)" for more details on arguments.

See also Sah::FAQ.

See Sah::FAQ.

You probably do not reuse the compiled schema, e.g. you continually destroy and recreate Data::Sah object, or repeatedly recompile the same schema. To gain the benefit of compilation, you need to keep the compiled result and use the generated Perl code repeatedly.
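One way to keep the compiled result around is sketched below: a small cache keyed by a schema name, so each schema is compiled once and the same coderef is reused afterwards. The cache hash and the validator_for() helper are hypothetical names for this example, not part of Data::Sah.

```perl
use strict;
use warnings;
use Data::Sah qw(gen_validator);

my %validator_for;   # hypothetical cache: name => compiled validator

sub validator_for {
    my ($name, $schema) = @_;
    # Compile only on first use; later calls reuse the same coderef.
    return $validator_for{$name} //= gen_validator($schema);
}

for my $data (5, 11, "x") {
    my $v = validator_for(small_int => ["int*", min=>1, max=>10]);
    print "$data: ", ($v->($data) ? "valid" : "invalid"), "\n";
}
```

Only the first loop iteration pays the compilation cost; the remaining iterations call the already compiled subroutine.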

For example:

 // if first element is an integer, require the array to contain only integers,
 // otherwise require the array to contain only strings.
 ["array", {"min_len": 1, "of=": "[is_int($_[0]) ? 'int':'str']"}]

Currently no. Data::Sah does not support expressions on clauses that contain other schemas. In other words, dynamically generated schemas are not supported. To support this, if the generated code needs to run independently of Data::Sah, it would need to contain the compiler code itself (or an interpreter) to compile or evaluate the generated schema.

However, an eval_schema() Sah function which uses Data::Sah can be trivially declared and target the Perl compiler.

Use the "source => 1" option in gen_validator().

If you use the OO interface, e.g.:

 # generate perl code
 my $cd = $plc->compile(schema=>..., ...);

then the generated code is in "$cd->{result}" and you can just print it.

If you generate validator using gen_validator(), you can set environment LOG_SAH_VALIDATOR_CODE or package variable $Log_Validator_Code to true and the generated code will be logged at trace level using Log::ger. The log can be displayed using, e.g., Log::ger::Output::Screen:

 % LOG_SAH_VALIDATOR_CODE=1 TRACE=1 \
   perl -MLog::ger::LevelFromEnv -MLog::ger::Output=Screen \
   -MData::Sah=gen_validator -e '$sub = gen_validator([int => min=>1, max=>10])'

Sample output:

 normalized schema=['int',{max => 10,min => 1},{}]
 schema already normalized, skipped normalization
 validator code:
    1|do {
    2|    require Scalar::Util::Numeric;
    3|    sub {
    4|        my ($data) = @_;
    5|        my $_sahv_res =
     |
    7|            # skip if undef
    8|            (!defined($data) ? 1 :
     |
   10|            (# check type 'int'
   11|            (Scalar::Util::Numeric::isint($data))
     |
   13|            &&
     |
   15|            (# clause: min
   16|            ($data >= 1))
     |
   18|            &&
     |
   20|            (# clause: max
   21|            ($data <= 10))));
     |
   23|        return($_sahv_res);
   24|    }}

Lastly, you can also use validate-with-sah CLI utility from the App::SahUtils distribution (use the "--show-code" option).

Pass the "return_type=>"str_errmsg"" option to get an error message string on error, or "return_type=>"hash_details"" to get a hash of detailed error messages. Note also that the error messages are translatable (e.g. use the "LANG" environment variable or the "lang=>..." option). For example:

 my $v = gen_validator([int => between => [1,10]], {return_type=>"str_errmsg"});
 say "$_: ", $v->($_) for 1, "x", 12;

will output:

 1:
 x: Input is not of type integer
 12: Must be between 1 and 10

If you pass "return_type=>"hash_details"" then the generated validator code can return a hashref containing all the errors (in the "errors" key) and warnings (in the "warnings" key) instead of just a boolean (when "return_type=>"bool_valid"") or a string containing the first encountered error message (when "return_type=>"str_errmsg"").

If you use "return_type=>"hash_details"", the generated validator code will also return the input data after the default is filled in or coercion is done in the "value" key of the result hashref. Or, if you do not need a validator that checks for all errors/warnings, you can use "return_type=>"bool_valid+val"" or "return_type=>"str_errmsg+val"". For example:

 my $v = gen_validator(["date", {"x.perl.coerce_to"=>"DateTime"}],
                       {return_type=>"str_errmsg+val"});
 my ($err, $val) = @{ $v->("2016-05-14") };

The validator will return an error message string (or an empty string if validation succeeds) as well as the final value. In the example above, $val will contain a DateTime object. This is convenient because the final value is usually what the rest of the program goes on to use after validation.
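The same idea with "bool_valid+val" and a schema default can be sketched as follows. This is a minimal example under the assumption that the default is filled in before the value is returned, as the text above describes; variable names are illustrative.

```perl
use strict;
use warnings;
use Data::Sah qw(gen_validator);

my $v = gen_validator([int => {min=>0, max=>10, default=>5}],
                      {return_type => "bool_valid+val"});

# The validator returns an arrayref: [validity flag, final value].
my ($ok, $val) = @{ $v->(undef) };
print "ok=", ($ok ? 1 : 0), " val=$val\n";
```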

It shows the path to data item that fails the validation, e.g.:

 my $v = gen_validator([array => of => [int=>min=>5]], {return_type=>"str_errmsg"});
 say $v->([10, 5, "x"]);

prints:

 @[2]: Input is not of type integer

which means that the third element (subscript 2) of the array fails the validation. Another example:

 my $v = gen_validator([array => of => [hash=>keys=>{a=>"int"}]], {return_type=>"str_errmsg"});
 say $v->([{}, {a=>1.1}]);

prints:

 @[1][a]: Input is not of type integer

Note that for validator that returns full result hashref ("return_type=>"hash_details"") the error messages in the "errors" key are also keyed with data path, albeit in a slightly different format (i.e. slash-separated, e.g. 2 and "1/a") for easier parsing.
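To inspect the path-keyed result of a "hash_details" validator, something like the following sketch can be used. It dumps the "errors" key rather than assuming the exact nesting below each path; variable names are illustrative.

```perl
use strict;
use warnings;
use Data::Dumper;
use Data::Sah qw(gen_validator);

my $v = gen_validator([array => of => [hash => keys => {a => "int"}]],
                      {return_type => "hash_details"});

# The second element has a non-integer value under key "a".
my $res = $v->([{}, {a => 1.1}]);

local $Data::Dumper::Sortkeys = 1;
print Dumper($res->{errors});
print "number of failing paths: ", scalar(keys %{ $res->{errors} }), "\n";
```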

If you are generating Perl code from schema, you can pass "debug=>1" option so the code contains logging (Log::ger-based) and other debugging information, which you can display. For example:

 % TRACE=1 perl -MLog::ger::LevelFromEnv -MLog::ger::Output=Screen \
   -MData::Sah=gen_validator -E'
   $v = gen_validator([array => of => [hash => {req_keys=>["a"]}]],
                      {return_type=>"str_errmsg", debug=>1});
   say "Validation result: ", $v->([{a=>1}, "x"]);'

will output:

 ...
 [spath=[]]skip if undef ...
 [spath=[]]check type 'array' ...
 [spath=['of']]clause: {"of":["hash",{"req_keys":["a"]}]} ...
 [spath=['of']]skip if undef ...
 [spath=['of']]check type 'hash' ...
 [spath=['of','req_keys']]clause: {"req_keys":["a"]} ...
 [spath=['of']]skip if undef ...
 [spath=['of']]check type 'hash' ...
 Validation result: [spath=of]@1: Input is not of type hash

Data::Sah offers several options for code generation. Besides compiling the validator code into a subroutine, the generated code can be used in other ways. Examples:

  • Dist::Zilla::Plugin::Rinci::Validate

    This plugin inserts the generated code (without the "sub { ... }" wrapper) to validate the content of %args right before "# VALIDATE_ARG" or "# VALIDATE_ARGS" like below:

     $SPEC{foo} = {
         args => {
             arg1 => { schema => ..., req=>1 },
             arg2 => { schema => ... },
         },
         ...
     };
     sub foo {
         my %args = @_; # VALIDATE_ARGS
     }
        

    The schemas will be retrieved from the Rinci metadata ($SPEC{foo} above). This means, subroutines in your built distribution will do argument validation.

  • Perinci::Sub::Wrapper

    This module is part of the Perinci family. What the module does is basically wrap your subroutine with a wrapper code that can include validation code (among others). This is a convenient way to add argument validation to an existing subroutine/code.

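If the LOG_SAH_VALIDATOR_CODE environment variable (or the $Log_Validator_Code package variable) is set to true, the validator code being generated will be logged (using Log::ger, at the trace level). See "SYNOPSIS" or "FAQ" for examples of how to see this log message.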

Please visit the project's homepage at <https://metacpan.org/release/Data-Sah>.

Source repository is at <https://github.com/perlancar/perl-Data-Sah>.

Data::Sah::Tiny, Params::Sah

Other interpreted validators

Params::Validate is very fast, although minimal. Data::Rx, Kwalify, Data::Verifier, Data::Validator, JSON::Schema, Validation::Class.

For Moo/Mouse/Moose: Moose type system, MooseX::Params::Validate, among others.

Form-oriented: Data::FormValidator, FormValidator::Lite, among others.

Other compiled validators

Type::Tiny

Params::ValidationCompiler

perlancar <perlancar@cpan.org>

  • mauke <lukasmai.403@gmail.com>
  • Michal Sedlák <sedlakmichal@gmail.com>
  • Steven Haryanto <stevenharyanto@gmail.com>
  • Steven Haryanto <steven@masterweb.net>
  • Szymon Nieznański <s.nez@member.fsf.org>

To contribute, you can send patches by email/via RT, or send pull requests on GitHub.

Most of the time, you don't need to build the distribution yourself. You can simply modify the code, then test via:

 % prove -l

If you want to build the distribution (e.g. to try to install it locally on your system), you can install Dist::Zilla, Dist::Zilla::PluginBundle::Author::PERLANCAR, Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two other Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps required beyond that are considered a bug and can be reported to me.

This software is copyright (c) 2024, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012 by perlancar <perlancar@cpan.org>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

Please report any bugs or feature requests on the bugtracker website <https://rt.cpan.org/Public/Dist/Display.html?Name=Data-Sah>

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

2024-02-16 perl v5.40.2
