Data::Sah::Compiler::Prog(3)    User Contributed Perl Documentation    Data::Sah::Compiler::Prog(3)
Data::Sah::Compiler::Prog - Base class for programming language
compilers
This document describes version 0.917 of Data::Sah::Compiler::Prog
(from Perl distribution Data-Sah), released on 2024-02-16.
This class is derived from Data::Sah::Compiler. It is used as the base
class for compilers that compile schemas into validator code in several
programming languages, Perl (Data::Sah::Compiler::perl) and JavaScript
(Data::Sah::Compiler::js) being two of them. (Other similar programming
languages such as PHP and Ruby might also be supported later if needed.)
Compilers using this base class are flexible in the kind of code
they produce:
- configurable validator return type
Can generate a validator that returns a simple boolean result, a string,
or a full data structure (containing errors, warnings, and potentially
other information).
- configurable data term
For flexibility in combining the validator code with other
code, e.g. putting it inside a subroutine wrapper (see Perinci::Sub::Wrapper)
or embedding it directly in your source code (see
Dist::Zilla::Plugin::Rinci::Validate).
The compiler generates code in the following form:
EXPR && EXPR2 && ...
where "EXPR" can be a single
expression or multiple expressions joined by the comma operator (which both
Perl and JavaScript support). Each "EXPR" is
typically generated from a single schema clause. A pseudo-example of
generated JavaScript code:
(data >= 0) // from clause: min => 0
&&
(data <= 10) // from clause: max => 10
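The same shape can be written in Perl. The following is a hand-written sketch, not actual compiler output; check_range() and $checks_run are hypothetical names used only for illustration. Each conjunct joins two expressions with the comma operator, so the counter update and the comparison together form one "EXPR".

```perl
use strict;
use warnings;

# Hand-written illustration of the "EXPR && EXPR2 && ..." form.
# Not output from Data::Sah; check_range() and $checks_run are made-up names.
sub check_range {
    my ($data) = @_;
    my $checks_run = 0;
    # In scalar context the comma operator evaluates its left side,
    # discards it, and yields the right side.
    my $ok =
        ($checks_run++, $data >= 0)      # from clause: min => 0
        &&
        ($checks_run++, $data <= 10);    # from clause: max => 10
    return ($ok ? 1 : 0, $checks_run);
}
```

Because "&&" short-circuits, a failing first clause means the second one is never evaluated, which is why the counter stops at 1 for negative input.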
Another example, a fuller translation of schema
"[int => {min=>0, max=>10}]" to
Perl, returning string result (error message) instead of boolean:
# from clause: req => 0
!defined($data) ? 1 : (
# type check
($data =~ /^[+-]?\d+$/ ? 1 : ($err //= "Data is not an integer", 0))
&&
# from clause: min => 0
($data >= 0 ? 1 : ($err //= "Must be at least 0", 0))
&&
# from clause: max => 10
($data <= 10 ? 1 : ($err //= "Must be at most 10", 0))
)
The final validator code adds the enclosing subroutine, variable
declarations, loading of modules, etc.
Note: The supported languages are currently assumed (hard-coded) to
have a ternary operator ("? :") and to use a
semicolon as the statement separator.
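As a rough sketch of what such a finished validator could look like, the expression above can be wrapped by hand in an anonymous subroutine that declares $err and returns it. The wrapper layout, the $ok variable, and the choice to return an empty string on success are assumptions made for this illustration; actual generated code differs in detail.

```perl
use 5.010;    # for the // and //= operators
use strict;
use warnings;

# Hand-written wrapper around the expression for [int => {min=>0, max=>10}].
# The subroutine layout is an assumption, not the compiler's real output.
my $validator = sub {
    my ($data) = @_;
    my $err;
    my $ok =
        # from clause: req => 0
        !defined($data) ? 1 : (
            # type check
            ($data =~ /^[+-]?\d+$/ ? 1 : ($err //= "Data is not an integer", 0))
            &&
            # from clause: min => 0
            ($data >= 0 ? 1 : ($err //= "Must be at least 0", 0))
            &&
            # from clause: max => 10
            ($data <= 10 ? 1 : ($err //= "Must be at most 10", 0))
        );
    return $err // "";    # empty string means validation succeeded
};
```

Calling $validator->(11) yields the message from the max clause, while $validator->(undef) yields an empty string because the schema does not require a value.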
- use_dpath => bool
Convenience. This is set when the code needs to track the data path,
which is when the "return_type" argument
is set to something other than "bool_valid"
or "bool_valid+val" and the schema has
subschemas. The data path is used when generating error message strings, to
help point to the item in the data structure (an array element, a hash
value) that fails validation. It is not needed when the validator only
returns true/false, nor when we do not
recurse into subschemas.
- data_term => ARRAY
Input data term. Set to
"$cd->{args}{data_term}" or a
temporary variable (if
"$cd->{args}{data_term_is_lvalue}"
is false). Hooks should use this instead of
"$cd->{args}{data_term}" directly
because, aside from the aforementioned temporary variable, the data term can
also change. For example, if the
"default.temp" or
"prefilters.temp" attribute is set,
the generated code will operate on another temporary variable to avoid
modifying the original data. Likewise, when the
".input" attribute is set, the
generated code will operate on a variable other than data.
- subs => ARRAY
Contains pairs of subroutine names and definition code strings,
e.g. "[ [_sahs_zero =>
'sub _sahs_zero { $_[0] == 0 }'], [_sahs_nonzero => 'sub
_sahs_nonzero { $_[0] != 0 }'] ]". For
flexibility, you'll need to do this bit of arranging yourself to get the
final usable code you can compile in your chosen programming
language.
- vars => HASH
- coerce_to => str
Retrieved from the schema's
"x.$COMPILER.coerce_to" attribute.
Each type handler might have its own default value.
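The arranging mentioned for "subs" could be done roughly as follows. This is a minimal sketch: the @subs contents mirror the example pairs above, while join_sub_defs() and the package name My::SahSketch are hypothetical and not part of Data::Sah.

```perl
use strict;
use warnings;

# Example pairs of [name, definition code], as they might appear in "subs".
my @subs = (
    [_sahs_zero    => 'sub _sahs_zero { $_[0] == 0 }'],
    [_sahs_nonzero => 'sub _sahs_nonzero { $_[0] != 0 }'],
);

# Join the definitions into one code string, keeping only the first
# definition seen for each subroutine name.
sub join_sub_defs {
    my @pairs = @_;
    my %seen;
    return join "", map { "$_->[1]\n" } grep { !$seen{ $_->[0] }++ } @pairs;
}

my $code = join_sub_defs(@subs);

# Compile the helper subroutines into a dedicated package (made-up name).
eval "package My::SahSketch; $code; 1" or die "compile failed: $@";
```

After this, My::SahSketch::_sahs_zero(0) returns true and the helpers can be called from the rest of the validator code.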
The generated code maintains the following variables. The
"_sahv_" prefix stands for "Sah
validator"; it is used to minimize clashes with data_term.
- _sahv_dpath => ARRAY
Analogous to "spath" in
compilation data, this variable stands for "data path" and is
used to track the location within the data. If a clause checks each element
of an array (like the 'each_elem' or 'elems' array clause), this
variable will be adjusted accordingly. Error messages can thus be more
informative by pointing more precisely to where in the data the problem
lies.
- tmp_data_term => ANY
As explained in the compile() method,
this is used to store temporary value when checking against clauses.
- _sahv_stack => ARRAY
This variable is used to store validation result of subdata.
It is only used if the validator is returning a string or full
structure, not a single boolean value. See
"Data::Sah::Compiler::js::TH::hash"
for an example.
- _sahv_x
Usually used as temporary variable in short, anonymous
functions.
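To illustrate the idea behind the data path variable, here is a hand-written sketch that checks each element of an array and records where the failure occurred. The function name, the local $dpath variable (standing in for _sahv_dpath), and the "@PATH: message" format are assumptions for this example, not the code the compiler emits.

```perl
use 5.010;    # for the // operator
use strict;
use warnings;

# Sketch of data-path tracking while checking each array element.
# $dpath plays the role of _sahv_dpath; the message format is made up.
sub check_each_elem_is_int {
    my ($data) = @_;
    my $dpath = [];
    my $err;
    for my $i (0 .. $#$data) {
        push @$dpath, $i;    # descend into element $i
        if (!defined $data->[$i] || $data->[$i] !~ /^[+-]?\d+$/) {
            $err = '@' . join("/", @$dpath) . ": Data is not an integer";
            last;
        }
        pop @$dpath;         # element passed, step back out
    }
    return $err // "";
}
```

For the input [1, "x", 3] the returned message names index 1, which is the kind of pointer into the data that path tracking makes possible.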
These usually need not be set/changed by users.
- hc => OBJ
Instance of Data::Sah::Compiler::human, to generate error
messages.
- comment_style => STR
Specify how comments are written in the target language. Either
'cpp' ("// comment"), 'shell' ("# comment"), 'c' ("/* comment */"),
or 'ini' ("; comment"). Each programming language subclass
sets this; for example, the perl compiler sets this to 'shell' while js
sets this to 'cpp'.
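The four styles can be pictured with a small stand-alone formatter. This is only a sketch of the idea; format_comment() and its argument order are hypothetical and do not reflect how the module implements comment generation.

```perl
use strict;
use warnings;

# Stand-alone sketch of formatting a comment by style; not the module's code.
# format_comment() is a hypothetical helper.
sub format_comment {
    my ($style, @args) = @_;
    my $content = join "", @args;
    return "// $content\n"    if $style eq 'cpp';
    return "# $content\n"     if $style eq 'shell';
    return "/* $content */\n" if $style eq 'c';
    return "; $content\n"     if $style eq 'ini';
    die "unknown comment style: $style";
}
```

With the 'shell' style, as used for Perl, format_comment('shell', "123") produces "# 123" followed by a newline.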
$c->compile(%args) => RESULT
Generate a validator (function) for the given schema.
Aside from base class' arguments, this class supports these
arguments (suffix "*" denotes required
argument):
- cache
Bool, default false. If set to true, will generate validators
for base schemas when possible, compile them into functions in the
"Data::Sah::_GeneratedValidators::*" namespace,
then have the generated validator code call these functions. This
results in smaller validator code and shorter compilation time, especially
for a large/complex schema that is composed from subschemas. It
also adds a (usually insignificant) overhead of multiple
function calls when doing validation using the generated validator
code.
Only relevant when "name" argument is set. When a
certain named function is already defined, avoid generating the function
declaration again and instead call the defined function.
- data_term
Str. A variable name or an expression in the target language
that contains the data, defaults to var_sigil +
"name" if not specified.
- data_term_is_lvalue
Bool, default true. Whether
"data_term" can be assigned to.
- tmp_data_name
Str. Normally need not be set manually, as it will be set to
"tmp_" . data_name. Used to store temporary data during clause
evaluation.
- tmp_data_term
Str. Normally need not be set manually, as it will be set to
var_sigil . tmp_data_name. Used to store temporary data during clause
evaluation. For example, in JavaScript, the 'int' and 'float' types accept
strings in the type check, but for further checking with clauses
(like 'min', 'max', 'divisible_by') the string data needs to be
converted to a number first. Likewise with prefiltering. This variable
holds the temporary value, and the clauses compare against it. At
the end of the clauses, the original data_term is restored. So the output
validator code for schema "[int => min =>
1]" will look something like:
// type check 'int'
(type(data)=='number' && Math.round(data)==data || parseInt(data)==data)
&&
// convert to number
(tmp_data = type(data)=='number' ? data : parseFloat(data), true)
&&
// check clause 'min'
(tmp_data >= 1)
- err_term
Str. A variable name or lvalue expression to store error
message(s), defaults to var_sigil +
"err_NAME" (e.g.
$err_data in the Perl compiler).
- var_prefix
Str, default "_sahv_". Prefix for variables declared
by generated code.
- sub_prefix
Str, default "_sahs_". Prefix for subroutines
declared by generated code.
- code_type
Str, default "validator". The kind of code to
generate. For now the only valid (and default) value is 'validator'.
Compiler can perhaps generate other kinds of code in the future.
- return_type
Str, default "bool_valid". Specify what kind of return
value the generated code should produce. Either
"bool_valid",
"bool_valid+val",
"str_errmsg",
"str_errmsg+val", or
"hash_details".
"bool_valid" means generated
validator code should just return true/false depending on whether
validation succeeds/fails.
"bool_valid+val" is like
"bool_valid", but instead of just
"bool_valid" the validator code will
return a two-element arrayref "[bool_valid,
val]" where "val" is the
final value of data (after setting of default, coercion, etc.).
"str_errmsg" means
validation should return an error message string (the first one
encountered) if validation fails and an empty string/undef if validation
succeeds.
"str_errmsg+val" is like
"str_errmsg", but instead of just
"str_errmsg" the validator code will
return a two-element arrayref "[str_errmsg,
val]" where "val" is the
final value of data (after setting of default, coercion, etc.).
"hash_details" means
validation should return a full hash data structure. From this structure
you can check whether validation succeeds, retrieve all the collected
errors/warnings, etc.
- coerce
Bool, default true. If set to false, will not include coercion
code.
- debug
Bool, default false. This is a general debugging option which
should turn on all debugging-related options, e.g. produce more comments
in the generated code, etc. Each compiler might have more specific
debugging options.
If turned on, specific debugging options can be explicitly
turned off afterwards, e.g. "debug=>1,
debug_log=>0" will turn on all debugging options but turn
off the "debug_log" setting.
Currently turning on "debug"
means:
- debug_log
Bool, default false. Whether to add logging to generated code.
This aids in debugging generated code, especially for more complex
validation.
- comment
Bool, default true. If set to false, generated code will be
devoid of comments.
- human_hash_values
Hash. Optional. Will be passed to
"hash_values" argument during
compile() by human compiler.
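The five "return_type" values can be compared side by side with a hand-written sketch built around a single check (data must be at least 0). The function validate_min0() is hypothetical, and the key names used for the "hash_details" case are invented for illustration; the structure actually produced by a generated validator may differ.

```perl
use strict;
use warnings;

# Sketch of the value shape each return_type yields for one check.
# validate_min0() and the hash_details keys are made up for this example.
sub validate_min0 {
    my ($data, $return_type) = @_;
    my $err   = $data >= 0 ? "" : "Must be at least 0";
    my $valid = $err eq "" ? 1 : 0;
    my %shape = (
        'bool_valid'     => sub { $valid },
        'bool_valid+val' => sub { [ $valid, $data ] },
        'str_errmsg'     => sub { $err },
        'str_errmsg+val' => sub { [ $err, $data ] },
        'hash_details'   => sub { +{ errors => ($valid ? [] : [$err]), value => $data } },
    );
    my $build = $shape{$return_type}
        or die "unknown return_type: $return_type";
    return $build->();
}
```

For example, validate_min0(-1, 'str_errmsg+val') returns a two-element arrayref holding the error message and the value, whereas 'bool_valid' returns only 0.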
$c->comment($cd, @args) => STR
Generate a comment. For example, in the Perl compiler:
$c->comment($cd, "123"); # -> "# 123\n"
Will return an empty string if compile argument
"comment" is set to false.
Please visit the project's homepage at
<https://metacpan.org/release/Data-Sah>.
Source repository is at
<https://github.com/perlancar/perl-Data-Sah>.
perlancar <perlancar@cpan.org>
To contribute, you can send patches by email/via RT, or send pull
requests on GitHub.
Most of the time, you don't need to build the distribution
yourself. You can simply modify the code, then test via:
% prove -l
If you want to build the distribution (e.g. to try to install it
locally on your system), you can install Dist::Zilla,
Dist::Zilla::PluginBundle::Author::PERLANCAR,
Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two other
Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps required
beyond that are considered a bug and can be reported to me.
This software is copyright (c) 2024, 2022, 2021, 2020, 2019, 2018,
2017, 2016, 2015, 2014, 2013, 2012 by perlancar
<perlancar@cpan.org>.
This is free software; you can redistribute it and/or modify it
under the same terms as the Perl 5 programming language system itself.
Please report any bugs or feature requests on the bugtracker
website
<https://rt.cpan.org/Public/Dist/Display.html?Name=Data-Sah>
When submitting a bug or request, please include a test-file or a
patch to an existing test-file that illustrates the bug or desired
feature.