Data::Sah::Compiler(3)
User Contributed Perl Documentation
Data::Sah::Compiler - Base class for Sah compilers
(Data::Sah::Compiler::*)
This document describes version 0.917 of Data::Sah::Compiler (from
Perl distribution Data-Sah), released on 2024-02-16.
- v => int
Version of compilation data structure. Currently at 2.
Whenever a backward-incompatible change is introduced in the
structure, this version number will be bumped. Client code can check
this key to fail deliberately when it encounters a version number that it
can't handle.
- args => HASH
Arguments given to compile().
- compiler => OBJ
The compiler object.
- compiler_name => str
Compiler name, e.g. "perl",
"js".
- is_inner => bool
Convenience. Will be set to 1 when this compilation is a
subcompilation (i.e. compilation of a subschema). You can also check for
"outer_cd" to find out if this
compilation is an inner compilation.
- outer_cd => HASH
If compilation is called from within another
compile(), this will be set to the outer
compilation's $cd. The inner compilation will
inherit some values from the outer one, such as the list of types
("th_map") and function sets
("fsh_map").
- th_map => HASH
Mapping of fully-qualified type names like
"int" to their
"Data::Sah::Compiler::*::TH::*" type
handler objects (or to an array, a normalized schema).
- fsh_map => HASH
Mapping of function set names like
"core" to their
"Data::Sah::Compiler::*::FSH::*"
handler objects.
- schema => ARRAY
The current schema (normalized) being processed. Since a schema
can contain other schemas, there will be subcompilations, and this value
will not necessarily be equal to
"$cd->{args}{schema}".
- spath => ARRAY
An array of strings, with the empty array
("[]") as the root. Points to the current
location in the schema during compilation. An inner compilation will
continue (append to) the path.
Example:
# spath, with pointer to location in the schema
spath: ["elems"] ----
                     \
schema: ["array", {elems => ["float", [int => {min=>3}], [int => "div_by&" => [2, 3]]]}]
spath: ["elems", 0] ------------
                                \
schema: ["array", {elems => ["float", [int => {min=>3}], [int => "div_by&" => [2, 3]]]}]
spath: ["elems", 1, "min"] ---------------------
                                                \
schema: ["array", {elems => ["float", [int => {min=>3}], [int => "div_by&" => [2, 3]]]}]
spath: ["elems", 2, "div_by", 1] -------------------------------------------------
                                                                                  \
schema: ["array", {elems => ["float", [int => {min=>3}], [int => "div_by&" => [2, 3]]]}]
Note: aside from "spath",
there is also the analogous "dpath"
which points to the location of data (e.g. array element, hash
key). But this is declared and maintained by the generated code, not by
the compiler.
- th => OBJ
Current type handler.
- type => STR
Current type name.
- clsets => ARRAY
All the clause sets. Each schema might have more than one
clause set, because the base type's clause sets are processed as well.
- clset => HASH
Current clause set being processed. Note that clauses are
evaluated not strictly in clset order, but instead based on expression
dependencies and priority.
- clset_dlang => STR
Default language of the current clause set. This value is
taken from
"$cd->{clset}{default_lang}" or
"$cd->{outer_cd}{default_lang}" or
the default "en_US".
- clset_num => INT
Set to 0 for the first clause set, 1 for the second, and so
on. Due to merging, we might process more than one clause set during
compilation.
- uclset => HASH
Short for "unprocessed clause set", a shallow copy
of "clset". Keys are removed from it as they are processed by clause
handlers. Any keys remaining after the clause set has been processed
were not recognized by hooks and thus constitute an error.
- uclsets => ARRAY
All the "uclset" for each
clause set.
- clause => STR
Current clause name.
- cl_meta => HASH
Metadata information about the clause, from the clause
definition. This includes "prio"
(priority), "attrs" (list of
attributes specific for this clause),
"allow_expr" (whether clause allows
expression in its value), etc. See
"Data::Sah::Type::$TYPENAME" for more
information.
- cl_value => ANY
Clause value. Note: for putting in generated code, use
"cl_term".
The clause value will be coerced if there are applicable
coercion rules. To get the raw/original value as the schema specifies
it, see "cl_raw_value".
- cl_raw_value => any
Like "cl_value", but without
any coercion/filtering done to the value.
- cl_term => STR
Clause value term. If clause value is a literal
(".is_expr" is false) then it is
produced by passing clause value to literal().
Otherwise, it is produced by passing clause value to
expr().
- cl_is_expr => BOOL
A copy of
"$cd->{clset}{"${clause}.is_expr"}",
for convenience.
- cl_op => STR
A copy of
"$cd->{clset}{"${clause}.op"}",
for convenience.
- cl_is_multi => BOOL
Set to true if cl_value contains multiple clause values. This
will happen if ".op" is either
"and",
"or", or
"none" and
"$cd->{CLAUSE_DO_MULTI}" is set to
true.
- indent_level => INT
Current level of indent when printing result using
"$c->line()". 0 means
unindented.
- all_expr_vars => ARRAY
All variables in all expressions in the current schema (and
all of its subschemas). Used internally by the compiler. For example (XXX
syntax not finalized):
# schema
[array => {of=>'str1', min_len=>1, 'max_len=' => '$min_len*3'},
{def => {
str1 => [str => {min_len=>6, 'max_len=' => '$min_len*2',
check=>'substr($_,0,1) eq "a"'}],
}}]
all_expr_vars => ['schema:///clsets/0/min_len', # or perhaps .../min_len/value
'schema://str1/clsets/0/min_len']
This data can be used to order the compilation of clauses
based on dependencies. In the above example,
"min_len" needs to be evaluated before
"max_len" (especially if
"min_len" is an expression).
- modules => array of hash
List of modules that are required, one way or another. Each
element is a hash which must contain at least the
"name" key (module name). There are
other keys like "version" (minimum
version), "phase" (explained below).
Some languages might add other keys, like
"perl" with
"use_statement" (statement to load/use
the module, used by e.g. pragmas like "no warnings
'void'" which are not the regular
"require MODULE" statement).
Generally, duplicate entries (entries with the same
"name" and
"phase") are avoided, except in
special cases like Perl pragmas.
There are runtime modules
("phase" key set to
"runtime"), which are required by the
generated code when running. For each entry, the only required key is
"name". Other keys include
"version" (minimum version). Some
languages add keys of their own, e.g. perl has
"use_statement" (how to use the
module, e.g. for a pragma like "no warnings
'void'").
There are also compile-time modules
("phase" key set to
"compile"), which are required during
compilation of a schema. These include coercion rule modules like
Data::Sah::Coerce::perl::To_date::From_float::Epoch, and so on. This
information might be useful for distributions that use Data::Sah.
Because Data::Sah is a modular library, where there are third party
extensions for types, coercion rules, and so on, listing these modules
as dependencies instead of a single
"Data::Sah" will ensure that
dependants will pull the right distribution during installation.
- ccls => [HASH, ...]
(Result) Compiled clauses, collected during processing of
schema's clauses. Each element will contain the compiled code in the
target language, error message, and other information. At the end of
processing, these will be joined together.
- result => ...
(Result) The final result. For most compilers, it will be
string/text.
- has_constraint_clause => bool
Convenience. True if there is at least one constraint clause
in the schema. This excludes the special clauses
"req" and
"forbidden".
- has_subschema => bool
Convenience. True if there is at least one clause which
contains a subschema.
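As a rough illustration of two of the keys above ("v" and "uclset"), the following self-contained sketch uses plain hashes only. It does not call Data::Sah; the helper names check_cd_version() and process_clset(), and the fixed list of handled clauses, are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: refuse compilation data whose structure version
# is not the one this client code was written for.
sub check_cd_version {
    my ($cd, $supported) = @_;
    die "Unsupported compilation data version $cd->{v}\n"
        unless $cd->{v} == $supported;
    return 1;
}

# Assumption for this sketch: handlers exist only for these clauses.
my %handled = map { $_ => 1 } qw(min max);

# Hypothetical stand-in for clause processing, showing uclset bookkeeping.
sub process_clset {
    my ($cd) = @_;
    # uclset starts out as a shallow copy of the current clause set
    $cd->{uclset} = { %{ $cd->{clset} } };
    for my $clause (sort keys %{ $cd->{clset} }) {
        next unless $handled{$clause};
        delete $cd->{uclset}{$clause};   # handled, so no longer unprocessed
    }
    # whatever remains was not recognized by any handler
    return sort keys %{ $cd->{uclset} };
}

my $cd = { v => 2, clset => { min => 1, max => 10, foo => 1 } };
check_cd_version($cd, 2);
my @unknown = process_clset($cd);
print "unrecognized clauses: @unknown\n";   # prints "unrecognized clauses: foo"
```

In a real compiler the leftover keys would be reported according to "on_unhandled_clause"; here they are simply returned.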
Reference to the main Data::Sah object.
Reference to the expression compiler object. In the perl compiler, for
example, this will be an instance of
Language::Expr::Compiler::Perl.
$c->compile(%args) => HASH
Compile schema into target language.
Arguments ("*" denotes required
arguments, subclass may introduce others):
- data_name => STR (default: 'data')
A unique name. Will be used as default for variable names,
etc. Should consist only of letters, numbers, and underscores.
- schema* => STR|ARRAY
The schema to use. Will be normalized by compiler, unless
"schema_is_normalized" is set to
true.
- lang => STR (default: from LANG/LANGUAGE or
"en_US")
Desired output human language. Defaults to (and falls back to)
"en_US".
- mark_missing_translation => BOOL (default: 1)
If a piece of text is not found in desired human language,
"en_US" version of the text will be
used but using this format:
(en_US:the text to be translated)
If you do not want this marker, set the
"mark_missing_translation" option to
0.
- locale => STR
Locale name, to be set while generating the human text
description. This sometimes needs to be set explicitly if setlocale()
fails to set the locale using only "lang".
- schema_is_normalized => BOOL (default: 0)
If set to true, instruct the compiler not to normalize the
input schema and assume it is already normalized.
- allow_expr => BOOL (default: 1)
Whether to allow expressions. If false, will die when
encountering an expression during compilation. Usually set to false for
security reasons, to disallow complex expressions when schemas come from
untrusted sources.
- on_unhandled_attr => STR (default: 'die')
What to do when an attribute can't be handled by compiler
(either it is an invalid attribute, or the compiler has not implemented
it yet). Valid values include: "die",
"warn",
"ignore".
- on_unhandled_clause => STR (default: 'die')
What to do when a clause can't be handled by compiler (either
it is an invalid clause, or the compiler has not implemented it yet).
Valid values include: "die",
"warn",
"ignore".
- indent_level => INT (default: 0)
Start at a specified indent level. Useful when generated code
will be inserted into another code (e.g. inside
"sub {}" where it is nice to be able
to indent the inside code).
- skip_clause => ARRAY (default: [])
List of clauses to skip (they are treated as if they did not exist).
Example when compiling with the human compiler:
# schema
[int => {default=>1, between=>[1, 10]}]
# generated human description in English
integer, between 1 and 10, default 1
# generated human description, with skip_clause => ['default']
integer, between 1 and 10
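The documented defaults for the arguments above can be summarized in a small sketch. The helper with_compile_defaults() is hypothetical and is not part of Data::Sah. Treating the "data_name" recommendation and the three listed "on_unhandled_*" values as hard checks is an assumption made for illustration, and "lang" is left out because its default depends on the environment.

```perl
use strict;
use warnings;

# Hypothetical helper that mirrors the documented defaults of compile().
sub with_compile_defaults {
    my (%args) = @_;
    die "Missing required argument: schema\n" unless defined $args{schema};

    my %defaults = (
        data_name                => 'data',
        mark_missing_translation => 1,
        schema_is_normalized     => 0,
        allow_expr               => 1,
        on_unhandled_attr        => 'die',
        on_unhandled_clause      => 'die',
        indent_level             => 0,
        skip_clause              => [],
    );
    my %res = (%defaults, %args);   # caller-supplied values win

    # Assumption: enforce the "letters/numbers/underscores" recommendation
    die "data_name must contain only letters, numbers, and underscores\n"
        unless $res{data_name} =~ /\A[A-Za-z0-9_]+\z/;

    # Assumption: accept only the three values named in the documentation
    for my $k (qw(on_unhandled_attr on_unhandled_clause)) {
        die "Invalid value for $k: $res{$k}\n"
            unless $res{$k} =~ /\A(?:die|warn|ignore)\z/;
    }
    return \%res;
}

my $args = with_compile_defaults(
    schema      => [int => {default => 1, between => [1, 10]}],
    skip_clause => ['default'],
);
print "$args->{data_name} $args->{on_unhandled_clause}\n";   # prints "data die"
```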
Compilation data
During compilation, compile() will call various hooks
(listed below). The hooks will be passed compilation data
($cd), which is a hashref containing
compilation state and results. Compilation data is written to this hashref
instead of to the object's attributes to make it easy to do recursive
compilation (compilation of subschemas).
Keys that are put into this compilation data include input data,
compilation state, and others. Many of these keys might exist only
temporarily during certain phases of compilation and will no longer exist at
the end of compilation. For example,
"clause" will only exist during processing
of a clause and will be seen by hooks like
"before_clause" and
"after_clause"; it will not be seen by
"before_all_clauses" or
"after_compile".
For a list of keys, see "COMPILATION DATA KEYS".
Subclasses may add more data; see their respective documentation.
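To make the reasoning about recursive compilation concrete, here is a toy compiler that keeps all state in a per-call $cd hashref. It is only a sketch and not Data::Sah's implementation: toy_compile() and the "seen" key are hypothetical, the schema is assumed to be already normalized, and only the "of" clause is treated as containing a subschema.

```perl
use strict;
use warnings;

# Toy compiler: every call builds a fresh $cd, so an inner compilation
# never clobbers the state of the outer one.
sub toy_compile {
    my (%args) = @_;
    my $outer = $args{outer_cd};
    my $cd = {
        args     => \%args,
        schema   => $args{schema},
        outer_cd => $outer,
        is_inner => $outer ? 1 : 0,
        spath    => [ $outer ? @{ $outer->{spath} } : () ],  # continue outer path
        th_map   => $outer ? $outer->{th_map} : {},          # inherited
        seen     => $outer ? $outer->{seen}   : [],          # hypothetical log
    };

    my ($type, $clset) = @{ $cd->{schema} };
    for my $clause (sort keys %$clset) {
        # "clause" exists only while this clause is being processed
        local $cd->{clause} = $clause;
        push @{ $cd->{spath} }, $clause;
        push @{ $cd->{seen} }, join("/", @{ $cd->{spath} });
        if ($clause eq 'of') {
            # subcompilation: pass the current $cd as outer_cd
            toy_compile(schema => $clset->{of}, outer_cd => $cd);
        }
        pop @{ $cd->{spath} };
    }
    return $cd;
}

my $cd = toy_compile(
    schema => [array => { min_len => 1, of => [int => { min => 3 }] }],
);
print join(", ", @{ $cd->{seen} }), "\n";   # prints "min_len, of, of/min"
```

After toy_compile() returns, the "clause" key is gone from $cd again, which matches the description of keys that exist only during certain phases.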
Return value
The compilation data is returned. The main result
will be in the "result" key. There is also
"ccls", and subclasses may put additional
results in other keys. The final usable result might need to be pieced together
from these results, depending on your needs.
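As a sketch of what piecing a result together might look like, the example below builds a hand-written stand-in for returned compilation data and joins its compiled clauses. The element keys "ccl" and "err_msg", the sample code strings, and the " && " joiner are all assumptions made for this illustration; consult the documentation of the specific compiler subclass for the actual structure.

```perl
use strict;
use warnings;

# Hand-written stand-in for the hashref returned by compile().
# The "ccl" and "err_msg" keys are assumptions for this sketch.
my $cd = {
    v    => 2,
    ccls => [
        { ccl => '$data >= 1',  err_msg => 'Must be at least 1' },
        { ccl => '$data <= 10', err_msg => 'Must be at most 10' },
    ],
};

# One possible way to combine the collected clauses into a single expression
$cd->{result} = join(' && ', map { "($_->{ccl})" } @{ $cd->{ccls} });

print "$cd->{result}\n";   # prints "($data >= 1) && ($data <= 10)"
```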
Hooks
By default this base compiler does not define any hooks;
subclasses can define hooks to implement their compilation process. Each
hook will be passed compilation data, and should modify or set the
compilation data as needed. The hooks that compile() will call at
various points, in calling order, are:
- $c->before_compile($cd)
Called once at the beginning of compilation.
- $c->before_handle_type($cd)
- $th->handle_type($cd)
- $c->before_all_clauses($cd)
Called before calling handler for any clauses.
- $th->before_all_clauses($cd)
Called before calling handler for any clauses, after
compiler's before_all_clauses().
- $c->before_clause($cd)
Called for each clause, before calling the actual clause
handler ($th->clause_NAME() or
$th->clause).
- $th->before_clause($cd)
After compiler's before_clause() is called, type
handler's before_clause() will also be called if
available.
Input and output interpretation is the same as compiler's
before_clause().
- $th->before_clause_NAME($cd)
Can be used to customize clause.
Introduced in v0.10.
- $th->clause_NAME($cd)
Clause handler. Will be called only once (if
"$cd->{CLAUSE_DO_MULTI}" is set to true
by other hooks before this) or once for each value in a multi-value
clause (e.g. when the ".op" attribute is
set to "and" or
"or"). For example, in this
schema:
[int => {"div_by&" => [2, 3, 5]}]
clause_div_by() can be called only
once with "$cd->{cl_value}" set to
[2, 3, 5], or three times, each with
"$cd->{cl_value}" set to 2, 3, and 5
respectively.
- $th->after_clause_NAME($cd)
Can be used to customize clause.
Introduced in v0.10.
- $th->after_clause($cd)
Called for each clause, after calling the actual clause
handler ($th->clause_NAME()).
- $c->after_clause($cd)
Called for each clause, after calling the actual clause
handler ($th->clause_NAME()).
Output interpretation is the same as
$th->after_clause().
- $th->after_all_clauses($cd)
Called after all clauses have been compiled, before compiler's
after_all_clauses().
- $c->after_all_clauses($cd)
Called after all clauses have been compiled.
- $c->after_compile($cd)
Called at the very end, before the compilation process ends.
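The calling order listed above can be sketched with a small dispatcher that invokes each hook only if the object defines it. ToyCompiler, ToyTH, call_hook() and run_hooks() are hypothetical names for this example; the sketch ignores multi-value clauses and the generic $th->clause fallback, and it is not how compile() is actually implemented.

```perl
use strict;
use warnings;

my @log;   # records which hooks actually ran, in order

package ToyTH {        # hypothetical type handler
    sub new           { bless {}, shift }
    sub handle_type   { push @log, 'th.handle_type' }
    sub before_clause { push @log, "th.before_clause($_[1]{clause})" }
    sub clause_min    { push @log, 'th.clause_min' }
    sub clause_max    { push @log, 'th.clause_max' }
}

package ToyCompiler {  # hypothetical compiler subclass
    sub new            { bless {}, shift }
    sub before_compile { push @log, 'c.before_compile' }
    sub after_compile  { push @log, 'c.after_compile' }
}

# Hooks are optional: call the method only if the object provides it.
sub call_hook {
    my ($obj, $meth, $cd) = @_;
    return unless $obj->can($meth);
    $obj->$meth($cd);
}

sub run_hooks {
    my ($c, $th, $cd, @clauses) = @_;
    call_hook($c,  'before_compile',     $cd);
    call_hook($c,  'before_handle_type', $cd);
    call_hook($th, 'handle_type',        $cd);
    call_hook($c,  'before_all_clauses', $cd);
    call_hook($th, 'before_all_clauses', $cd);
    for my $name (@clauses) {
        local $cd->{clause} = $name;     # visible to per-clause hooks only
        call_hook($c,  'before_clause',       $cd);
        call_hook($th, 'before_clause',       $cd);
        call_hook($th, "before_clause_$name", $cd);
        call_hook($th, "clause_$name",        $cd);
        call_hook($th, "after_clause_$name",  $cd);
        call_hook($th, 'after_clause',        $cd);
        call_hook($c,  'after_clause',        $cd);
    }
    call_hook($th, 'after_all_clauses', $cd);
    call_hook($c,  'after_all_clauses', $cd);
    call_hook($c,  'after_compile',     $cd);
}

run_hooks(ToyCompiler->new, ToyTH->new, {}, qw(min max));
print "$_\n" for @log;
```

With the two toy classes above, only the hooks they define are logged, in the documented order: compiler hooks bracket the whole run, and type handler hooks run once per clause.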
Please visit the project's homepage at
<https://metacpan.org/release/Data-Sah>.
Source repository is at
<https://github.com/perlancar/perl-Data-Sah>.
perlancar <perlancar@cpan.org>
To contribute, you can send patches by email/via RT, or send pull
requests on GitHub.
Most of the time, you don't need to build the distribution
yourself. You can simply modify the code, then test via:
% prove -l
If you want to build the distribution (e.g. to try to install it
locally on your system), you can install Dist::Zilla,
Dist::Zilla::PluginBundle::Author::PERLANCAR,
Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two other
Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps required
beyond that are considered a bug and can be reported to me.
This software is copyright (c) 2024, 2022, 2021, 2020, 2019, 2018,
2017, 2016, 2015, 2014, 2013, 2012 by perlancar
<perlancar@cpan.org>.
This is free software; you can redistribute it and/or modify it
under the same terms as the Perl 5 programming language system itself.
Please report any bugs or feature requests on the bugtracker
website
<https://rt.cpan.org/Public/Dist/Display.html?Name=Data-Sah>
When submitting a bug or request, please include a test-file or a
patch to an existing test-file that illustrates the bug or desired
feature.