Locale::Po4a::TeX - convert TeX documents and derivatives from/to
PO files
The goal of the po4a (PO for anything) project is to ease translations
(and more interestingly, the maintenance of translations) using gettext
tools in areas where they were not expected, like documentation.
Locale::Po4a::TeX is a module to help the translation of TeX
documents into other [human] languages. It can also be used as a base to
build modules for TeX-based documents.
Users should probably use the LaTeX module, which inherits from
the TeX module and contains the definitions of common LaTeX commands.
This module can be used directly to handle generic TeX documents.
This module splits your document into smaller blocks (paragraphs, verbatim
blocks, or even smaller units such as titles or indexes).
There are some options (described in the next section) that can
customize this behavior. If this doesn't fit your document format, you are
encouraged to write your own derivative module based on this one to describe
your format's details. See the section WRITING DERIVATIVE MODULES below
for a description of the process.
This module can also be customized by lines starting with "% po4a:"
in the TeX file. This process is described in the INLINE CUSTOMIZATION
section.
These are this module's particular options:
- debug
- Activate debugging for some internal mechanisms of this module. Use the
source to see which parts can be debugged.
- no_wrap
- Comma-separated list of environments which should not be re-wrapped.
Note that there is a difference between verbatim and no_wrap
environments: commands and comments are not analyzed in verbatim
blocks.
If an environment was not already registered, po4a will
consider that it does not take any parameters.
- exclude_include
- Colon-separated list of files that should not be included by \input and
\include.
- definitions
- The name of a file containing definitions for po4a, as defined in the
INLINE CUSTOMIZATION section. You can use this option if it is not
possible to put the definitions in the document being translated.
- verbatim
- Comma-separated list of environments which should be taken as verbatim.
If an environment was not already registered, po4a will
consider that it does not take any parameters.
Use these options to override the default behavior of the defined
commands.
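As a rough illustration of the value formats described above (this is a Python sketch, not code from the module), the list-valued options could be split as follows. The environment and file names are made up for the example; only the separators (commas for no_wrap and verbatim, colons for exclude_include) come from the descriptions above.

```python
def split_option_lists(options):
    """Split the list-valued options into Python lists.

    Separators follow the option descriptions: commas for no_wrap and
    verbatim, colons for exclude_include.
    """
    separators = {"no_wrap": ",", "verbatim": ",", "exclude_include": ":"}
    parsed = {}
    for name, separator in separators.items():
        value = options.get(name, "")
        # Drop empty items so that an unset option yields an empty list.
        parsed[name] = [item for item in value.split(separator) if item]
    return parsed


# Hypothetical option values, for illustration only.
example = split_option_lists({
    "no_wrap": "lstlisting,alltt",
    "exclude_include": "preamble.tex:macros.tex",
})
```

Here example["no_wrap"] holds two environment names, example["exclude_include"] holds two file names, and example["verbatim"] is empty because the option was not given.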
The TeX module can be customized with lines starting with "% po4a:".
These lines are interpreted as commands to the parser. The
following commands are recognized:
- % po4a: command command1 alias command2
- Indicates that the arguments of the command1 command should be
treated as the arguments of the command2 command.
- % po4a: command command1 parameters
- This describes in detail the parameters of the command1 command.
This information will be used to check the number of arguments and their
types.
You can precede the command1 command with one of the following:
- an asterisk (*)
- po4a will extract this command from paragraphs (if it is located at the
beginning or the end of a paragraph). The translators will then have to
translate the parameters that are marked as translatable.
- a plus (+)
- As for an asterisk, the command will be extracted if it appears at an
extremity of a block, but the parameters won't be translated separately.
The translator will have to translate the command concatenated to all its
parameters. This keeps more context, and is useful for commands whose
parameters are short words that can have multiple meanings (and translations).
Note: In this case you don't have to specify which parameters
are translatable, but po4a must know the type and number of
parameters.
- a minus (-)
- In this case, the command won't be extracted from any block. But if it
appears alone in a block, then only the parameters marked as translatable
will be presented to the translator. This is useful for font commands.
These commands should generally not be separated from their paragraph (to
keep the context), but there is no reason to annoy the translator with
them if a whole string is enclosed in such a command.
The parameters argument is a set of [] (to indicate an
optional argument) or {} (to indicate a mandatory argument). You can place
an underscore (_) between these brackets to indicate that the parameter must
be translated. For example:
% po4a: command *chapter [_]{_}
This indicates that the chapter command has two parameters: an
optional (short title) and a mandatory one, which must both be translated.
If you want to specify that the href command has two mandatory parameters,
that you don't want to translate the URL (first parameter), and that you
don't want this command to be separated from its paragraph (which allows the
translator to move the link in the sentence), you can use:
% po4a: command -href {}{_}
In this case, the information indicating which arguments must be
translated is only used if a paragraph is only composed of this href
command.
- % po4a: environment env parameters
- This defines the parameters accepted by the env environment and
specifies the ones to be translated. This information is later used to
check the number of arguments of the \begin command. The syntax of the
parameters argument is the same as described for
commands. The first parameter of the \begin command is the name of the
environment. This parameter must not be specified in the list of
parameters. Here are some examples:
% po4a: environment multicols {}
% po4a: environment equation
As with commands, env can be preceded by a plus (+)
to indicate that the \begin command must be translated with all its
arguments.
- % po4a: separator env "regex"
- Indicates that an environment should be split according to the given
regular expression.
The regular expression is delimited by quotes. It should not
create any back-reference: use a non-capturing group (?:) if you need
grouping. Some characters may also need to be escaped.
For example, the LaTeX module uses the
"(?:&|\\\\)" regular expression to translate separately
each cell of a table (lines are separated by '\\' and cells by
'&').
The notion of environment is expanded to the type displayed in
the PO file. This can be used to split on "\\\\" in the first
mandatory argument of the title command. In this case, the environment
is title{#1}.
- % po4a: verbatim environment env
- Indicates that env is a verbatim environment. Comments and commands
will be ignored in this environment.
If this environment was not already registered, po4a will
consider that it does not take any parameters.
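The effect of a separator expression can be illustrated with a small sketch. The following Python fragment is not part of po4a: it applies the table expression quoted above to a made-up table row, and also shows what happens when a capturing group is used instead of (?:). Python's re.split, like Perl's split, returns the text matched by capturing groups among the pieces, which may be one reason to avoid them here.

```python
import re

# The expression quoted above for tables: cells are separated by '&',
# lines by a double backslash. Written as a Python raw string.
SEPARATOR = r"(?:&|\\\\)"


def split_cells(body, separator=SEPARATOR):
    """Split the body of an environment into separately translatable pieces."""
    return [cell.strip() for cell in re.split(separator, body)]


# A made-up table row, for illustration only.
row = r"Name & Value \\ Size & 10"
cells = split_cells(row)

# With a capturing group, the separators themselves show up in the result.
with_capture = re.split(r"(&|\\\\)", row)
```

With the non-capturing group, cells contains only the four cell texts. With the capturing variant, the '&' and double-backslash separators appear as extra pieces between them.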
- pre_trans
- post_trans
- Add a string as a comment to be added around the next translated element.
This is mostly useful to the texinfo module, as comments are automatically
handled in TeX.
- translate
- Wrapper around TransTractor's translate, with pre- and post-processing
filters.
Comments of a paragraph are inserted as a PO comment for the
first translated string of this paragraph.
- get_leading_command($buffer)
- This function returns:
- A command name
- If no command is found at the beginning of the given buffer, this string
will be empty. Only commands that can be separated are considered. The
%separated_command hash contains the list of these
commands.
- A variant
- This indicates if a variant is used. For example, an asterisk (*) can be
added at the end of sectioning commands to specify that they should not be
numbered. In this case, this field will contain "*". If there is
no variant, the field is an empty string.
- An array of tuples (type of argument, argument)
- The type of argument can be either '{' (for mandatory arguments) or '['
(for optional arguments).
- The remaining buffer
- The rest of the buffer after the removal of this leading command and its
arguments. If no command is found, the original buffer is not touched and
returned in this field.
- get_trailing_command($buffer)
- The same as get_leading_command, but for commands at the end of a
buffer.
- translate_buffer
- Recursively translate a buffer by separating leading and trailing commands
(those which should be translated separately) from the buffer.
If a function is defined in
%translate_buffer_env for the current
environment, this function will be used to translate the buffer instead
of translate_buffer().
- read
- Overloads TransTractor's read().
- read_file
- Recursively read a file, appending included files which are not listed in
the @exclude_include array. Included files are
searched using the kpsewhich command from the Kpathsea library.
Except for the file inclusion part, it is a cut and paste
from TransTractor's read.
- parse_definition_file
- Subroutine for parsing a file with po4a directives (definitions for new
commands).
- parse_definition_line
- Parse a definition line of the form "% po4a: ".
See the INLINE CUSTOMIZATION section for more
details.
- is_closed
- parse
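The four values returned by get_leading_command can be pictured with a simplified sketch. The Python code below is not the module's Perl implementation: SEPARATED_COMMANDS is a hypothetical stand-in for the %separated_command hash, and the parsing ignores escaped braces and whitespace between arguments. It only mirrors the return values described above (command name, variant, list of (type, argument) tuples, remaining buffer).

```python
import re

# Hypothetical stand-in for the %separated_command hash.
SEPARATED_COMMANDS = {"chapter", "section"}
CLOSING = {"{": "}", "[": "]"}
COMMAND = re.compile(r"\\([A-Za-z]+)(\*?)")


def read_argument(buffer):
    """Return (type, argument, rest) for a leading {...} or [...], else None."""
    opening = buffer[0] if buffer else ""
    if opening not in CLOSING:
        return None
    closing = CLOSING[opening]
    depth = 0
    for index, char in enumerate(buffer):
        if char == opening:
            depth += 1
        elif char == closing:
            depth -= 1
            if depth == 0:
                return opening, buffer[1:index], buffer[index + 1:]
    return None  # unbalanced delimiters: treat as "no argument"


def get_leading_command(buffer):
    """Return (command, variant, arguments, remaining buffer)."""
    match = COMMAND.match(buffer)
    if not match or match.group(1) not in SEPARATED_COMMANDS:
        # No separable command: empty name, buffer returned untouched.
        return "", "", [], buffer
    command, variant = match.groups()
    rest = buffer[match.end():]
    arguments = []
    while True:
        parsed = read_argument(rest)
        if parsed is None:
            break
        kind, text, rest = parsed
        arguments.append((kind, text))
    return command, variant, arguments, rest


result = get_leading_command(r"\section*{Intro} Some text")
```

In this sketch, a buffer that starts with a command outside SEPARATED_COMMANDS (for example a font command) is returned unchanged in the last field, with an empty command name.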
Command and environment functions take the following arguments (in
addition to the $self object):
- A command name
- A variant
- An array of (type, argument) tuples
- The current environment
The first 3 arguments are extracted by get_leading_command or
get_trailing_command.
Command and environment functions return the translation of the
command with its arguments and a new environment.
Environment functions are called when a \begin command is found.
They are called with the \begin command and its arguments.
The TeX module only provides one command function and one
environment function: generic_command and generic_environment.
generic_command uses the information specified by
register_generic_command or by adding a definition to the TeX file:
% po4a: command command1 parameters
generic_environment uses the information specified by
register_generic_environment or by adding a definition to the TeX file:
% po4a: environment env parameters
Both functions will only translate the parameters that were
specified as translatable (with a '_'). generic_environment will append the
name of the environment to the environment stack and generic_command will
append the name of the command followed by an identifier of the parameter
(like {#7} or [#2]).
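The selection of translatable parameters can be sketched as follows. This Python code is only an illustration, not the behavior of generic_command itself: parse_parameters and translatable_arguments are hypothetical helper names, and numbering the identifiers by position in the declared parameter list is an assumption. The parameter syntax ([] or {} with an optional underscore) is the one described in the INLINE CUSTOMIZATION section.

```python
import re

PARAMETER = re.compile(r"\[_?\]|\{_?\}")
CLOSING = {"[": "]", "{": "}"}


def parse_parameters(spec):
    """Turn '[_]{_}' into [('[', True), ('{', True)] (type, translatable)."""
    return [(token[0], "_" in token) for token in PARAMETER.findall(spec)]


def translatable_arguments(command, spec, arguments):
    """Pair each translatable argument with an identifier such as chapter{#2}.

    arguments is a list of (type, text) tuples, as returned by
    get_leading_command. Numbering by declared position is an assumption.
    """
    found = []
    remaining = list(arguments)
    for position, (kind, translatable) in enumerate(parse_parameters(spec), start=1):
        if remaining and remaining[0][0] == kind:
            _, text = remaining.pop(0)
            if translatable:
                label = "%s%s#%d%s" % (command, kind, position, CLOSING[kind])
                found.append((label, text))
        elif kind == "{":
            # Optional arguments may be absent, mandatory ones may not.
            raise ValueError(
                "missing mandatory argument %d of %s" % (position, command)
            )
    return found


chapter = translatable_arguments(
    "chapter", "[_]{_}", [("[", "Short"), ("{", "Long title")]
)
href = translatable_arguments(
    "href", "{}{_}", [("{", "http://example.org"), ("{", "link text")]
)
```

With these inputs, both arguments of chapter are selected for translation, while only the second argument of href is, since its first parameter is declared without an underscore.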
This module needs more tests.
It was tested on a book and with the Python documentation.
Various points are tagged FIXME in the source.
Locale::Po4a::LaTeX(3pm),
Locale::Po4a::TransTractor(3pm), po4a(7)
Nicolas François <nicolas.francois@centraliens.net>
Copyright © 2004, 2005 Nicolas FRANÇOIS
<nicolas.francois@centraliens.net>.
This program is free software; you may redistribute it and/or
modify it under the terms of the GPL (see the COPYING file).