Manual Reference Pages - CHEMISTRY::FILE::SMILES (3)
Chemistry::File::SMILES - SMILES linear notation parser/writer
    use Chemistry::File::SMILES;

    # parse a SMILES string
    my $s = 'C1CC1(=O)[O-]';
    my $mol = Chemistry::Mol->parse($s, format => 'smiles');

    # print a SMILES string
    print $mol->print(format => 'smiles');

    # print a unique (canonical) SMILES string
    print $mol->print(format => 'smiles', unique => 1);

    # parse a SMILES file
    my @mols = Chemistry::Mol->read("file.smi", format => 'smiles');

    # write a multiline SMILES file
    Chemistry::Mol->write("file.smi", mols => \@mols);

This module parses and writes SMILES (Simplified Molecular Input Line Entry
System) strings. It is a file I/O driver for the PerlMol project
<http://www.perlmol.org/>. It registers the 'smiles' format with
Chemistry::Mol.

This parser interprets anything after whitespace as the molecule's name;
for example, when the following SMILES string is parsed, $mol->name will be
set to "Methyl chloride":

    CCl Methyl chloride

The name is not included by default on output. However, if the name option
is defined, the name will be included after the SMILES string, separated by
whitespace:

    print $mol->print(format => 'smiles', name => 1);

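The name handling described above can be tried with a short round trip. This is a minimal sketch that assumes Chemistry::Mol and Chemistry::File::SMILES are installed from CPAN; it uses only the parse, name, and print calls shown in this page, and the exact output text is not guaranteed here.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::SMILES;   # registers the 'smiles' format

# Everything after the first whitespace is taken as the molecule name.
my $mol = Chemistry::Mol->parse("CCl Methyl chloride", format => 'smiles');
print "name: ", $mol->name, "\n";

# By default the name is omitted on output; name => 1 appends it
# after the SMILES string.
print "without name: ", $mol->print(format => 'smiles'), "\n";
print "with name:    ", $mol->print(format => 'smiles', name => 1), "\n";
```

Comparing the last two lines of output shows the effect of the name option.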
Multiline SMILES and SMILES files
A file or string can contain multiple molecules, one per line:

    CCl Methyl chloride

Files with the extension .smi are assumed to have this format.

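As an illustration of the multiline form, the sketch below parses a two-line string and writes the molecules to a .smi file. It assumes Chemistry::Mol and Chemistry::File::SMILES are installed; the second molecule and the file name example.smi are made up for the example, and parse is assumed to return all molecules in list context, as read does in the synopsis.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# Two molecules, one per line; the text after the whitespace is the name.
# (Both entries are illustrative data, not taken from a real file.)
my $text = "CCl Methyl chloride\nCCO Ethanol\n";

# In list context, parse is assumed to return every molecule in the string.
my @mols = Chemistry::Mol->parse($text, format => 'smiles');
printf "%d molecules parsed\n", scalar @mols;
print "  ", $_->name, "\n" for @mols;

# Write them out as a multiline SMILES file; the .smi extension
# selects the format (example.smi is a hypothetical file name).
Chemistry::Mol->write("example.smi", mols => \@mols);
```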
Atom Mapping Numbers
As an extension for reaction processing, SMILES strings may have atom mapping
numbers, which are introduced after a colon in a bracketed atom. For example,
[C:1]. The mapping number need not be unique. This module reads the mapping
numbers and stores them as the name of the atom ($atom->name).
On output, atom names are not included by default. See the number and
auto_number options below for ways of including them.
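A short sketch of how mapping numbers surface through the API, assuming Chemistry::Mol and Chemistry::File::SMILES are installed. The input string is illustrative; atoms, symbol, and name are standard Chemistry::Mol and Chemistry::Atom accessors, and the printed SMILES strings are not asserted here.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# Ethanol with a mapping number on every atom (illustrative input).
my $mol = Chemistry::Mol->parse('[CH3:1][CH2:2][OH:42]', format => 'smiles');

# Each mapping number is stored as the atom's name.
for my $atom ($mol->atoms) {
    my $name = defined $atom->name ? $atom->name : '(none)';
    printf "%-2s name=%s\n", $atom->symbol, $name;
}

# Atom names are dropped on output unless number (or auto_number) is set.
print $mol->print(format => 'smiles'), "\n";
print $mol->print(format => 'smiles', number => 1), "\n";
```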
The following options are supported in addition to the options mentioned for
Chemistry::File, such as mol_class, format, and fatal.

aromatic
    On output, detect aromatic atoms and bonds by means of the Chemistry::Ring
    module, and represent the organic aromatic atoms with lowercase symbols.

unique
    When used on output, canonicalize the structure if it hasn't been
    canonicalized already and generate a unique SMILES string. This option
    implies aromatic.

number
    For atoms that have a defined name, print the name as the "atom number".
    For example, if an ethanol molecule has the name "42" for the oxygen atom
    and the other atoms have undefined names, the output would be:

        CC[OH:42]

auto_number
    When used on output, number all the atoms explicitly and sequentially. The
    output for ethanol would look something like this:

        [CH3:1][CH2:2][OH:3]

name
    Include the molecule name on output, as described in the previous section.

kekulize
    When used on input, assign single or double bond orders to aromatic or
    otherwise unspecified bonds (i.e., generate the Kekule structure). If
    false, the bond orders will remain single. This option is true by default.
    This uses assign_bond_orders from the Chemistry::Bond::Find module.

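The options can be compared side by side on a small aromatic input. The following is a sketch only, assuming Chemistry::Mol, Chemistry::File::SMILES, and the modules they rely on for ring detection, canonicalization, and bond-order assignment are installed; benzene is an illustrative choice, and the exact strings printed are not asserted here.

```perl
use strict;
use warnings;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# Input: kekulize is true by default, so bond orders are assigned.
my $benzene = Chemistry::Mol->parse('c1ccccc1', format => 'smiles');
print "kekulized orders: ", join(' ', map { $_->order } $benzene->bonds), "\n";

# With kekulize turned off, the bond orders remain single.
my $raw = Chemistry::Mol->parse('c1ccccc1', format => 'smiles', kekulize => 0);
print "raw orders:       ", join(' ', map { $_->order } $raw->bonds), "\n";

# Output: the same molecule printed with different options.
print "default:     ", $benzene->print(format => 'smiles'), "\n";
print "aromatic:    ", $benzene->print(format => 'smiles', aromatic => 1), "\n";
print "unique:      ", $benzene->print(format => 'smiles', unique => 1), "\n";
print "auto_number: ", $benzene->print(format => 'smiles', auto_number => 1), "\n";
```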
Stereochemistry is not supported! Stereochemical descriptors such as @, @@, /,
and \ will be silently ignored on input, and will certainly not be produced on
output.

Reading branches that start before an atom, such as (OC)C, is not supported.
According to some variants of the SMILES specification, (OC)C should be
equivalent to C(OC) and COC. Many other tools don't implement this rule
either.

The kekulize option works by increasing the bond orders of atoms that don't
have their usual valences satisfied. This may cause problems if you have atoms
with explicitly low hydrogen counts.

The SMILES Home Page at http://www.daylight.com/dayhtml/smiles/
The Daylight Theory Manual at
The PerlMol website <http://www.perlmol.org/>
Ivan Tubert-Brohman <firstname.lastname@example.org>
Copyright (c) 2009 Ivan Tubert-Brohman. All rights reserved. This program is
free software; you can redistribute it and/or modify it under the same terms as
Perl itself.

perl v5.20.3    SMILES (3)    2010-07-08