Pattern(3) User Contributed Perl Documentation Pattern(3)

Chemistry::Pattern - Chemical substructure pattern matching

    use Chemistry::Pattern;
    use Chemistry::Mol;
    use Chemistry::File::SMILES;

    # Create a pattern and a molecule from SMILES strings
    my $mol_str = "C1CCCC1C(Cl)=O";
    my $patt_str = "C(=O)Cl";
    my $mol = Chemistry::Mol->parse($mol_str, format => 'smiles');
    my $patt = Chemistry::Pattern->parse($patt_str, format => 'smiles');

    # try to match the pattern
    while ($patt->match($mol)) {
        my @matched_atoms = $patt->atom_map;
        print "Matched: (@matched_atoms)\n";
        # should print something like "Matched: (a6 a8 a7)"
    }

This module implements basic pattern matching for molecules. The Chemistry::Pattern class is a subclass of Chemistry::Mol, so patterns have all the properties of molecules and can come from reading the same file formats. Of course there are certain formats (such as SMARTS) that are exclusively used to describe patterns.

To perform a pattern matching operation on a molecule, follow these steps.

1) Create a pattern object, either by parsing a file or string, or by adding atoms and bonds by hand using Chemistry::Mol methods. Note that atoms and bonds in a pattern should be Chemistry::Pattern::Atom and Chemistry::Pattern::Bond objects. Let's assume that the pattern object is stored in $patt and that the molecule is $mol.

2) Execute the pattern on the molecule by calling $patt->match($mol).

3) If $patt->match() returns true, extract the "map" that relates the pattern to the molecule by calling $patt->atom_map or $patt->bond_map. These methods return a list of the atoms or bonds in the molecule that are matched by the corresponding atoms or bonds in the pattern. Thus $patt->atom_map(1) would be analogous to the $1 special variable used for regular expression matching. The difference between Chemistry::Pattern and Perl regular expressions is that atoms and bonds are always captured.

4) If more than one match for the molecule is desired, repeat from step (2) until match() returns false.
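The four steps above can be sketched as one short script. This is a minimal illustration rather than a canonical recipe: it reuses the SMILES strings from the synopsis, and the @matches array and its hash keys are names made up for this example.

```perl
use strict;
use warnings;
use Chemistry::Pattern;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# Step 1: build the molecule and the pattern (same strings as the synopsis)
my $mol  = Chemistry::Mol->parse("C1CCCC1C(Cl)=O", format => 'smiles');
my $patt = Chemistry::Pattern->parse("C(=O)Cl", format => 'smiles');

# @matches and its keys are illustrative names, not part of the module
my @matches;

# Steps 2 and 4: call match() repeatedly until it returns false
while ($patt->match($mol)) {
    # Step 3: read the maps for the current match
    my @atoms = $patt->atom_map;      # molecule atoms, in pattern order
    my @bonds = $patt->bond_map;      # molecule bonds, in pattern order
    my $first = $patt->atom_map(1);   # molecule atom for pattern atom 1
    push @matches, { atoms => [@atoms], bonds => [@bonds], first => $first };
    print "atoms: (@atoms)  bonds: (@bonds)  pattern atom 1 -> $first\n";
}
printf "%d match(es) found\n", scalar @matches;
```

Atoms and bonds stringify to their IDs, which is why they can be interpolated directly into the printed line.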

Chemistry::Pattern->new(name => value, ...)
Create a new empty pattern. This is just like the Chemistry::Mol constructor, with one additional option: "options", which expects a hash reference (the options themselves are described under the options() method).

$pattern->options(option => value, ...)
Available options:

overlap
If true, matches may overlap. For example, the CC pattern could match twice on propane if this option is true, but only once if it is false. This option is true by default.

permute
Sometimes there is more than one way of matching the same set of pattern atoms on the same set of molecule atoms. If true, return these "redundant" matches. For example, the CC pattern could match ethane with two different permutations (forwards and backwards). This option is false by default.

$pattern->reset
Reset the state of the pattern matching object, so that it begins the next match from scratch instead of where it left off after the last one.

$pattern->atom_map
Returns the list of atoms that matched the last time $pattern->match was called.

$pattern->bond_map
Returns the list of bonds that matched the last time $pattern->match was called.

$pattern->match($mol, %options)
Returns true if the pattern matches the molecule. If called again for the same molecule, continues matching where it left off (similar to a global regular expression match in scalar context). When there are no matches left, returns false. To force the match to always start from scratch instead of continuing where it left off, the "reset" option may be used.

    $pattern->match($mol, atom => $atom)
    

If atom => $atom is given as an option, match will only look for matches that start at $atom (which should be an atom in $mol, of course). This is somewhat analogous to anchored regular expressions.
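As a rough sketch of an anchored match, the script below looks for the CC pattern on propane starting from one chosen atom. The choice of propane and of the first atom as the anchor are assumptions made for illustration; $mol->atoms(1) is the Chemistry::Mol call that returns the first atom.

```perl
use strict;
use warnings;
use Chemistry::Pattern;
use Chemistry::Mol;
use Chemistry::File::SMILES;

my $mol  = Chemistry::Mol->parse("CCC", format => 'smiles');   # propane
my $patt = Chemistry::Pattern->parse("CC", format => 'smiles');

# Anchor the search at the first atom of the molecule (illustrative choice)
my $anchor = $mol->atoms(1);

my @anchored_atoms;
if ($patt->match($mol, atom => $anchor)) {
    @anchored_atoms = $patt->atom_map;
    print "Anchored at $anchor: matched (@anchored_atoms)\n";
}
else {
    print "No match starts at $anchor\n";
}
```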

To find out which atoms and bonds matched, use the atom_map and bond_map methods.
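The effect of the overlap and permute options can be explored by counting matches under different settings. The sketch below assumes that options() may be called on an existing pattern between searches and that calling reset before each count is enough to start over; count_matches is a hypothetical helper written for this example, not part of the module.

```perl
use strict;
use warnings;
use Chemistry::Pattern;
use Chemistry::Mol;
use Chemistry::File::SMILES;

# Hypothetical helper: count the matches of $patt on $mol from scratch
sub count_matches {
    my ($patt, $mol) = @_;
    $patt->reset;                       # discard state left by earlier searches
    my $n = 0;
    $n++ while $patt->match($mol);      # match() returns false when exhausted
    return $n;
}

my $propane = Chemistry::Mol->parse("CCC", format => 'smiles');
my $patt    = Chemistry::Pattern->parse("CC", format => 'smiles');

# Defaults: overlap is true, permute is false
my $n_default = count_matches($patt, $propane);

# Forbid overlapping matches
$patt->options(overlap => 0);
my $n_no_overlap = count_matches($patt, $propane);

# Allow overlaps again and also report permuted (redundant) matches
$patt->options(overlap => 1, permute => 1);
my $n_permute = count_matches($patt, $propane);

print "default: $n_default, no overlap: $n_no_overlap, permute: $n_permute\n";
```

Going by the option descriptions, the default count for CC on propane should be two and the non-overlapping count one; the permuted count should be at least as large as the default.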

Version 0.27

Chemistry::Pattern::Atom, Chemistry::Pattern::Bond, Chemistry::Mol, Chemistry::File, Chemistry::File::SMARTS.

The PerlMol website <http://www.perlmol.org/>

Ivan Tubert-Brohman <itub@cpan.org>

Copyright (c) 2009 Ivan Tubert-Brohman. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
2009-05-10 perl v5.32.1
