This is a driver for the SearchIO system for parsing output from
Exonerate (by Guy Slater). You can get Exonerate at
[until Guy puts up a Web reference or publication for it].
The new() initialization method supports an optional parameter,
-min_intron. If you run Exonerate with a minimum intron length other
than the default (30), pass the same value here so that the parser can
tell a standard deletion from an intron. There is still some room for
misinterpretation here, and this has not been fully tested or
explored.
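As a minimal sketch of passing -min_intron at initialization: the helper name exonerate_hsp_rows and the value 20 are assumptions made for illustration, while Bio::SearchIO->new, next_result, next_hit and next_hsp are the usual SearchIO calls.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect one row per HSP
# from an Exonerate report.
sub exonerate_hsp_rows {
    my ($file, $min_intron) = @_;

    # Loaded at call time, so BioPerl is only needed once a report is parsed.
    require Bio::SearchIO;

    my $searchio = Bio::SearchIO->new(
        -file       => $file,
        -format     => 'exonerate',
        -min_intron => $min_intron,   # should match the value Exonerate ran with
    );

    my @rows;
    while (my $result = $searchio->next_result) {
        while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                push @rows, [
                    $result->query_name,  $hit->name,
                    $hsp->start('query'), $hsp->end('query'),
                    $hsp->start('hit'),   $hsp->end('hit'),
                ];
            }
        }
    }
    return @rows;
}

# Usage, assuming a report file is given on the command line and that
# Exonerate was run with a minimum intron length of 20 (illustrative value).
if (@ARGV) {
    print join("\t", @$_), "\n" for exonerate_hsp_rows($ARGV[0], 20);
}
```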
The VULGAR and CIGAR formats should now be parsed correctly, creating
HSPs where appropriate (that is, merging match states rather than
breaking an HSP at each indel, as may have happened in the past).
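The idea of merging match states can be sketched in plain Perl, independent of BioPerl. This is an illustration under stated assumptions, not the parser's actual code: the helper vulgar_blocks is hypothetical, it breaks a block only at intron and splice-site labels (I, 5, 3) or at a target-only gap of at least min_intron bases, and it treats every other label as extending the current block. Coordinates are left in Exonerate's own in-between numbering.

```perl
use strict;
use warnings;

# Hypothetical helper: split one VULGAR line into alignment blocks,
# keeping short indels inside a block instead of breaking at each one.
sub vulgar_blocks {
    my ($line, $min_intron) = @_;
    $min_intron = 30 unless defined $min_intron;

    $line =~ s/^vulgar:\s*//;
    my @f = split ' ', $line;

    # Nine header fields, then (label, query_length, target_length) triplets.
    my ($qid, $qstart, $qend, $qstrand,
        $tid, $tstart, $tend, $tstrand, $score) = splice(@f, 0, 9);
    my $qdir = $qstrand eq '-' ? -1 : 1;
    my $tdir = $tstrand eq '-' ? -1 : 1;

    my ($qpos, $tpos) = ($qstart, $tstart);
    my (@blocks, $cur);
    while (@f >= 3) {
        my ($label, $qlen, $tlen) = splice(@f, 0, 3);

        # Assumption: introns, splice sites, and long target-only gaps end a block.
        my $is_break = $label eq 'I' || $label eq '5' || $label eq '3'
            || ($label eq 'G' && $qlen == 0 && $tlen >= $min_intron);

        if ($is_break) {
            push @blocks, $cur if $cur;
            undef $cur;
        }
        else {
            $cur ||= { qstart => $qpos, tstart => $tpos, gaps => 0 };
            $cur->{gaps}++ if $label eq 'G';   # short indel stays in the block
        }

        $qpos += $qdir * $qlen;
        $tpos += $tdir * $tlen;
        if ($cur) {
            $cur->{qend} = $qpos;
            $cur->{tend} = $tpos;
        }
    }
    push @blocks, $cur if $cur;
    return @blocks;
}

# Made-up example line: two exons separated by an intron, with a
# 3-base gap inside the first exon.
my $example = 'vulgar: query1 0 50 + target1 100 650 + 200 '
            . 'M 20 20 G 0 3 M 10 10 5 0 2 I 0 493 3 0 2 M 20 20';
for my $b (vulgar_blocks($example)) {
    printf "query %d-%d  target %d-%d  gaps=%d\n",
        @{$b}{qw(qstart qend tstart tend gaps)};
}
```

With the default threshold, the 3-base gap is absorbed into the first block and the intron separates it from the second, so the example line yields two blocks rather than three.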
The GFF that Exonerate produces is still probably a better way to go
if you are doing protein2genome or est2genome mapping.
For example you can see this script:
### TODO: Jason, this link is dead, do we have an updated one?
If your report contains both CIGAR and VULGAR lines, only the first one
will be processed for a given Query/Target pair. If you prefer one
format over the other, add one of these options when initializing the
SearchIO object.
-cigar => 1
-vulgar => 1
Or set them via these methods.