A very simplistic FASTA file parser. To use it, you must pass an argument to
the parse function that specifies the data type of the FASTA records, e.g.:
 my $project = parse(
    '-type'       => 'dna', # or rna, protein
    '-format'     => 'fasta',
    '-file'       => 'infile.fa',
    '-as_project' => 1
 );
For each FASTA record, the first word on the definition line is used as the
name of the produced datum object. The entire line is assigned to:
$datum->set_generic( fasta_def_line => $line )
So you can retrieve it by calling:
my $line = $datum->get_generic('fasta_def_line');
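
The naming rule described above can be illustrated with a small stand-alone
sketch. The helper split_def_line is hypothetical and not part of Bio::Phylo.
Stripping the leading '>' and trailing whitespace from the stored line is an
assumption here, so the parser's actual value may differ in those details.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Phylo): split a FASTA definition
# line into the record name (its first word) and the full line.
sub split_def_line {
    my ($raw) = @_;
    ( my $line = $raw ) =~ s/^>//;     # assumption: drop the leading '>' marker
    $line =~ s/\s+$//;                 # assumption: trim trailing whitespace
    my ($name) = $line =~ /^(\S+)/;    # first run of non-whitespace characters
    return ( $name, $line );
}

my ( $name, $line ) = split_def_line(">seq1 Homo sapiens cytochrome b\n");
print "name: $name\n";                 # prints "name: seq1"
print "fasta_def_line: $line\n";       # prints "fasta_def_line: seq1 Homo sapiens cytochrome b"
```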
BioPerl actually parses definition lines to extract GIs and similar
identifiers, so if you're looking for that, use Bio::SeqIO from the
bioperl-live distribution.
You can always pass the Bio::Seq objects that Bio::SeqIO produces to
Bio::Phylo::Matrices::Datum->new_from_bioperl to turn them into
Bio::Phylo::Matrices::Datum objects.
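
The BioPerl route might look roughly like the sketch below. It assumes both
BioPerl and Bio::Phylo are installed; fasta_to_data is a hypothetical helper
name, and the modules are loaded at run time only so that the sketch compiles
on its own.

```perl
use strict;
use warnings;

# Sketch: read a FASTA file with BioPerl, then convert each Bio::Seq object
# into a Bio::Phylo::Matrices::Datum object.
# fasta_to_data is a hypothetical helper, not part of either library.
sub fasta_to_data {
    my ($file) = @_;

    # Loaded lazily, so the modules are only needed when the sub is called.
    require Bio::SeqIO;
    require Bio::Phylo::Matrices::Datum;

    my $in = Bio::SeqIO->new( -file => $file, -format => 'fasta' );
    my @data;
    while ( my $seq = $in->next_seq ) {
        push @data, Bio::Phylo::Matrices::Datum->new_from_bioperl($seq);
    }
    return @data;
}
```

Calling fasta_to_data('infile.fa') would then return one datum object per
FASTA record, with the definition line handling left to BioPerl.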