NAME
lalign - compare two protein or DNA sequences for local similarity and show the local sequence alignments
plalign, flalign - compare two sequences for local similarity and plot the local sequence alignments
SYNOPSIS
lalign [-EKfgiImnNOQqrRswxZ] sequence-file-1 sequence-file-2
DESCRIPTION
The lalign and plalign programs compare two sequences looking for local sequence similarities. lalign/plalign use code developed by X. Huang and W. Miller (Adv. Appl. Math. (1991) 12:337-357) for the "sim" program. (Version 2.1 uses sim2 code.) While ssearch reports only the best alignment between the query sequence and the library sequence, lalign and plalign report all the alignments between the two sequences with pairwise probabilities < 0.05 (the default, which can be changed with -E #).
lalign shows the actual local alignments between the two sequences and their scores, while plalign produces a plot of the alignments that looks similar to a `dot-matrix' homology plot. On Unix™ systems, plalign generates postscript output. flalign generates graphic commands for the GCG "figure" program.
Probability estimates for the lalign/plalign/flalign programs are based on the parameters provided by Altschul and Gish (1996) Meth. Enzymol. 266:460-480. These parameters are available for the BLOSUM50, BLOSUM62, and PAM250 scoring matrices with specific gap penalties, and also for DNA comparison with a gap penalty of -16, -4. Probability estimates are not available for other scoring matrices and gap penalties.
The E(10,000) values reported with the alignments are the pairwise-alignment probabilities multiplied by 10,000. These estimates approximate the significance from a search of a 10,000 entry database. They differ from the -E 0.05 initial threshold by the same factor of 10,000; for example, an alignment at the default threshold of 0.05 is reported with E(10,000) = 500. This is an unfortunate inconsistency, but I believe that it is helpful to provide the perspective of a database search.
The lalign/plalign/fasta programs use a standard text format sequence file. Lines beginning with '>' or ';' are considered comments and ignored; sequences can be upper or lower case; blanks, tabs, and unrecognizable characters are ignored. lalign/plalign expect sequences to use the single letter amino acid codes; see protcodes(1).
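
The sequence file rules described above (comment lines starting with '>' or ';', blanks, tabs and unrecognizable characters ignored, either case accepted) can be illustrated with a short Python sketch. This is not the code lalign uses; the function name read_sequence and the sample text are made up for illustration, and the sketch assumes that a "recognizable" character is simply any ASCII letter, which is looser than the actual code tables in protcodes/dnacodes.

```python
def read_sequence(text):
    """Return the residues found in sequence-file text as a single string."""
    residues = []
    for line in text.splitlines():
        # Lines beginning with '>' or ';' are comments and are skipped.
        if line.startswith(('>', ';')):
            continue
        # Assumption: any ASCII letter counts as a recognizable residue code;
        # blanks, tabs, digits and punctuation are dropped. Case is kept as-is.
        residues.extend(ch for ch in line if ch.isascii() and ch.isalpha())
    return ''.join(residues)


# Made-up sample input, not a real database entry.
sample = ">demo sequence (made up)\n; a second comment line\nadqlt EEQ\tiae*\nFKEAF 123\n"
print(read_sequence(sample))  # prints adqltEEQiaeFKEAF
```

Only the letters from the two sequence lines survive; the header, the ';' comment, the whitespace, the '*' and the digits are all discarded.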
OPTIONS
lalign and the other programs can be directed to change the scoring matrix, search parameters, output format, and default search directories by entering options on the command line (preceded by a `-'). All of the options should precede the file name and ktup arguments. Alternatively, these options can be changed by setting environment variables. The options and environment variables are:
EXAMPLES
Compare the amino acid sequence in the file mchu.aa with itself and report the ten best local alignments. Sequence files should have the form:
Display up to 100 local alignments of the LDL receptor (qrhuld.aa) with epidermal growth factor precursor (egmsmg.aa) with pairwise probabilities better than 0.01. Plot the results on the screen.
Run the lalign program in interactive mode. The program will prompt for the name of two sequence files and the number of alignments to show.
SEE ALSO
ssearch(1), prss(1), fasta(1), protcodes(5), dnacodes(5)
AUTHOR
Bill Pearson