BLASTN compares a nucleotide query sequence against a nucleotide sequence database.
TBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
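To make "six-frame translation" concrete, the following is a minimal sketch in Python. It assumes the standard genetic code and maps any codon containing a non-ACGT letter to X; the function names are hypothetical and this is not how the BLAST programs themselves are implemented.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation tables are needed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame number (+1..+3, -1..-3) to its conceptual translation."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames
```

Frames +1 to +3 read the sequence as given, and frames -1 to -3 read its reverse complement, so each nucleotide sequence yields six conceptual protein sequences.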
A sequence in FASTA format begins with a single-line description, followed by
lines of sequence data. The description line is distinguished from the sequence
data by a greater-than (">") symbol in the first column. It is
recommended that all lines of text be shorter than 80 characters in length. An
example description line in FASTA format is:
>gi|532319|pir|TVFV2E|TVFV2E envelope protein
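The layout described above (a ">" description line followed by lines of sequence data) can be read with a few lines of Python. This is a minimal sketch; the function name parse_fasta is hypothetical, and blank lines are simply skipped.

```python
def parse_fasta(text):
    """Yield (description, sequence) pairs from FASTA-formatted text."""
    description = None
    chunks = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A ">" in the first column starts a new record.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        elif description is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)
```

Called on text holding several records, the generator yields one pair per record, with the sequence lines of each record joined into a single string.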
Sequences are expected to be represented in the standard IUB/IUPAC amino acid and nucleic acid codes, with these exceptions: lower-case letters are accepted and are mapped into upper-case; a single hyphen or dash can be used to represent a gap of indeterminate length; and in amino acid sequences, U and * are acceptable letters (see below). Before submitting a request, any numerical digits in the query sequence should either be removed or replaced by appropriate letter codes (e.g., N for unknown nucleic acid residue or X for unknown amino acid residue).
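The clean-up described above can be sketched as a small helper that removes or replaces digits and maps lower case to upper case. The name normalize_query and its digit_replacement parameter are hypothetical, and stripping whitespace is an added assumption.

```python
import re


def normalize_query(raw, digit_replacement=""):
    """Remove (or replace) digits, drop whitespace, and upper-case the query.

    digit_replacement may be "" to delete digits, or a letter code such as
    "N" (unknown nucleic acid residue) or "X" (unknown amino acid residue).
    """
    without_digits = re.sub(r"\d", digit_replacement, raw)
    return re.sub(r"\s+", "", without_digits).upper()
```

For example, normalize_query("1 acgt 61 ttga-") deletes the position numbers and spaces and returns the upper-case sequence with its gap character kept.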
The nucleic acid codes supported are:
A --> adenosine           M --> A C (amino)
C --> cytidine            S --> G C (strong)
G --> guanine             W --> A T (weak)
T --> thymidine           B --> G T C
U --> uridine             D --> G A T
R --> G A (purine)        H --> A C T
Y --> T C (pyrimidine)    V --> G C A
K --> G T (keto)          N --> A G C T (any)
-     gap of indeterminate length
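The ambiguity codes in the table can be treated as sets of bases. The sketch below is illustrative only: the names are hypothetical, and U is kept as its own base because the table does not say whether it is treated as equivalent to T.

```python
# Each code maps to the set of bases it can stand for, as listed in the table.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "GA", "Y": "TC", "K": "GT", "M": "AC", "S": "GC", "W": "AT",
    "B": "GTC", "D": "GAT", "H": "ACT", "V": "GCA", "N": "AGCT",
}


def expand(code):
    """Return the set of bases represented by a single nucleic acid code."""
    return set(IUPAC_NUCLEOTIDES[code.upper()])


def compatible(code_a, code_b):
    """True if two codes share at least one possible base."""
    return bool(expand(code_a) & expand(code_b))
```

Under this reading, R (purine) is compatible with A, but not with Y (pyrimidine), because their base sets do not overlap.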
For those programs that use amino acid query sequences (BLASTP and TBLASTN),
the accepted amino acid codes are:
A  alanine                    P  proline
B  aspartate or asparagine    Q  glutamine
C  cysteine                   R  arginine
D  aspartate                  S  serine
E  glutamate                  T  threonine
F  phenylalanine              U  selenocysteine
G  glycine                    V  valine
H  histidine                  W  tryptophan
I  isoleucine                 Y  tyrosine
K  lysine                     Z  glutamate or glutamine
L  leucine                    X  any
M  methionine                 *  translation stop
N  asparagine                 -  gap of indeterminate length
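A query can be checked against the accepted amino acid codes before submission. The following is a minimal sketch with hypothetical names; it reports each character that is not in the table, along with its zero-based position.

```python
# Letters taken from the table above, plus '*' (stop) and '-' (gap).
ACCEPTED_AMINO_ACID_CODES = set("ABCDEFGHIKLMNPQRSTUVWXYZ*-")


def find_invalid_residues(seq):
    """Return (position, character) pairs for codes not in the accepted set."""
    return [
        (i, ch)
        for i, ch in enumerate(seq.upper())
        if ch not in ACCEPTED_AMINO_ACID_CODES
    ]
```

An empty result means every character in the query is an accepted code; lower-case input is upper-cased first, matching the rule stated earlier.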
DESCRIPTIONS restricts the number of short descriptions of matching sequences reported to the number specified; the default is 5 descriptions.
ALIGNMENTS restricts the database sequences for which high-scoring segment pairs (HSPs) are reported to the number specified; the default is 50. If more database sequences than this satisfy the statistical significance threshold for reporting, only the matches ascribed the greatest statistical significance are reported.
EXPECT is the statistical significance threshold for reporting matches against database sequences; the default value is 1e-1, such that 1e-1 (0.1) matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the expected number of chance matches (the E-value) ascribed to a match is greater than the EXPECT threshold, the match will not be reported. Lower EXPECT thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable.
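How the three reporting limits interact can be sketched as follows. The Hit class and the function name are hypothetical, the default values are the ones quoted above, and the code only illustrates the filtering logic rather than the server's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one matching database sequence."""
    subject_id: str
    evalue: float  # expected number of chance matches at this score


def select_reported_hits(hits, expect=1e-1, descriptions=5, alignments=50):
    """Apply the threshold, then cap the description and alignment lists."""
    # Keep only matches whose E-value does not exceed the threshold,
    # ordered from most to least significant.
    significant = sorted(
        (h for h in hits if h.evalue <= expect), key=lambda h: h.evalue
    )
    # If more matches pass than a limit allows, the most significant win.
    return significant[:descriptions], significant[:alignments]
```

Lowering expect shrinks both lists, while the descriptions and alignments limits only truncate what has already passed the threshold.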