BLAST Search main parameters

Subsequence

Performs the search using only a subsequence of the query sequence. Enter the "from" and "to" coordinates to define the subsequence.
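As a rough illustration of how a from/to pair selects part of the query, the sketch below assumes 1-based, inclusive coordinates. That convention is common for sequence positions but is an assumption here, and the function name `subsequence` is hypothetical rather than part of any BLAST tool.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive; assumed convention)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= len(seq)")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


query = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up example sequence
print(subsequence(query, 5, 10))  # YIAKQR
```

Only the selected residues would then be used as the query; positions outside the range play no part in the search.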

Low Complexity Filter

Mask off segments of the query sequence that have low compositional complexity, as determined by the SEG program of Wootton & Federhen (1993) [Computers and Chemistry 17:149-163] or, for BLASTN, by the DUST program of Tatusov and Lipman (Unpublished, NCBI/Toolkit). Filtering can eliminate statistically significant but biologically uninteresting reports from the BLAST output (e.g. hits against common acidic-, basic- or proline-rich regions), leaving the more biologically interesting regions of the query sequence available for specific matching against database sequences.

Filtering is only applied to the query sequence (or its translation products), not to the database sequences.

It is not unusual for SEG to mask nothing at all when applied to sequences in UniProtKB/Swiss-Prot, so filtering should not be expected to always have an effect. In some cases, however, sequences are masked in their entirety, which indicates that the statistical significance of any matches reported against the unfiltered query sequence should be treated as suspect.
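To give a feel for what masking a low-complexity segment means, here is a deliberately simplified sketch. It is not the SEG or DUST algorithm: it slides a fixed window along the query, computes the Shannon entropy of the residue composition, and replaces every residue in a low-entropy window with X. The function names, the window length, and the entropy cutoff are all illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of a window."""
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in Counter(window).values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2) -> str:
    """Replace residues in every low-entropy window with 'X' (toy filter, not SEG)."""
    masked = list(seq)
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) < threshold:
            # Mask the whole window that fell below the cutoff.
            for j in range(i, i + window):
                masked[j] = "X"
    return "".join(masked)


# A diverse stretch followed by a proline-rich run (made-up sequence).
print(mask_low_complexity("ACDEFGHIKLMN" + "P" * 15))
```

A query shorter than the window is returned unchanged, and a query made up entirely of one residue is masked completely, mirroring the two extremes described above.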

Expect

The statistical significance (E-value) threshold for reporting matches against database sequences. The default value is 10, meaning that 10 matches are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990) [Proc Natl Acad Sci U S A. 87:2264-2268]. If the E-value ascribed to a match is greater than the Expect threshold, the match will not be reported. Lower Expect thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable.
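The thresholding rule above can be sketched in a few lines. The first function uses the standard Karlin-Altschul form E = K * m * n * exp(-lambda * S), where m and n are the query and database lengths and S is the raw score; the K and lambda values in the example are placeholders, not the parameters of any particular scoring matrix. The function names and the hit list are hypothetical.

```python
import math


def expect_value(score: float, m: int, n: int, K: float, lam: float) -> float:
    """Expected number of chance matches scoring at least `score`."""
    return K * m * n * math.exp(-lam * score)


def reportable(hits, expect: float = 10.0):
    """Keep hits whose E-value does not exceed the Expect threshold."""
    return [(name, e) for name, e in hits if e <= expect]


# Made-up hits as (identifier, E-value) pairs.
hits = [("hitA", 1e-30), ("hitB", 0.5), ("hitC", 12.0)]
print(reportable(hits))              # default threshold of 10 drops hitC
print(reportable(hits, expect=0.001))  # stricter threshold keeps only hitA

# Placeholder K and lambda, for illustration only.
print(expect_value(score=100, m=300, n=1_000_000, K=0.04, lam=0.27))
```

Lowering `expect` shrinks the list, which is the sense in which lower thresholds are more stringent.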

Matrix

The Matrix options are described in the separate Matrix Help File.

Graphical Overview

An overview of the database sequences aligned to the query sequence is shown. The score of each alignment is indicated by one of five different colors, which divide the range of scores into five groups. Multiple alignments on the same database sequence are connected by a striped line. Mousing over a hit sequence shows its definition and score in the window at the top; clicking on a hit sequence takes the user to the associated alignments.

Alignment View

A selection of alignment output formats is available. Click the Alignment View dropdown to select an output style.

Descriptions

Restricts the number of short descriptions of matching sequences reported to the number specified [default: 100]. See also Expect.

Alignments

Restricts the number of database sequences for which high-scoring segment pairs (HSPs) are reported to the number specified [default: 50]. If more database sequences than this happen to satisfy the statistical significance threshold for reporting (see Expect, above), only the matches ascribed the greatest statistical significance are reported.
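How the Expect, Descriptions and Alignments limits might combine can be sketched as follows. This is an assumption about the order of operations, not a description of the BLAST implementation: hits are first filtered by the Expect threshold, then sorted so that the most significant (lowest E-value) come first, and finally truncated to each limit. The function name `limit_reports` and the hit data are hypothetical.

```python
def limit_reports(hits, expect=10.0, max_descriptions=100, max_alignments=50):
    """Return (descriptions, alignments) lists after thresholding and truncation."""
    # Keep only hits passing the Expect threshold, most significant first.
    passing = sorted((h for h in hits if h[1] <= expect), key=lambda h: h[1])
    return passing[:max_descriptions], passing[:max_alignments]


# Made-up hits as (identifier, E-value) pairs.
hits = [("s1", 2.0), ("s2", 1e-20), ("s3", 50.0), ("s4", 1e-5), ("s5", 0.3)]
descriptions, alignments = limit_reports(hits, max_descriptions=3, max_alignments=2)
print(descriptions)  # [('s2', 1e-20), ('s4', 1e-05), ('s5', 0.3)]
print(alignments)    # [('s2', 1e-20), ('s4', 1e-05)]
```

With the small limits used here, four hits pass the threshold but only the three and two most significant are kept for descriptions and alignments respectively.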