See also: termination options.
Accept criteria determine whether an alignment is a hit, also called an accept. See also weak hits. Hits are written to the output files. The -maxhits N and -top_hits_only options specify that only the best hits are to be reported. Note that two or more hits may be tied for the best score or identity. Accepted hits are written to an output file sorted by decreasing alignment score (local alignments) or by decreasing identity (global alignments).
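The ordering and tie behavior described above can be sketched in Python. This is only an illustration, not the program's implementation: the `Hit` class and both functions are hypothetical names, and the sketch assumes that every hit tied for the best value is kept when only top hits are requested.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    target: str
    score: float     # alignment score, the sort key for local alignments
    identity: float  # fractional identity, the sort key for global alignments


def sort_hits(hits, local):
    """Order hits by decreasing score (local) or decreasing identity (global)."""
    key = (lambda h: h.score) if local else (lambda h: h.identity)
    return sorted(hits, key=key, reverse=True)


def top_hits(hits, local):
    """Keep only the hits tied for the best value (assumed tie handling)."""
    ordered = sort_hits(hits, local)
    if not ordered:
        return []
    value = (lambda h: h.score) if local else (lambda h: h.identity)
    best = value(ordered[0])
    return [h for h in ordered if value(h) == best]


hits = [Hit("t1", 50.0, 0.97), Hit("t2", 62.0, 0.99), Hit("t3", 40.0, 0.99)]
global_order = [h.target for h in sort_hits(hits, local=False)]
tied_best = [h.target for h in top_hits(hits, local=False)]
local_best = [h.target for h in top_hits(hits, local=True)]
```

Here t2 and t3 share the best identity, so both survive in the global case, while only t2 has the best score in the local case.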
In clustering commands based on UCLUST (cluster_fast and cluster_smallmem), accept options determine whether or not a sequence matches a cluster centroid and should be assigned to that cluster. A sequence can match only one centroid. This is usually the first accepted centroid, but this can be changed by increasing -maxaccepts, in which case it will be the accepted centroid with the highest identity (see termination options).
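A minimal sketch of this assignment rule, written in Python for illustration only. The function name, the candidate list, and the order in which centroids are tested are all assumptions; identity is the only accept criterion modeled, and other termination options are ignored.

```python
def assign_centroid(candidates, min_id, maxaccepts=1):
    """Pick a centroid for one query sequence.

    candidates: (centroid_label, identity) pairs in the order they are tested
                (the order itself is a hypothetical input here).
    Returns the chosen centroid label, or None if no centroid is accepted.
    """
    accepted = []
    for label, identity in candidates:
        if identity >= min_id:
            accepted.append((label, identity))
            if len(accepted) >= maxaccepts:
                break  # stop searching once enough accepts were found
    if not accepted:
        return None
    # max() returns the first maximal element, so ties go to the earlier accept.
    return max(accepted, key=lambda pair: pair[1])[0]


candidates = [("c1", 0.95), ("c2", 0.975), ("c3", 0.99)]
first_accept = assign_centroid(candidates, min_id=0.97, maxaccepts=1)
best_of_two = assign_centroid(candidates, min_id=0.97, maxaccepts=2)
no_match = assign_centroid(candidates, min_id=0.995)
```

With one accept allowed, the query joins c2 because it is the first centroid to pass. With two accepts allowed, the search continues to c3, which has the higher identity.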
Accept criteria do not have default values. If a given accept option is not specified, then the corresponding value is not computed or tested. For example, if -id is the only option given, then identity is the only value that is calculated from the alignment.
If more than one accept option is specified, they are combined with AND, so all of them must be satisfied.
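The AND rule, together with the rule that unspecified options are not tested, can be sketched as follows. This is an illustrative Python model under stated assumptions: the `CRITERIA` mapping and `is_accepted` are hypothetical names, only a few of the options are modeled, and the metric values are made up.

```python
import operator

# Hypothetical mapping: option name -> (metric it tests, required comparison).
CRITERIA = {
    "id": ("identity", operator.ge),         # minimum identity
    "maxid": ("identity", operator.le),      # reject if identity is greater
    "query_cov": ("query_cov", operator.ge),
    "target_cov": ("target_cov", operator.ge),
    "maxdiffs": ("diffs", operator.le),
    "mincols": ("cols", operator.ge),
}


def is_accepted(metrics, options):
    """Return True only if every specified option passes (logical AND).

    Options that are not in `options` are never looked up, which mirrors
    the rule that unspecified criteria are not tested.
    """
    for name, threshold in options.items():
        metric, compare = CRITERIA[name]
        if not compare(metrics[metric], threshold):
            return False
    return True


metrics = {"identity": 0.97, "query_cov": 0.8, "diffs": 6, "cols": 200}
only_id = is_accepted(metrics, {"id": 0.97})
id_and_cov = is_accepted(metrics, {"id": 0.97, "query_cov": 0.9})
window = is_accepted(metrics, {"id": 0.965, "maxid": 0.975})
```

The hit passes when -id is the only option, fails once a -query_cov of 0.9 is also required, and passes the -id 0.965 -maxid 0.975 window.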
Criteria that do not require an alignment, e.g. -idprefix and -minqt, are tested before an alignment is computed. These can give significant improvements in speed because a target can be rejected without the overhead of computing an alignment. Most of these are not supported by local search commands (ublast, usearch_local and search_local).
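The speed benefit comes from rejecting a target using only the raw sequences. The sketch below illustrates that idea in Python; `passes_prefilters` is a hypothetical helper, not part of the program, and it assumes non-empty sequences and treats an option left as `None` (or 0 for the prefix and suffix lengths) as not specified.

```python
def passes_prefilters(query, target, idprefix=None, idsuffix=None,
                      minqt=None, maxqt=None, minsl=None, maxsl=None):
    """Test alignment-free criteria; return False as soon as one fails."""
    if idprefix and query[:idprefix] != target[:idprefix]:
        return False  # first N letters differ
    if idsuffix and query[-idsuffix:] != target[-idsuffix:]:
        return False  # last N letters differ

    qt = len(query) / len(target)  # query length / target length
    sl = min(len(query), len(target)) / max(len(query), len(target))

    if minqt is not None and qt < minqt:
        return False
    if maxqt is not None and qt > maxqt:
        return False
    if minsl is not None and sl < minsl:
        return False
    if maxsl is not None and sl > maxsl:
        return False
    return True  # only now would an alignment need to be computed


query = "ACGTACGTAA"          # length 10
target = "ACGT" + "T" * 16    # length 20
prefix4_ok = passes_prefilters(query, target, idprefix=4)
prefix5_ok = passes_prefilters(query, target, idprefix=5)
qt_ok = passes_prefilters(query, target, minqt=0.5)
qt_fail = passes_prefilters(query, target, minqt=0.6)
```

In a search loop, a call like this would come before the alignment step, so rejected targets cost only a few string and length comparisons.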
The -acceptall option specifies that all hits should be accepted, overriding any other accept options.
| Option | Real/Integer | Global/Local | Need aln? | Description |
|---|---|---|---|---|
| -evalue | R | L | Y | Maximum E-value. Required for most commands that use local alignments. |
| -id | R | GL | Y | Minimum identity. Required for most commands that use global alignments. |
| -query_cov | R | GL | Y | Fraction of the query sequence that is aligned, in the range 0.0 to 1.0. With local alignments, this test is applied AFTER a local alignment is already created, so the effect is to reject local alignments that are too short, NOT to extend them further. With global alignments, columns containing terminal gaps are discarded before the test is applied. |
| -target_cov | R | GL | Y | Fraction of the target sequence that is aligned, in the range 0.0 to 1.0. With local alignments, this test is applied AFTER a local alignment is already created, so the effect is to reject local alignments that are too short, NOT to extend them further. With global alignments, columns containing terminal gaps are discarded before the test is applied. |
| -idprefix | I | G | N | First N letters are identical. |
| -idsuffix | I | G | N | Last N letters are identical. |
| -minqt | R | G | N | Minimum value of query_seq_length / target_seq_length. |
| -maxqt | R | G | N | Maximum value of query_seq_length / target_seq_length. |
| -minsl | R | G | N | Minimum value of shorter_seq_length / longer_seq_length. |
| -maxsl | R | G | N | Maximum value of shorter_seq_length / longer_seq_length. |
| -leftjust | | G | Y | No terminal gaps at start of alignment. |
| -rightjust | | G | Y | No terminal gaps at end of alignment. |
| -self | | GL | N | Reject if labels are identical (i.e., reject self-hits). |
| -selfid | | G | N | Reject if sequences are identical (i.e., reject self-hits by sequence). |
| -maxid | R | GL | Y | Reject if identity is greater than this value. Example: to select hits that are 97% identical to two significant figures, use -id 0.965 -maxid 0.975. |
| -minsizeratio | R | GL | N | Minimum query_size / target_size (see size annotations). |
| -maxsizeratio | R | GL | N | Maximum query_size / target_size (see size annotations). |
| -maxdiffs | I | GL | Y | Maximum number of differences between the sequences, i.e. the maximum edit distance. A difference is defined to be an alignment column containing a gap or a substitution. |
| -maxsubs | I | GL | Y | Maximum number of alignment columns containing substitutions. |
| -maxgaps | I | GL | Y | Maximum number of alignment columns containing internal gaps (terminal gaps do not count). |
| -mincols | I | GL | Y | Minimum alignment length, i.e. minimum number of columns in the alignment. |
| -maxqsize | I | GL | N | Maximum query size annotation. |
| -mintsize | I | GL | N | Minimum target size annotation. |
| -mid | R | GL | Y | Minimum match percent identity, defined as (number of columns containing identities) / (number of columns containing letter pairs). Gapped columns are ignored. This is percent identity, not fractional identity like -id, so it is in the range 0.0 to 100.0. |
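To make the column-based definitions concrete, the following Python sketch counts the quantities that -maxdiffs, -maxsubs, -maxgaps, -mincols and -mid are compared against for one pairwise alignment. It is an illustration under assumptions, not the program's code: `column_stats` is a hypothetical helper, gaps are written as `-`, terminal gap columns are taken to be those outside the first and last column where both rows contain a letter, and the difference count includes every gapped column, which is a literal reading of the definition above.

```python
def column_stats(query_row, target_row, gap="-"):
    """Count identities, substitutions and gap columns in an aligned pair."""
    if len(query_row) != len(target_row):
        raise ValueError("aligned rows must have equal length")
    pairs = list(zip(query_row, target_row))
    letter_cols = [i for i, (q, t) in enumerate(pairs) if q != gap and t != gap]
    first, last = (letter_cols[0], letter_cols[-1]) if letter_cols else (0, -1)

    ids = subs = internal_gaps = terminal_gaps = 0
    for i, (q, t) in enumerate(pairs):
        if q == gap or t == gap:
            if first < i < last:
                internal_gaps += 1   # counted by -maxgaps
            else:
                terminal_gaps += 1   # not counted by -maxgaps
        elif q == t:
            ids += 1
        else:
            subs += 1

    letter_pairs = ids + subs
    return {
        "cols": len(pairs),                                   # -mincols
        "subs": subs,                                         # -maxsubs
        "internal_gaps": internal_gaps,                       # -maxgaps
        "terminal_gaps": terminal_gaps,
        "diffs": subs + internal_gaps + terminal_gaps,        # -maxdiffs (assumed)
        "mid": 100.0 * ids / letter_pairs if letter_pairs else 0.0,  # -mid, percent
    }


stats = column_stats("ACGTAC-TTG", "ACGAACGTTG")
end_gaps = column_stats("--ACGT", "TTACGA")
```

In the first alignment there are 8 identities, 1 substitution and 1 internal gap in 10 columns, so the match percent identity is 100 * 8 / 9, about 88.9, because the gapped column is left out of the denominator.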