USEARCH v12

UCHIME2 algorithm


See also
uchime2_ref command
uchime3_denovo command

UCHIME2 is an algorithm for detecting chimeric sequences. It is an update of the UCHIME algorithm with some new features, and it is implemented in the uchime2_ref command. See the UCHIME2 paper for details.

The uchime2_denovo command is obsolete. It is replaced by the uchime3_denovo command, which implements the chimera filtering step in unoise3. This is the same algorithm described in the UCHIME2 paper, except that parameters have been adjusted to reduce the number of false positives.

I do not recommend using uchime2_ref in a 16S or ITS analysis pipeline. The problem is that you will get high rates of both false positives (FPs) and false negatives (FNs) unless you have denoised sequences, and in that case chimera removal is a special case which is built into the denoising code (see the UCHIME2 paper for details).

It is better to use unoise3 or cluster_otus for chimera filtering. Of the two, I believe that unoise3 is the better approach because it has better resolution: it reconstructs the biological sequences in the reads without 97% clustering. This enables you to resolve species and strains which are >97% similar to each other.
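As a rough illustration of the recommended route, the sketch below builds a unoise3 command line from Python. The helper name, the file names uniques.fa and zotus.fa, and the assumption that the binary is on the PATH as "usearch" are all hypothetical. The -zotus option is written from memory of the unoise3 command page, so check that page for your version before relying on it.

```python
import shlex

def unoise3_command(uniques_fasta, zotus_fasta, usearch="usearch"):
    """Build the argument list for a unoise3 run.

    Hypothetical helper: file names and the binary name are placeholders,
    and the -zotus option should be verified against the unoise3 page.
    """
    return [usearch, "-unoise3", uniques_fasta, "-zotus", zotus_fasta]

# Example with placeholder file names; nothing is executed here.
cmd = unoise3_command("uniques.fa", "zotus.fa")
print(shlex.join(cmd))
```

The list form can be passed to subprocess.run(cmd, check=True) once the paths point at real files. Chimera filtering happens inside the unoise3 step, so this sketch has no separate chimera command.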

I recommend you use the largest available reference database for uchime2_ref, e.g. SILVA for 16S or UNITE for ITS. My previous advice to use a small, high-quality database was misguided (wrong!): you need a large database to get decent sensitivity, and a small database like "gold" will probably be missing many parents in practice.
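If you do run uchime2_ref, the database advice above might translate into something like the following sketch. The helper name and the file names (reads.fastq, silva.udb, out.txt) are hypothetical. The options -db, -uchimeout, -strand and -mode are written from memory of the uchime2_ref command page and should be checked there. The mode value shown is only a placeholder, not a recommendation.

```python
import shlex

def uchime2_ref_command(query, db, report, mode, usearch="usearch"):
    """Build the argument list for a uchime2_ref run against a reference database.

    Hypothetical helper: option names are assumptions to verify against
    the uchime2_ref command page; all file names are placeholders.
    """
    return [
        usearch, "-uchime2_ref", query,
        "-db", db,             # large reference database, e.g. SILVA or UNITE
        "-uchimeout", report,  # tabbed text report of the chimera calls
        "-strand", "plus",
        "-mode", mode,
    ]

# Example with placeholder names; nothing is executed here.
cmd = uchime2_ref_command("reads.fastq", "silva.udb", "out.txt", mode="sensitive")
print(shlex.join(cmd))
```

Passing the database path as an argument makes it easy to switch from a small reference set to a large one without changing anything else in the call.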