False-positive homology predictions in simulated data matrices with densities equivalent to those in 'background' duplications in a BLASTP comparison of Arabidopsis chromosomes 1 and 2. Compare with the true positives and false positives in actual data in Figure 4. False positives increase rapidly when the matrix compression factor (horizontal axis) and hit density (vertical axis; determined by the bit-score cutoff) are such that simulated hits fall close enough together for spurious diagonals of the specified quality and hit number to be found in the simulated data. Parameters should be chosen to avoid false positives. The parameters used in Figures 1 and 3 are: quality = 9, bit score = 650 (approximately corresponding in this dataset to BLAST expect values of < 10^-68), and compression factor = 26. These selections are indicated with a star in this graph.
Cannon et al. Genome Biology 2003 4:R68 doi:10.1186/gb-2003-4-10-r68
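
The dependence of false positives on hit density and compression factor can be illustrated with a toy simulation. The following is a minimal sketch, not the authors' diagonal-finding method: it scatters hits uniformly at random, collapses blocks of genes into single cells, and counts runs of consecutive occupied cells along one diagonal direction. All function names, the `min_hits` run-length criterion (a crude stand-in for the quality and hit-number thresholds), and the example parameter values are assumptions made for illustration.

```python
import random


def simulate_hits(n_genes, density, rng):
    """Return a set of (i, j) hit coordinates placed uniformly at random."""
    n_hits = round(density * n_genes * n_genes)
    hits = set()
    while len(hits) < n_hits:
        hits.add((rng.randrange(n_genes), rng.randrange(n_genes)))
    return hits


def compress(hits, factor):
    """Collapse factor x factor blocks of genes into single matrix cells."""
    return {(i // factor, j // factor) for i, j in hits}


def count_spurious_diagonals(cells, min_hits):
    """Count maximal down-right runs of at least min_hits occupied cells."""
    count = 0
    for i, j in cells:
        if (i - 1, j - 1) in cells:
            continue  # this cell is not the start of a run
        length = 1
        while (i + length, j + length) in cells:
            length += 1
        if length >= min_hits:
            count += 1
    return count


def false_positives(n_genes, density, factor, min_hits=4, seed=0):
    """Simulate random hits, compress the matrix, and count spurious runs."""
    rng = random.Random(seed)
    hits = simulate_hits(n_genes, density, rng)
    return count_spurious_diagonals(compress(hits, factor), min_hits)


if __name__ == "__main__":
    # Hypothetical sweep: the same random hit density at several compression factors.
    for factor in (1, 5, 10, 26):
        print(factor, false_positives(n_genes=2000, density=0.0005, factor=factor))
```

Because every hit in the simulation is random, any diagonal it reports is a false positive by construction. Sweeping `density` and `factor` in this way gives a rough sense of where random hits start to line up by chance.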