Table 2

Top five performing features for microdeletion and microinsertion discrimination.

Features                                                 MCC                  AUC                  Precision      Recall

Deletion
Disorder (Min, Ave, Max)                                 0.558, 0.557, 0.551  0.824, 0.825, 0.818  74%, 74%, 73%  85%, 85%, 84%
ASA (Min, Ave, Max)                                      0.542, 0.47, 0.302   0.81, 0.781, 0.659   73%, 71%, 68%  88%, 81%, 57%
DNA conservation (Max, Ave, Min)                         0.468, 0.367, 0.144  0.781, 0.742, 0.561  68%, 72%, 66%  79%, 71%, 23%
Neff (Min, Ave, Max)                                     0.449, 0.439, 0.43   0.735, 0.749, 0.729  68%, 66%, 67%  85%, 87%, 85%
Probability of sheet (Max, Min, Ave)                     0.32, 0.305, 0.284   0.678, 0.658, 0.632  69%, 69%, 64%  60%, 53%, 51%

Insertion
Disorder (Min, Max, Ave)                                 0.556, 0.546, 0.545  0.813, 0.816, 0.80   78%, 80%, 79%  75%, 74%, 75%
ASA (Min, Ave, Max)                                      0.501, 0.454, 0.317  0.80, 0.78, 0.670    71%, 78%, 71%  85%, 65%, 52%
Neff (Min, Ave, Max)                                     0.467, 0.455, 0.438  0.751, 0.747, 0.742  68%, 68%, 67%  86%, 85%, 84%
DNA conservation (Max, Ave, Min)                         0.453, 0.422, 0.234  0.758, 0.752, 0.597  72%, 74%, 76%  75%, 65%, 27%
Transition probability of microinsertion to match (Min)  0.372                0.708                72%            62%


Note: For each feature, the Max, Min, and Ave variants are listed in descending order of MCC, and the values in every column follow that same order.

ASA, Solvent accessible surface area; AUC, Area under the curve; MCC, Matthews correlation coefficient; Neff, the number of effective homologous sequences aligned to residues, irrespective of residue type.
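
For readers who want to relate the MCC, precision, and recall columns to one another, the following is a minimal sketch of how these three metrics are conventionally computed from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper; the paper's own evaluation pipeline is not shown here.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return (mcc, precision, recall) from confusion-matrix counts.

    Uses the standard textbook definitions; undefined ratios
    (zero denominators) are reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return mcc, precision, recall


# Hypothetical counts for illustration only (NOT data from the paper).
mcc, precision, recall = classification_metrics(tp=85, fp=30, tn=70, fn=15)
print(f"MCC={mcc:.3f}  precision={precision:.0%}  recall={recall:.0%}")
```

Because MCC uses all four cells of the confusion matrix, a feature can show high recall yet a low MCC when it also produces many false positives or few true negatives, which is one way to read rows where recall and MCC diverge.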

Zhao et al. Genome Biology 2013 14:R23   doi:10.1186/gb-2013-14-3-r23
