|
Figure 5.
Re-estimated fragment size distribution. The distribution of fragment sizes for chimpanzee library G591P4 is plotted above
under three models. The normal distribution with mean and standard deviation given
by the NCBI Trace Archive is plotted as 'Normal TA'. A normal distribution re-estimated
from the placement of mated reads from the library is plotted as 'Normal re-estimate'.
To lessen the effect of outliers, we made an initial estimate of the parameters,
filtered out any mate pair distances more than four standard deviations
from the mean, and then estimated the parameters again. 'Nonparametric' plots the empirical density
of mate pair distances after smoothing with a cubic smoothing spline. The actual fragment
distribution has a mean of 4,500 bp rather than the 5,000 bp listed in the Trace Archive
and is far tighter around the mean than suggested by the other models. In particular,
the 'Normal TA' model would have given us a very inaccurate view of this library,
which is one of the largest for chimpanzee with over 2.3 million reads.
Kelley and Salzberg Genome Biology 2010 11:R28 doi:10.1186/gb-2010-11-3-r28 |
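
The two-pass 'Normal re-estimate' procedure described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `reestimate_normal`, the parameter `n_sd`, and the use of the sample standard deviation are assumptions made here, and the input is taken to be a plain list of mate pair distances in base pairs.

```python
import statistics


def reestimate_normal(distances, n_sd=4.0):
    """Two-pass normal fit to mate pair distances (hypothetical helper).

    Pass 1 estimates mean and standard deviation from all distances.
    Distances more than `n_sd` standard deviations from that mean are
    dropped, and pass 2 re-estimates both parameters from what remains.
    Returns (mean, standard deviation, number of distances removed).
    """
    # Pass 1: initial estimate from every observed distance.
    mean0 = statistics.fmean(distances)
    sd0 = statistics.stdev(distances)  # sample SD; an assumption, not from the source

    # Filter: keep only distances within n_sd initial SDs of the initial mean.
    kept = [d for d in distances if abs(d - mean0) <= n_sd * sd0]

    # Pass 2: re-estimate from the filtered distances.
    return statistics.fmean(kept), statistics.stdev(kept), len(distances) - len(kept)
```

Because a single pass is used for filtering, an extreme outlier inflates the initial standard deviation and can shelter milder outliers; repeating the filter until no distances are removed would be one possible variation, though the caption describes only one round.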