Genome Biology

Open Access Research

Fine scale structural variants distinguish the genomes of Drosophila melanogaster and D. pseudoobscura

Stuart J Macdonald* and Anthony D Long

  • * Corresponding author: Stuart J Macdonald sjm@uci.edu

Author Affiliations

Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697-2525, USA

Genome Biology 2006, 7:R67 doi:10.1186/gb-2006-7-7-r67

Published: 27 July 2006

Additional files

Additional data file 1:

Tabulated data on all 5,738 tested orthologous gene pairs. Each row of the table represents a gene. The columns of the table are as follows. Column 1 ('CG') holds the CG identifier for the gene. Column 2 ('gene.name') holds the name of the gene, if any. Column 3 ('Dmel.geneINFO') gives the position of the gene within release 4.2.1 of the Drosophila melanogaster genome. Column 4 ('Dpse.geneINFO') gives the position of the gene within release 1.04 of the D. pseudoobscura genome. Column 5 ('num.exons') gives the number of exons in the gene. Column 6 ('Dmel.gene.bp') provides the length in base pairs of the gene in D. melanogaster. Column 7 ('BLASThit.bp') gives the amount of D. melanogaster sequence, in base pairs, included in the set of above-threshold sliding-window BLAST hits. Column 8 ('num.BLASThits') holds the number of above-threshold sliding-window BLAST hits. Column 9 ('num.inserts') gives the number of large insertion/deletion events detected in the gene. Column 10 ('num.inversions') gives the number of microinversion events detected in the gene.

Format: TXT Size: 464KB
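As a rough illustration of how the ten columns of Additional data file 1 might be read, the sketch below parses a table with those column names and computes the fraction of each D. melanogaster gene covered by above-threshold BLAST hits. The tab delimiter, the presence of a header row, the helper names `read_gene_table` and `blast_coverage`, and every value in the sample row are assumptions made for illustration; none of them come from the file itself.

```python
import csv
import io

# Column names as listed in the file description.
COLUMNS = [
    "CG", "gene.name", "Dmel.geneINFO", "Dpse.geneINFO", "num.exons",
    "Dmel.gene.bp", "BLASThit.bp", "num.BLASThits", "num.inserts",
    "num.inversions",
]
# Columns 5-10 are counts or lengths, so they are converted to integers.
INT_COLUMNS = COLUMNS[4:]


def read_gene_table(handle, delimiter="\t"):
    """Read the per-gene table (assumed tab-delimited with a header row)."""
    rows = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        for name in INT_COLUMNS:
            row[name] = int(row[name])
        rows.append(row)
    return rows


def blast_coverage(row):
    """Fraction of the D. melanogaster gene included in BLAST hits."""
    return row["BLASThit.bp"] / row["Dmel.gene.bp"]


# One made-up row, used only to exercise the parser.
sample = "\t".join(COLUMNS) + "\n" + "\t".join([
    "CG_EXAMPLE", "example", "placeholder_position", "placeholder_position",
    "3", "1000", "400", "12", "0", "1",
]) + "\n"

genes = read_gene_table(io.StringIO(sample))
with_inversions = [g["CG"] for g in genes if g["num.inversions"] > 0]
```

With the real file, `io.StringIO(sample)` would be replaced by an open file handle, after checking the delimiter and header against the downloaded text.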

Additional data file 2:

Tabulated data on all 95 large insertion/deletion (indel) events detected. Each row of the table represents an indel. The columns of the table are as follows. Column 1 ('CG') holds the CG identifier for the gene. Column 2 ('Dmel.chr') gives the D. melanogaster chromosome on which the gene resides. Columns 3, 4, and 5 ('Dmel.geneSTART', 'Dmel.geneSTOP', and 'Dmel.geneORIENT') give the gene start position, stop position, and orientation, respectively, in release 4.2.1 of the D. melanogaster genome assembly. Column 6 ('Dpse.chr') gives the D. pseudoobscura chromosome on which the gene resides. Columns 7, 8, and 9 ('Dpse.geneSTART', 'Dpse.geneSTOP', and 'Dpse.geneORIENT') give the gene start position, stop position, and orientation, respectively, in release 1.04 of the D. pseudoobscura genome assembly. Columns 10 and 11 ('Dmel.insSTARTREL' and 'Dmel.insSTOPREL') give the position of the indel relative to the start of the host D. melanogaster gene. Columns 12 and 13 ('Dmel.insSTART.genome' and 'Dmel.insSTOP.genome') give the position of the indel in release 4.2.1 of the D. melanogaster genome assembly. Columns 14 and 15 ('Dpse.insSTARTREL' and 'Dpse.insSTOPREL') give the position of the indel relative to the start of the host D. pseudoobscura gene. Columns 16 and 17 ('Dmel.LEN' and 'Dpse.LEN') provide the lengths of the insertion/deletion in each genome. Column 18 ('insert.species') notes the species harboring the insert. Column 19 ('insert.amount') notes the amount of DNA inserted in base pairs. For those indels showing the insertion in D. melanogaster, column 20 ('Dmel.insert.annot.TE') notes whether the insertion overlaps the position of a known transposable element in the D. melanogaster genome. Column 21 ('insert.BLAST') shows the type of transposable element, if any, that the inserted sequence matches by BLAST.

Format: TXT Size: 12KB

Additional data file 3:

Tabulated data on all 121 microinversions detected by scanning regions within genes. Each row of the table represents a microinversion. The columns of the table are as follows. Column 1 ('CG') holds the CG identifier for the gene. Column 2 ('Dmel.chr') gives the D. melanogaster chromosome on which the gene resides. Columns 3, 4, and 5 ('Dmel.geneSTART', 'Dmel.geneSTOP', and 'Dmel.geneORIENT') give the gene start position, stop position, and orientation, respectively, in release 4.2.1 of the D. melanogaster genome assembly. Column 6 ('Dpse.chr') gives the D. pseudoobscura chromosome on which the gene resides. Columns 7, 8, and 9 ('Dpse.geneSTART', 'Dpse.geneSTOP', and 'Dpse.geneORIENT') give the gene start position, stop position, and orientation, respectively, in release 1.04 of the D. pseudoobscura genome assembly. Column 10 ('inv.num') assigns an arbitrary number to each microinversion so that independent events within a single gene can be distinguished. Columns 11 and 12 ('Dmel.invSTARTREL' and 'Dmel.invSTOPREL') give the positions of the microinversion breakpoints relative to the start of the host D. melanogaster gene. Columns 13 and 14 ('Dmel.invSTART.genome' and 'Dmel.invSTOP.genome') give the positions of the microinversion breakpoints in release 4.2.1 of the D. melanogaster genome assembly. Column 15 ('Dmel.exon.overlap') is a 0/1 indicator of whether the microinversion overlaps with an annotated D. melanogaster exon. Column 16 ('Dmel.gene.overlap') gives the identifier of any nested gene that the microinversion overlaps. Column 17 ('Dmel.invLEN') is the length of the microinversion in D. melanogaster. Columns 18 and 19 ('Dpse.invSTARTREL' and 'Dpse.invSTOPREL') give the positions of the microinversion breakpoints relative to the start of the host D. pseudoobscura gene. Column 20 ('Dpse.invLEN') is the length of the microinversion in D. pseudoobscura. Column 21 ('num.Dmel.introns') is the number of introns in the D. melanogaster host gene. Column 22 ('Dmel.invINTRON') is the number of the intron within which the microinversion resides, column 23 ('Dmel.invINTRON.size') is the size of that intron, and column 24 ('Dmel.invINTRON.sizerank') is the ranked size of that intron, with a 1 indicating that the microinversion is present within the largest intron. Columns 25 and 26 ('Dmel.invINTRON.leftdist' and 'Dmel.invINTRON.rightdist') are the distances from the 5'-breakpoint and the 3'-breakpoint, respectively, of the microinversion to the nearest flanking exon.

Format: TXT Size: 16KB
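The size rank described for column 24 ('Dmel.invINTRON.sizerank') can be illustrated with a short sketch, in which rank 1 marks the largest intron of the host gene. The function name `intron_size_rank`, the 1-based intron numbering, the handling of ties, and the intron sizes are all assumptions chosen for the example; the file description does not say how ties were resolved.

```python
def intron_size_rank(intron_sizes_bp, intron_number):
    """Rank the host intron by size among all introns of the gene.

    intron_sizes_bp: sizes of the gene's introns, in gene order.
    intron_number:   1-based number of the intron holding the
                     microinversion (assumed numbering convention).
    Returns 1 for the largest intron. Tied introns share the better
    rank, which is an assumption of this sketch.
    """
    host_size = intron_sizes_bp[intron_number - 1]
    larger = sum(1 for size in intron_sizes_bp if size > host_size)
    return 1 + larger


# Made-up intron sizes for a hypothetical three-intron gene.
example_introns = [120, 5000, 800]
rank_of_second = intron_size_rank(example_introns, 2)
rank_of_first = intron_size_rank(example_introns, 1)
```

Here the second intron is the largest of the three, so it receives rank 1, and the first intron, being the smallest, receives rank 3.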

Additional data file 4:

Tabulated data on all 22 microinversions detected by scanning 2 kb regions upstream and downstream of each conserved orthologous gene pair. Each row of the table represents a microinversion. The columns of the table are as follows. Column 1 ('CG') holds the CG identifier for the gene. Column 2 ('Dmel.chr') gives the D. melanogaster chromosome on which the gene resides. Columns 3, 4, and 5 ('Dmel.geneSTART', 'Dmel.geneSTOP', and 'Dmel.geneORIENT') give the gene start position, stop position, and orientation, respectively, in release 4.2.1 of the D. melanogaster genome assembly. Column 6 ('Dpse.chr') gives the D. pseudoobscura chromosome on which the gene resides. Columns 7, 8, and 9 ('Dpse.geneSTART', 'Dpse.geneSTOP', and 'Dpse.geneORIENT') give the gene start position, stop position, and orientation, respectively, in release 1.04 of the D. pseudoobscura genome assembly. Column 10 ('inv.num') assigns an arbitrary number to each microinversion so that independent events within a single gene can be distinguished. Columns 11 and 12 ('Dmel.invSTARTREL' and 'Dmel.invSTOPREL') give the positions of the microinversion breakpoints relative to the start of the host D. melanogaster gene. Columns 13 and 14 ('Dmel.invSTART.genome' and 'Dmel.invSTOP.genome') give the positions of the microinversion breakpoints in release 4.2.1 of the D. melanogaster genome assembly. Column 15 ('Dmel.invPOS.relgene') indicates whether the microinversion is 5' or 3' of the test gene. Column 16 ('Dmel.exon.overlap') is a 0/1 indicator of whether the microinversion overlaps with an annotated exon in D. melanogaster. Column 17 ('Dmel.gene.overlap') gives the identifier of any nested gene that the microinversion overlaps. Column 18 ('Dmel.invLEN') is the length of the microinversion in D. melanogaster. Columns 19 and 20 ('Dpse.invSTARTREL' and 'Dpse.invSTOPREL') give the positions of the microinversion breakpoints relative to the start of the host D. pseudoobscura gene. Column 21 ('Dpse.invLEN') is the length of the microinversion in D. pseudoobscura.

Format: TXT Size: 3KB

Additional data file 5:

For all 121 microinversions identified by scanning within genes, the distance between each breakpoint and the closest flanking exon was extracted, and divided by the size of the host intron. The plots are histograms of the 121 weighted distances, considering the 5'- and 3'-breakpoints separately. To test the distances against a uniform distribution we used a two-sided Kolmogorov-Smirnov test, and the results are presented above the plots.

Format: PDF Size: 6KB
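The weighting and the comparison against a uniform distribution described for Additional data file 5 can be sketched as follows. This is a minimal pure-Python illustration rather than the authors' code: it computes only the two-sided Kolmogorov-Smirnov statistic D against the uniform distribution on [0, 1], the function names are hypothetical, and the distance and intron-size pairs are invented for the example.

```python
def weighted_distance(dist_to_exon_bp, intron_size_bp):
    """Breakpoint-to-exon distance divided by the host intron size."""
    return dist_to_exon_bp / intron_size_bp


def ks_uniform_statistic(values):
    """Two-sided KS statistic D against the uniform(0, 1) CDF, F(x) = x.

    D is the largest gap between the empirical CDF of the sample and
    F(x), checked just after and just before each sorted observation.
    """
    xs = sorted(values)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


# Made-up (distance to nearest exon, intron size) pairs in base pairs.
pairs = [(50, 200), (300, 600), (900, 1200)]
weights = [weighted_distance(d, s) for d, s in pairs]
d_stat = ks_uniform_statistic(weights)
```

If SciPy is available, `scipy.stats.kstest(weights, "uniform")` returns the same kind of statistic together with a p-value; whether the original analysis used SciPy, R, or another tool is not stated here.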

Additional data file 6:

This file is a zipped archive of the sliding-window BLAST profiles (as PDFs) for all 5,738 orthologous gene pairs tested. The file is available at [50]. Each plot is named with the CG identifier for the gene. We stepped through each D. melanogaster gene in 15 bp increments, and at each position BLASTed a 31 bp segment against the D. pseudoobscura ortholog. Each line represents a BLAST hit with a score above 45, the endpoints show the position of the hit in each genome, and the color of the line represents the orientation of the hit (black = same sequence orientation in each genome, red = different orientations in each genome).

Format: TXT Size: 1KB
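The sliding-window scheme described for these profiles (31 bp segments taken every 15 bp, hits kept when the score is above 45, lines colored by orientation) can be sketched as below. The BLAST search itself is not run here. The `Hit` record, the function names, and the choice to drop a trailing segment shorter than 31 bp are assumptions made for illustration.

```python
from dataclasses import dataclass

WINDOW_BP = 31   # segment length stated in the description
STEP_BP = 15     # step size stated in the description
MIN_SCORE = 45   # hits must score above this value to be plotted


def sliding_windows(sequence, window=WINDOW_BP, step=STEP_BP):
    """Yield (start, segment) pairs of full-length windows.

    Assumption: a final segment shorter than `window` is skipped.
    """
    for start in range(0, len(sequence) - window + 1, step):
        yield start, sequence[start:start + window]


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit of a window."""
    dmel_start: int
    dpse_start: int
    score: float
    same_orientation: bool


def plotted_hits(hits, min_score=MIN_SCORE):
    """Keep only hits whose score is above the threshold."""
    return [h for h in hits if h.score > min_score]


def line_color(hit):
    """Black for matching orientation, red for opposite orientation."""
    return "black" if hit.same_orientation else "red"


# A made-up 100 bp sequence and two made-up hits.
starts = [start for start, _ in sliding_windows("A" * 100)]
hits = [Hit(0, 10, 52.0, True), Hit(15, 400, 38.0, False)]
kept = plotted_hits(hits)
colors = [line_color(h) for h in kept]
```

A run of adjacent red lines in such a profile is the signature of a candidate microinversion, since consecutive windows match the ortholog in the opposite orientation.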
