Genome Biology

Open Access Highly Accessed Research

High-resolution aCGH and expression profiling identifies a novel genomic subtype of ER negative breast cancer

Suet F Chin1, Andrew E Teschendorff1,2, John C Marioni4,2, Yanzhong Wang1, Nuno L Barbosa-Morais2, Natalie P Thorne4,2, Jose L Costa7, Sarah E Pinder6, Mark A van de Wiel9,8, Andrew R Green5, Ian O Ellis5, Peggy L Porter10, Simon Tavaré4,2, James D Brenton3, Bauke Ylstra7 and Carlos Caldas1,6*

  • * Corresponding author: Carlos Caldas cc234@cam.ac.uk

  • † Equal contributors

Author Affiliations

1 Breast Cancer Functional Genomics, Cancer Research UK Cambridge Research Institute and Department of Oncology University of Cambridge, Li Ka-Shing Centre, Robinson Way, Cambridge CB2 0RE, UK

2 Computational Biology Group, Cancer Research UK Cambridge Research Institute and Department of Oncology University of Cambridge, Li Ka-Shing Centre, Robinson Way, Cambridge CB2 0RE, UK

3 Functional Genomics of Drug Resistance, Cancer Research UK Cambridge Research Institute and Department of Oncology University of Cambridge, Li Ka-Shing Centre, Robinson Way, Cambridge CB2 0RE, UK

4 Computational Biology Group, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Centre for Mathematical Sciences, Wilberforce Road, Cambridge CB3 0WA, UK

5 Histopathology, Nottingham City Hospital NHS Trust and University of Nottingham, Nottingham NG5 1PB, UK

6 Cambridge Breast Unit, Addenbrookes Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge, UK

7 Department of Pathology, VU University Medical Center, PO Box 7057, 1007MB Amsterdam, The Netherlands

8 Department of Biostatistics, VU University Medical Center, PO Box 7057, 1007MB Amsterdam, The Netherlands

9 Department of Mathematics, Vrije Universiteit, Amsterdam, Netherlands

10 Division of Human Biology, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA

Genome Biology 2007, 8:R215 doi:10.1186/gb-2007-8-10-r215

Published: 7 October 2007

Additional files

Additional data file 1:

Clinical table for the 171 breast tumors of the NCH cohort. Clinical factors are: ER, estrogen receptor status; GRADE, histological grade; NPI, Nottingham Prognostic Index; Size, tumor size (cm); VI, vascular invasion; Stage; Age; Cellularity, percentage of tumor cells; survival (time and event); DFI, disease-free interval; RECC, recurrence (yes/no); TTLR, time to local recurrence; LR, local recurrence (yes/no); TTRR, time to regional recurrence; RR, regional recurrence (yes/no); TTDM, time to distant metastasis; DM, distant metastasis (yes/no); Endocrine, endocrine therapy received (yes/no); CT, chemotherapy received (yes/no).

Format: TXT Size: 17KB

Open Data

Additional data file 2:

GII, defined as the fraction of genome altered, against cellularity for 171 breast tumors. A, GII before cellularity correction; B, GII after cellularity correction; C, for overlapping altered regions as determined by BAC and oligo arrays, we plot the p value of the one-sided Fisher-exact test evaluating the concordance of altered/unchanged states between BAC and oligo arrays (a low p value is representative of high concordance).
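
As a rough illustration of the two quantities named in this caption, the sketch below computes a fraction of genome altered from a list of segments and a one-sided Fisher exact p value from a 2x2 concordance table. The segment layout `(start, end, state)`, the function names, and the choice to weight by segment length are assumptions made for this example; the cellularity correction is not specified here and is not reproduced.

```python
from math import comb


def fraction_genome_altered(segments):
    """Return altered length / total length.

    `segments` is a list of (start, end, state) tuples, where state is
    -1 (loss), 0 (unchanged) or +1 (gain). This layout is hypothetical.
    """
    total = 0
    altered = 0
    for start, end, state in segments:
        length = end - start
        if length <= 0:
            raise ValueError("segment end must be greater than start")
        total += length
        if state != 0:
            altered += length
    if total == 0:
        raise ValueError("no segments given")
    return altered / total


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    Here a and d would count probes on which two platforms agree
    (both altered, both unchanged), b and c the disagreements.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Toy example: 100 units of genome, 30 gained and 20 lost.
toy_segments = [(0, 50, 0), (50, 80, 1), (80, 100, -1)]
gii = fraction_genome_altered(toy_segments)  # 0.5
```

A small p value from `fisher_exact_greater` indicates that the platforms agree more often than expected by chance, which matches the caption's reading that a low p value represents high concordance.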

Format: PDF Size: 29KB

This file can be viewed with: Adobe Acrobat Reader

Additional data file 3:

Tables listing the common regions of most frequent gain and loss (10% gain/loss threshold) for tumors and cell lines separately. Columns label the index of the CRA, the genes (symbols) annotated to this region, the corresponding cytoband, start and end positions (Mb), length (Mb), number of mapped oligos in the region, frequency of gain, minimal region (1, yes; 0, no), number of oligos within this region and on the Agilent array with available expression data, fraction of these showing statistically significant coordinate aberrant expression, the corresponding best and worst p values, the genes showing significant correlation between copy number change and expression, and number of amplified cases.
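
The frequency threshold mentioned above can be sketched as follows. This is only an illustration: the per-sample state encoding (-1, 0, +1), the function names, and the use of "greater than or equal to" at the 10% cut-off are assumptions, not details taken from the tables.

```python
def alteration_frequencies(states):
    """Return (gain_frequency, loss_frequency) for one region.

    `states` holds one copy number call per sample:
    +1 gain, 0 unchanged, -1 loss (hypothetical encoding).
    """
    if not states:
        raise ValueError("no samples given")
    n = len(states)
    gain = sum(1 for s in states if s > 0) / n
    loss = sum(1 for s in states if s < 0) / n
    return gain, loss


def is_frequently_altered(states, threshold=0.10):
    """True if the region is gained or lost in at least `threshold` of samples."""
    gain, loss = alteration_frequencies(states)
    return gain >= threshold or loss >= threshold
```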

Format: XLS Size: 2.9MB

This file can be viewed with: Microsoft Excel Viewer

Additional data file 4:

Cluster stability analysis of the hierarchical clustering over the 1063 merged regions and 171 breast tumor samples, using the R-package pvclust. The red numbers indicate the robustness index of the clusters. There was only one cluster with a robustness index larger than 90% and containing more than 20 samples.

Format: PDF Size: 41KB

Additional data file 5:

The centroid of expression for the low-GII subgroup identified in Figure 2. Genes were mean centered and standardized to unit variance. The top 37 genes discriminating between the low-GII subgroup and the rest of the samples are shown ranked from top to bottom together with their direction of differential expression.
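
For readers unfamiliar with the normalization described here, a minimal sketch of mean centering and scaling to unit variance for a single gene follows. The function name is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the caption does not say which estimator was used.

```python
from statistics import mean, stdev


def standardize(values):
    """Mean-center a gene's expression values and scale them to unit variance.

    Uses the sample standard deviation (n - 1 denominator), an assumption.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("cannot standardize a constant profile")
    return [(v - center) / spread for v in values]


# Toy expression profile for one gene across three samples.
z_scores = standardize([1.0, 2.0, 3.0])  # [-1.0, 0.0, 1.0]
```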

Format: PDF Size: 10KB

Additional data file 6:

The expression classifier for the low-GII subgroup in four independent external breast cancer cohorts: A, van de Vijver et al. [35]; B, Wang et al. [34]; C, Sotiriou et al. [36]; and D, CAL [6]. Samples in the predicted putative low-GII subgroup are labeled in pink, the rest of the samples are shown in orange. The top color bar denotes ER status (black, ER-; gray, ER+). In the heatmaps, red denotes relative overexpression and green denotes relative underexpression.

Format: PDF Size: 1.1MB

Additional data file 7:

For each chromosome we plot (i) the frequency of gain (green) and loss (red) profiles of CRA over the 171 tumors, and (ii) the p values (log10 scale) of Agilent probes in these regions that evaluate the association between copy number gain and overexpression (green), or loss and underexpression (red). Threshold lines of 5% gain and 5% loss and 0.05 significance level are also shown (black).

Format: PDF Size: 980KB

Additional data file 8:

Subset tables of ADF-3 (tumors only) listing the CRA that showed significant association between copy number gain and overexpression or copy number loss and underexpression at a significance level p < 0.05.

Format: XLS Size: 718KB

Additional data file 9:

Subset tables of ADF-8 listing the CRA that showed strong statistical association ('hotspots') between either copy number gain and overexpression or copy number loss and underexpression. The criterion used for a region to be a hotspot was a best p value with p < 0.001 and at least 50% of oligos in the region showing an association between copy number state and aberrant expression at p < 0.05.
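
The hotspot criterion stated above translates directly into a small predicate. The sketch below assumes each region comes with a list of per-oligo p values; the function and parameter names are hypothetical, while the default thresholds are the ones given in the caption.

```python
def is_hotspot(p_values, best_p=0.001, oligo_p=0.05, min_fraction=0.5):
    """Apply the two-part hotspot criterion to one region.

    A region qualifies when its best (smallest) p value is below `best_p`
    and at least `min_fraction` of its oligos have p below `oligo_p`.
    """
    if not p_values:
        return False
    best = min(p_values)
    significant = sum(1 for p in p_values if p < oligo_p)
    return best < best_p and significant / len(p_values) >= min_fraction
```

For instance, a region with p values `[0.0005, 0.01, 0.2, 0.3]` would qualify (best p below 0.001, two of four oligos below 0.05), whereas `[0.0001, 0.2, 0.3]` would not, because only one of three oligos is significant.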

Format: XLS Size: 116KB

Additional data file 10:

Frequency of gain (green) and loss (red) for the most frequently altered CAN genes.

Format: PDF Size: 18KB

Additional data file 11:

Tables of CAN genes, 'kinome' genes, 'phosphatome' genes and 'chromatinome' genes frequently altered across breast tumors and also showing coordinate aberrant expression.

Format: PDF Size: 31KB

Additional data file 12:

GII is plotted against tumor size for the Nottingham City Hospital cohort profiled in this study (NCH) (red), the cohort from California (CAL) profiled in [6] (black) and the cohort (Porter) profiled in [11] (pink). Samples in the NCH cohort that clustered in the 26-sample low-GII subgroup are shown in green.

Format: PDF Size: 23KB