Open Access Highly Accessed Research

FitSNPs: highly differentially expressed genes are more likely to have variants associated with disease

Rong Chen1,2,3, Alex A Morgan1,2,3, Joel Dudley1,2,3, Tarangini Deshpande4, Li Li2, Keiichi Kodama1,2,3, Annie P Chiang1,2,3 and Atul J Butte1,2,3*

Author Affiliations

1 Stanford Center for Biomedical Informatics Research, 251 Campus Drive, Stanford, CA 94305, USA

2 Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA

3 Lucile Packard Children's Hospital, 725 Welch Road, Palo Alto, CA 94304, USA

4 NuMedii Inc., Menlo Park, CA 94025, USA

Genome Biology 2008, 9:R170  doi:10.1186/gb-2008-9-12-r170

Published: 5 December 2008

Abstract

Background

Candidate single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWASs) were often selected for validation based on their functional annotation, which was inadequate and biased. We propose to use the more than 200,000 microarray studies in the Gene Expression Omnibus to systematically prioritize candidate SNPs from GWASs.

Results

We analyzed all human microarray studies from the Gene Expression Omnibus and calculated, for every human gene, the observed frequency of differential expression, which we call the differential expression ratio. An analysis of a comprehensive list of curated disease genes revealed a positive association between differential expression ratio values and the likelihood of harboring disease-associated variants. By considering highly differentially expressed genes, we were able to rediscover disease genes with 79% specificity and 37% sensitivity. We successfully distinguished true disease genes from false positives in multiple GWASs for multiple diseases. We then derived a list of functionally interpolating SNPs (fitSNPs) and used it to analyze the top seven loci of Wellcome Trust Case Control Consortium type 1 diabetes mellitus GWASs; this analysis rediscovered all type 1 diabetes mellitus genes and predicted a novel gene (KIAA1109) for an unexplained locus, 4q27. We suggest that fitSNPs would work equally well for both Mendelian and complex diseases (being more effective for cancer), and we propose candidate genes to sequence for their association with 597 syndromes with unknown molecular basis.
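
To make the two quantities in this paragraph concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the differential expression ratio of a gene is the number of studies in which the gene was called differentially expressed divided by the number of studies in which it was measured, and it assumes that a simple cutoff on that ratio decides which genes count as highly differentially expressed. The function names, the gene labels, the toy studies, and the 0.5 cutoff are all hypothetical.

```python
from collections import Counter


def differential_expression_ratio(studies):
    """Return {gene: fraction of measuring studies in which it was differentially expressed}.

    `studies` is an iterable of (measured_genes, de_genes) pairs, one per study.
    This definition is an assumption made for illustration.
    """
    measured = Counter()
    differential = Counter()
    for measured_genes, de_genes in studies:
        measured_set = set(measured_genes)
        measured.update(measured_set)
        # Count a differential call only for genes the study actually measured.
        differential.update(set(de_genes) & measured_set)
    return {gene: differential[gene] / count for gene, count in measured.items()}


def sensitivity_specificity(der, disease_genes, threshold):
    """Treat genes with ratio >= threshold as predicted disease genes and score them."""
    tp = fp = tn = fn = 0
    for gene, ratio in der.items():
        predicted = ratio >= threshold
        actual = gene in disease_genes
        if predicted and actual:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: four studies over four made-up genes.
studies = [
    ({"GENE_A", "GENE_B", "GENE_C", "GENE_D"}, {"GENE_A", "GENE_B"}),
    ({"GENE_A", "GENE_B", "GENE_C"}, {"GENE_A"}),
    ({"GENE_A", "GENE_C", "GENE_D"}, {"GENE_A", "GENE_D"}),
    ({"GENE_B", "GENE_C", "GENE_D"}, {"GENE_B"}),
]
der = differential_expression_ratio(studies)

# Hypothetical curated disease gene set and cutoff.
disease_genes = {"GENE_A", "GENE_D"}
sensitivity, specificity = sensitivity_specificity(der, disease_genes, threshold=0.5)
```

Raising the cutoff in a sketch like this trades sensitivity for specificity, which is the kind of trade-off behind a high-specificity, lower-sensitivity operating point such as the one reported above.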

Conclusions

Our study demonstrates that highly differentially expressed genes are more likely to harbor disease-associated DNA variants. FitSNPs can serve as an effective tool to systematically prioritize candidate SNPs from GWASs.