<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>gb-2011-12-2-r14</ui><ji>GBJ</ji><fm>
<dochead>Research</dochead>
<bibl>
<title>
<p>Bringing order to protein disorder through comparative genomics and genetic interactions</p>
</title>
<aug>
<au id="A1" ce="yes"><snm>Bellay</snm><fnm>Jeremy</fnm><insr iid="I1"/><email>bellay@cs.umn.edu</email></au>
<au id="A2" ce="yes"><snm>Han</snm><fnm>Sangjo</fnm><insr iid="I2"/><insr iid="I3"/><email>sangjo.han@gmail.com</email></au>
<au id="A3" ce="yes"><snm>Michaut</snm><fnm>Magali</fnm><insr iid="I2"/><insr iid="I3"/><email>magali.michaut@utoronto.ca</email></au>
<au id="A4"><snm>Kim</snm><fnm>TaeHyung</fnm><insr iid="I2"/><insr iid="I3"/><email>taehyung.kim@utoronto.ca</email></au>
<au id="A5"><snm>Costanzo</snm><fnm>Michael</fnm><insr iid="I2"/><insr iid="I3"/><email>michael.costanzo@utoronto.ca</email></au>
<au id="A6"><snm>Andrews</snm><mi>J</mi><fnm>Brenda</fnm><insr iid="I2"/><insr iid="I3"/><insr iid="I4"/><email>brenda.andrews@utoronto.ca</email></au>
<au id="A7"><snm>Boone</snm><fnm>Charles</fnm><insr iid="I2"/><insr iid="I3"/><insr iid="I4"/><email>charlie.boone@utoronto.ca</email></au>
<au id="A8"><snm>Bader</snm><mi>D</mi><fnm>Gary</fnm><insr iid="I2"/><insr iid="I3"/><insr iid="I4"/><insr iid="I5"/><email>gary.bader@utoronto.ca</email></au>
<au ca="yes" id="A9"><snm>Myers</snm><mi>L</mi><fnm>Chad</fnm><insr iid="I1"/><email>cmyers@cs.umn.edu</email></au>
<au ca="yes" id="A10"><snm>Kim</snm><mi>M</mi><fnm>Philip</fnm><insr iid="I2"/><insr iid="I3"/><insr iid="I4"/><insr iid="I5"/><email>pm.kim@utoronto.ca</email></au>
</aug>
<insg>
<ins id="I1"><p>Department of Computer Science and Engineering, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA</p></ins>
<ins id="I2"><p>The Donnelly Centre, University of Toronto, 160 College Street, Toronto, ON M5S 3E1, Canada</p></ins>
<ins id="I3"><p>Banting and Best Department of Medical Research, University of Toronto, 160 College Street, Toronto, ON M5S 3E1, Canada</p></ins>
<ins id="I4"><p>Department of Molecular Genetics, University of Toronto, 160 College Street, Toronto, ON M5S 3E1, Canada</p></ins>
<ins id="I5"><p>Department of Computer Science, University of Toronto, 160 College Street, Toronto, ON M5S 3E1, Canada</p></ins>
</insg>
<source>Genome Biology</source>
<issn>1465-6906</issn>
<pubdate>2011</pubdate>
<volume>12</volume>
<issue>2</issue>
<fpage>R14</fpage>
<url>http://genomebiology.com/2011/12/2/R14</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2011-12-2-r14</pubid><pubid idtype="pmpid">21324131</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>29</day><month>9</month><year>2010</year></date></rec><revrec><date><day>1</day><month>2</month><year>2011</year></date></revrec><acc><date><day>16</day><month>2</month><year>2011</year></date></acc><pub><date><day>16</day><month>2</month><year>2011</year></date></pub></history>
<cpyrt><year>2011</year><collab>Bellay et al.; licensee BioMed Central Ltd.</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<sec>
<st>
<p>Background</p>
</st>
<p>Intrinsically disordered regions are widespread, especially in proteomes of higher eukaryotes. Recently, protein disorder has been associated with a wide variety of cellular processes and has been implicated in several human diseases. Despite its apparent functional importance, the sheer range of different roles played by protein disorder often makes its exact contribution difficult to interpret.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>We attempt to better understand the different roles of disorder using a novel analysis that leverages both comparative genomics and genetic interactions. Strikingly, we find that disorder can be partitioned into three biologically distinct phenomena: regions where disorder is conserved but with quickly evolving amino acid sequences (flexible disorder); regions of conserved disorder with also highly conserved amino acid sequences (constrained disorder); and, lastly, non-conserved disorder. Flexible disorder bears many of the characteristics commonly attributed to disorder and is associated with signaling pathways and multi-functionality. Conversely, constrained disorder has markedly different functional attributes and is involved in RNA binding and protein chaperones. Finally, non-conserved disorder lacks clear functional hallmarks based on our analysis.</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>Our new perspective on protein disorder clarifies a variety of previous results by putting them into a systematic framework. Moreover, the clear and distinct functional association of flexible and constrained disorder will allow for new approaches and more specific algorithms for disorder detection in a functional context. Finally, in flexible disordered regions, we demonstrate clear evolutionary selection of protein disorder with little selection on primary structure, which has important implications for sequence-based studies of protein structure and evolution.</p>
</sec>
</sec>
</abs>
</fm><meta>
<classifications>
<classification id="300100010" subtype="man_spc_id" type="BMC">Genome studies</classification>
<classification id="30010002" subtype="man_spc_id" type="BMC">Bioinformatics</classification>
<classification id="30010008" subtype="man_spc_id" type="BMC">Evolution</classification>
</classifications>
</meta><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>Many proteins include extended regions that do not fold into a native fixed conformation. These are referred to as being intrinsically unstructured or disordered. A possible utility of such regions was first suggested over 70 years ago by Linus Pauling, who speculated that their flexibility aids in antibody creation <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>. Recent advances in computational prediction of disordered regions in amino acid sequences have greatly expanded our awareness of the widespread occurrence of disordered regions and the number of proteins whose structure is dominated by such regions (intrinsically disordered proteins or IDPs). Interestingly, protein disorder is more prevalent in complex organisms, accounting for 33% of the residues in the human proteome, but only a few percent of residues in <it>Escherichia coli</it>, suggesting it may play a major role in the evolution of complexity <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp>.</p>
<p>Protein disorder is a diverse and complex phenomenon. On a biophysical level, there exists a continuum of structure and disorder in the proteome. At one extreme, there are proteins that are almost entirely unstructured and natively form a coil; some may fold upon binding a ligand, thereby undergoing a disorder-to-structure transition. Other proteins that are structurally more constrained, but still considered disordered, adopt a molten globule conformation <abbrgrp>
<abbr bid="B3">3</abbr>
</abbrgrp>. Highly structured proteins, which conform to the classical model of protein structure, occupy the other extreme on this spectrum, but even they often possess locally disordered regions <abbrgrp>
<abbr bid="B3">3</abbr>
</abbrgrp>. On a functional level, there are numerous and varied roles with which IDPs have been associated, including signaling, cellular regulation, nuclear localization, chaperone activity, RNA and DNA binding, protein binding and dosage sensitivity <abbrgrp>
<abbr bid="B4">4</abbr>
<abbr bid="B5">5</abbr>
</abbrgrp>, antibody creation <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>, and splicing <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>. Also, IDPs have been implicated in a variety of diseases, including cancer <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>, and neurodegenerative and cardiovascular diseases <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>.</p>
<p>While the importance and widespread occurrence of IDPs is undisputed, a mechanistic understanding of the specific structural and functional roles of disorder is still lacking. Here, we systematically analyze and structure the different functions of disorder through the use of genetic interactions (GIs) and comparative genomics. We use two different, but related, concepts to partition disordered regions into three categories. Our analysis partitions what is currently only generally characterized as 'disorder' into several fundamentally different phenomena with distinct properties and functions.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<sec>
<st>
<p>Genetic interaction hubs tend to have more disordered residues</p>
</st>
<p>Despite the apparent importance of disorder in mediating important protein functions <abbrgrp>
<abbr bid="B4">4</abbr>
</abbrgrp>, our knowledge is still limited in terms of its specific functional roles. The yeast GI network offers a new opportunity for global insights into the role of disorder in protein function <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>. Briefly, GIs are defined as pairs of genes whose combined mutation or deletion leads to an unexpected double mutant phenotype. Here we limit our attention to negative interactions; these are interactions in which the double mutant is significantly less fit than would be predicted by the fitnesses of the single mutants. Interestingly, it has been observed that the number of GIs of a gene (GI degree) is correlated with the percentage of disordered regions in the gene product <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp> (Figure <figr fid="F1">1a</figr>). GI degree is also correlated with different measures of multi-functionality (number of gene ontology (GO) annotations, phenotypic capacitance <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp> and chemical-genetic sensitivity <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp>), suggesting that the presence of disordered regions may underlie the highly pleiotropic roles of some proteins.</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Genetic interactions distinguish different roles of disorder</p></caption><text>
   <p><b>Genetic interactions distinguish different roles of disorder</b>. <b>(a) </b>Percentage of disordered residues of yeast proteins by their number of GIs. <b>(b) </b>Multi-functionality (see Materials and methods) for disordered and structured GI hubs and non-hubs. Hubs are genes above the 90th percentile of GI degree (more than 90 interactions), while non-hubs are genes below the 50th percentile (fewer than 15 interactions). <b>(c) </b>Evolutionary constraint on sequence (dN/dS ratio) on hubs and non-hubs. In both cases disordered proteins have a significantly higher dN/dS than structured proteins. <b>(d) </b>Evolutionary constraint measured by the presence of orthologs in other yeast species (phylogenetic persistence). While disordered non-hubs are less conserved than structured non-hubs, the disordered hubs are as conserved as structured hubs. <it>P</it>-values were computed with a Wilcoxon test, and error bars represent bootstrapped 95% confidence intervals.</p>
</text><graphic file="gb-2011-12-2-r14-1"/></fig>
<p>The relationship between disorder and multi-functionality appears to depend on whether a gene is a hub in the GI network (that is, the gene is associated with a large number of GIs). Specifically, within the set of GI hubs (&gt; 90th percentile in GI degree), disorder of the gene product is a strong predictor of multi-functionality (r = 0.22, <it>P </it>&lt; 10<sup>-12</sup>; Figure <figr fid="F1">1b</figr>), suggesting that disorder is able to distinguish highly functionally versatile GI hubs from genes with more limited functional roles that simply exhibit a large number of GIs. However, this trend is absent among non-GI hubs (&lt; 50th percentile in GI degree), where there is no significant correlation between the amount of disorder and the number of annotated functions (r = -0.02, <it>P </it>&gt; 0.3). This stark difference suggests that disorder plays a highly functional role in the set of proteins that have many GIs, while disorder outside these genes is either less functional or simply of a markedly different nature. A similar distinction can be observed for protein-protein interactions: disorder is significantly correlated with protein-protein interaction degree on GI hubs (r = 0.16, <it>P </it>&lt; 3 &#215; 10<sup>-3</sup>; Figure S1 in Additional file <supplr sid="S1">1</supplr>) while no such correlation holds on non-GI hubs (r = -0.01, <it>P </it>&gt; 0.5). Thus, the GI network appears to provide a clear means of defining a set of proteins where disorder plays a key functional role.</p>
<suppl id="S1">
<title>
<p>Additional file 1</p>
</title>
<text>
<p>
<b>Supplemental figures and tables</b>. This text file contains Figures S1 to S25 and Table S1 with their associated legends.</p>
</text>
<file name="gb-2011-12-2-r14-S1.DOC">
   <p>Click here for file</p>
</file>
</suppl>
<p>Despite their seeming functional importance, disordered regions of proteins have previously been associated with swiftly evolving, less conserved sequences, presumably because of lower structural constraint <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>. We were intrigued by this property because, in general, GI hubs exhibit significantly lower rates of evolution (for example, measured by the dN/dS ratio) and tend to be conserved more broadly across species <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>. Indeed, we found that even among GI hubs, disordered proteins have significantly elevated rates of evolution. This trend is consistent outside the hubs as well (Figure <figr fid="F1">1c</figr>). However, disordered GI hubs are just as conserved phylogenetically as measured by their appearance across the yeast clade (Figure <figr fid="F1">1d</figr>). Thus, while the amino acid sequences tend to evolve faster for disordered GI hubs, they appear to be as phylogenetically constrained at the gene level as other GI hubs. Interestingly, outside of GI hubs, this is not true: non-GI hubs that are disordered tend to be less conserved across the yeast clade compared to their structured counterparts (Figure <figr fid="F1">1d</figr>). These observations relating disordered proteins to the GI network raise an interesting paradox. While the presence of disordered regions appears to be directly connected to their importance in the genetic network, there appears to be little evolutionary sequence constraint on these regions.</p>
</sec>
<sec>
<st>
<p>Many disordered residues are conserved across species</p>
</st>
<p>The counter-intuitive evolutionary pressure on disordered proteins motivated us to undertake a comparative analysis of disordered regions across the yeast clade. We hypothesized that functionally important disordered regions, such as those present in GI hubs, would be conserved as disorder across species (that is, also disordered, even if the underlying amino acid sequence was different) independent of rate of evolution. We therefore assessed the conservation of disorder on the residue level, which was also recently addressed by Chen <it>et al</it>. <abbrgrp>
<abbr bid="B13">13</abbr>
<abbr bid="B14">14</abbr>
</abbrgrp>. Specifically, we predicted which residues were disordered for all <it>Saccharomyces cerevisiae </it>genes and their orthologs in the 23 species of the yeast clade using DISOPRED2 <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp>, an algorithm that has been shown to predict disordered regions reliably <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>. For each disordered residue, we defined a measure of conserved disorder as the percentage of orthologs in which that residue is disordered as well (Figure <figr fid="F2">2</figr>). We operationally define conserved disordered residues as those with greater than 50% of disorder conservation.</p>
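<p>The residue-level measure described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the alignment has already been reduced to one Boolean per column and species (True where the predictor calls the aligned residue disordered), and it assumes that an alignment gap in an ortholog counts as not disordered. The function name disorder_conservation and the toy alignment are hypothetical.</p>

```python
from typing import List, Optional


def disorder_conservation(
    reference: List[bool], orthologs: List[List[bool]]
) -> List[Optional[float]]:
    """Fraction of orthologs that are disordered at each disordered reference residue.

    reference: per-column disorder calls for the reference (S. cerevisiae) protein.
    orthologs: one row per ortholog, aligned to the same columns.
               Assumption: gap columns are encoded as False.
    Returns None for columns where the reference residue is not disordered,
    because the score is only defined for disordered residues.
    """
    if not orthologs:
        raise ValueError("at least one ortholog is required")
    scores: List[Optional[float]] = []
    for col, ref_disordered in enumerate(reference):
        if not ref_disordered:
            scores.append(None)
            continue
        hits = sum(1 for row in orthologs if row[col])
        scores.append(hits / len(orthologs))
    return scores


# Toy alignment: 4 columns, 4 orthologs (illustrative values only).
reference = [True, True, False, True]
orthologs = [
    [True, False, False, True],
    [True, False, True, True],
    [True, True, False, False],
    [False, False, False, True],
]
scores = disorder_conservation(reference, orthologs)
print(scores)  # [0.75, 0.25, None, 0.75]
```

<p>In this toy case the first and last residues would count as conserved disorder under a 50% cutoff, whereas the second would not.</p>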
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Two forms of conservation on disorder</p></caption><text>
   <p><b>Two forms of conservation on disorder</b>. Schematic of computing disorder conservation and amino acid (AA) sequence conservation. After alignment, the percentage of sequences in which a residue is disordered is computed. Similarly, we compute the percentage of sequences in which the amino acid itself is conserved. A residue is considered to be conserved disorder if the property of disorder is conserved in &#8805; 50% of species and sequentially conserved if the amino acid is conserved in &#8805; 50% of species. Disordered residues in which both sequence and disorder are conserved are referred to as <it>constrained disorder</it>. Disordered residues in which disorder is conserved but not the amino acid sequence are referred to as <it>flexible disorder</it>. Residues that are disordered in <it>S. cerevisiae </it>but are not cases of conserved disorder are referred to as <it>non-conserved disorder</it>.</p>
</text><graphic file="gb-2011-12-2-r14-2"/></fig>
<p>Consistent with the general observations by Chen and co-workers <abbrgrp>
<abbr bid="B13">13</abbr>
<abbr bid="B14">14</abbr>
</abbrgrp>, we found that there is a surprisingly high rate of conservation of disordered regions: over 50% of disordered regions are conserved through 90% of the orthologs considered. Notably, disorder is conserved in many regions even where the specific amino acids are not conserved in the same regions, which explains the elevated dN/dS that has been previously associated with disorder <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp> (Figure <figr fid="F2">2</figr>). However, consistent with the stability of disorder across the yeast clade, we find that changes of amino acids in disordered regions are biased towards hydrophilic residues associated with disordered regions and away from hydrophobic residues (Figure S2 in Additional file <supplr sid="S1">1</supplr>). This result suggests that, despite a high evolutionary rate at the sequence level, there is substantial evolutionary pressure to keep these regions disordered.</p>
</sec>
<sec>
<st>
<p>Disorder can be systematically classified</p>
</st>
<p>Regions in which disorder is highly conserved across the yeast clade exhibit a wide range of amino acid conservation rates (Figure <figr fid="F3">3</figr>). We reasoned that the degree of constraint on the precise underlying sequence (as opposed to the more general property of disorder) might highlight distinct subclasses of functional disorder. To test this hypothesis, we divided conserved disordered regions into those where the underlying amino acid sequence is also conserved ('constrained disorder'), and the regions where there appears to be selection on the structural property of disorder itself rather than the specific sequence ('flexible disorder'; Materials and methods; Figure <figr fid="F2">2</figr>). Disordered residues that were not conserved across the yeast clade were considered as a separate, third class ('non-conserved disorder'; Figure S3 in Additional file <supplr sid="S1">1</supplr>). It is important to note that these results do not depend on the disorder predictor algorithm and core results were qualitatively replicated using DisEMBL <abbrgrp>
<abbr bid="B16">16</abbr>
</abbrgrp> instead of DISOPRED2 (Figure S4 in Additional file <supplr sid="S1">1</supplr>). Furthermore, the three classes also appear to be robust to various perturbations of the particular parameter choices of the method (Figures S5, S6, S7, and S8 in Additional file <supplr sid="S1">1</supplr>). In addition, flexible disorder was more robust to random simulated mutations (Figure S9 in Additional file <supplr sid="S1">1</supplr>), which is notable given the general fragility of disorder to mutation reported by <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>.</p>
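<p>The three-way partition can be written as a simple decision rule. The sketch below is hedged: it follows the cutoffs given in the legend of Figure 2 (at least 50% for both disorder and amino acid conservation), whereas the main text speaks of greater than 50% for disorder conservation, so the comparison operator is an assumption. The function name classify_residue and the example scores are hypothetical.</p>

```python
from collections import Counter


def classify_residue(disorder_cons: float, aa_cons: float, threshold: float = 0.5) -> str:
    """Assign a disordered reference residue to one of the three classes.

    disorder_cons: fraction of orthologs in which the residue is disordered.
    aa_cons: fraction of orthologs in which the amino acid is conserved.
    threshold: cutoff for both scores (0.5, following the Figure 2 legend).
    """
    if disorder_cons >= threshold:
        if aa_cons >= threshold:
            return "constrained"
        return "flexible"
    return "non-conserved"


# Hypothetical (disorder conservation, amino acid conservation) pairs for the
# disordered residues of one protein; the values are illustrative only.
residues = [(0.9, 0.8), (0.9, 0.2), (0.6, 0.5), (0.3, 0.9), (0.1, 0.1)]
counts = Counter(classify_residue(d, a) for d, a in residues)
print(dict(counts))
```

<p>Dividing each count by the protein length would give per-protein percentages of the kind correlated with genomic features later in the text; the exact denominator used by the authors is not stated in this section.</p>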
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Densities of disorder- and amino acid-conserved residues by their scores</p></caption><text>
   <p><b>Densities of disorder- and amino acid-conserved residues by their scores</b>. Densities of disorder and amino acid conservation scores across all alignments of approximately 5,000 orthologous groups from 23 yeast species. <b>(a) </b>Histogram of the amino acid (AA) conservation scores. <b>(b) </b>Histogram of disorder conservation scores. <b>(c) </b>Two-dimensional histogram of both amino acid and disorder conservation scores.</p>
</text><graphic file="gb-2011-12-2-r14-3"/></fig>
<p>The three classes of disorder exhibit widely different properties (Figure <figr fid="F2">2b</figr>). First, while disorder is generally thought to be important in proteins with regulatory and signaling functions, we find that this is true only for flexible disorder. For instance, proteins enriched in flexible disorder have high phenotypic capacitance and are multifunctional. Moreover, they exhibit low expression coherence, that is, they are connectors in the cellular network, consistent with a regulatory role <abbrgrp>
<abbr bid="B18">18</abbr>
</abbrgrp>. Finally, flexible disorder is highly correlated with occurrence of linear motifs and GI degree, also consistent with signaling or regulatory roles. The respective associations for all the above properties with either constrained or non-conserved disorder are much weaker and, in most cases, not significant, suggesting that the regulatory properties of disorder are best captured by flexible disorder. Secondly, disordered proteins have recently been found to be expressed at a low level and have tightly controlled expression <abbrgrp>
<abbr bid="B4">4</abbr>
</abbrgrp>. We find this to be true only for proteins enriched in flexible disorder: flexible disorder is negatively correlated with gene expression level, while constrained disorder shows either a positive or no correlation depending on the inclusion of ribosomal proteins (Figure <figr fid="F4">4</figr>; Figure S7 in Additional file <supplr sid="S1">1</supplr>). Also, while genes enriched in non-conserved disorder appear to be expressed at a low level, there appears to be no evidence for tighter expression control as measured by half-life. Thirdly, a recent study found disordered proteins to exhibit high dosage sensitivity <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. We again find that this is a hallmark of flexible disorder (Figure <figr fid="F4">4</figr>), whereas constrained disorder is only weakly associated with this property. Non-conserved disorder shows little or much weaker association with most of these features, suggesting that the functional hallmarks of this class are less obvious. Indeed, we find that proteins enriched for non-conserved disorder have less confident disorder as scored by DISOPRED2 (Figure S10 in Additional file <supplr sid="S1">1</supplr>). However, our inability to identify functional roles for non-conserved disorder does not preclude the possibility of its functionality.</p>
<fig id="F4"><title><p>Figure 4</p></title><caption><p>Properties associated with types of disorder</p></caption><text>
   <p><b>Properties associated with types of disorder</b>. Correlation coefficients of different genomic features with percent constrained disorder, percent flexible disorder and percent non-conserved disorder. Error bars represent 95% confidence intervals.</p>
</text><graphic file="gb-2011-12-2-r14-4"/></fig>
<p>Because of their recognized importance for signaling pathways, we next turned our attention towards phosphosites and linear motifs. It has been noted previously that phosphosites and other recognized linear motifs often appear in disordered regions of proteins <abbrgrp>
<abbr bid="B19">19</abbr>
</abbrgrp>. As these motifs are crucial for signaling pathways, their occurrence in these regions certainly has strong functional consequences. In a detailed analysis at the residue level, we find that disorder conservation is strongly correlated with the placement of phosphosites (Figure <figr fid="F5">5a</figr>). In particular, we find that the relative density of phosphosites increases dramatically for residues with higher disorder conservation (Figure <figr fid="F5">5b</figr>). Conversely, the correlation of phosphosite density with amino acid conservation is weak (Figure <figr fid="F5">5c</figr>). Likewise, we find similar results for linear motif placement (Figure S11 in Additional file <supplr sid="S1">1</supplr>). In both cases, the partial correlation with conserved disorder, when controlling for amino acid conservation, remains strong, while the partial correlation between amino acid conservation and phosphosite or linear motif density disappears when controlling for conserved disorder. Conversely, neither linear motifs nor phosphosites show enrichment in residues that exhibit non-conserved disorder, which suggests that non-conserved disorder may not be functionally relevant in this context.</p>
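<p>The logic of the partial correlation argument can be illustrated with a small sketch. It assumes the standard first-order partial correlation computed from Pearson coefficients; the statistic actually used is described in Materials and methods and may differ (for example, it may be rank-based). The variable names and toy vectors are hypothetical and are not data from this study.</p>

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after controlling for z (first-order formula)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy vectors: both x and y depend on z plus independent components.
disorder_cons = [1.0, -1.0, -1.0, 1.0]   # z: variable being controlled for
aa_cons = [2.0, 0.0, -2.0, 0.0]          # x: z plus a component unrelated to y
site_density = [2.0, -2.0, 0.0, 0.0]     # y: z plus a component unrelated to x

raw = pearson(aa_cons, site_density)
controlled = partial_corr(aa_cons, site_density, disorder_cons)
print(round(raw, 3), round(controlled, 3))
```

<p>Here the raw correlation between the two toy variables is 0.5, yet it vanishes once the shared dependence on the third variable is removed, which is the pattern the text reports for amino acid conservation when controlling for conserved disorder.</p>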
<fig id="F5"><title><p>Figure 5</p></title><caption><p>Properties associated with types of disorder</p></caption><text>
   <p><b>Properties associated with types of disorder</b>. <b>(a) </b>Heatmap of enrichment (density over background) of phosphosites in terms of disorder and amino acid conservation. <b>(b) </b>Partial correlation of phosphosite density and disorder conservation with respect to amino acid conservation (see Materials and methods). <b>(c) </b>Partial correlation of phosphosite density and conserved amino acid sequence with respect to disorder conservation.</p>
</text><graphic file="gb-2011-12-2-r14-5"/></fig>
<p>Given our comparative genome-based classification of disorder, we revisited our earlier observation regarding the correlation between protein disorder and multi-functionality on GI hubs. As described earlier, we observed that within the set of GI hubs (&gt; 90th percentile in GI degree), disorder of the gene product is a strong predictor of multi-functionality (r = 0.22, <it>P </it>&lt; 10<sup>-12</sup>; Figure <figr fid="F1">1b</figr>), while this trend does not hold on the set of non-GI hubs (&lt; 50th percentile in GI degree). Thus, we reasoned that the disorder present in GI hubs may exhibit different abundances across our classes. Indeed, we found evidence that disordered regions tend to be significantly more conserved among GI hubs than non-hubs (<it>P </it>&lt; 10<sup>-6</sup>; Figure S12 and Table S1 in Additional file <supplr sid="S1">1</supplr>). Furthermore, flexible disorder appears to account for the correlation between disorder and multi-functionality observed among the GI hubs, since controlling for flexible disorder destroys the correlation (<it>P </it>&gt; 0.5), while a strong correlation is maintained when controlling for the level of constrained disorder (r = 0.15, <it>P </it>&lt; 0.01).</p>
<p>Interestingly, the set of highly disordered GI hubs is also significantly enriched for protein interaction hubs that bind temporally disparate partners (singlish-interface hubs as defined in <abbrgrp>
<abbr bid="B20">20</abbr>
</abbrgrp>) when compared with disordered non-hubs or non-disordered hubs (<it>P </it>&lt; 10<sup>-5</sup>; Figure S13 in Additional file <supplr sid="S1">1</supplr>). In fact, the distinction between flexible and constrained disorder can be used to differentiate between singlish-interface hubs and the so-called multi-interface hubs, which typically bind their partners simultaneously (as defined in <abbrgrp>
<abbr bid="B20">20</abbr>
</abbrgrp>): singlish-interface hubs have more flexible disorder than multi-interface hubs (<it>P </it>&lt; 10<sup>-13</sup>), while there is no significant difference in terms of constrained disorder (<it>P </it>&gt; 0.1; Figure <figr fid="F6">6</figr>).</p>
<fig id="F6"><title><p>Figure 6</p></title><caption><p>Singlish and multi-interface hubs have different proportions of flexible and constrained disorder</p></caption><text>
   <p><b>Singlish and multi-interface hubs have different proportions of flexible and constrained disorder</b>. The mean proportion of flexible disorder and constrained disorder in singlish-interface and multi-interface protein interaction hubs. While both have a similar level of constrained disorder, singlish hubs are heavily enriched for flexible disorder. Error bars represent 95% confidence intervals.</p>
</text><graphic file="gb-2011-12-2-r14-6"/></fig>
</sec>
<sec>
<st>
<p>Flexible and constrained disorder show different functional associations</p>
</st>
<p>The above results indicate that flexible disorder and constrained disorder are markedly different phenomena based on a variety of physiological and phenotypic data. On the one hand, flexible disorder corresponds to what we refer to as 'classic disorder': these are intrinsically unstructured regions, which evolve rapidly and present short linear motifs to signaling domains or protein kinases. Flexible disorder is thus a central player in signaling, which is confirmed by a GO enrichment analysis: all top enriched terms are related to regulation, including transcription factors, chromatin modifiers, signaling pathways and DNA-binding proteins (Figure <figr fid="F7">7</figr>; Table S2 in Additional file <supplr sid="S2">2</supplr>).</p>
<fig id="F7"><title><p>Figure 7</p></title><caption><p>Disorder splits into three distinct phenomena</p></caption><text>
   <p><b>Disorder splits into three distinct phenomena</b>. Functional enrichment maps of proteins enriched in flexible disorder versus constrained disorder. The area of each rectangle is proportional to the representation of that type of disorder in the alignments. Related GO terms are grouped based on gene overlap (see Materials and methods; Figures S20, S21 and S22 in Additional file <supplr sid="S1">1</supplr>).</p>
</text><graphic file="gb-2011-12-2-r14-7"/></fig>
<suppl id="S2">
<title>
<p>Additional file 2</p>
</title>
<text>
<p>
<b>Functional enrichment</b>. This file contains three tables: a table of GO terms (function, process and component) that are enriched for flexible and constrained disorder, a table of enrichments for domains in regions of constrained disorder and a table of enrichments for domains in regions of non-conserved disorder.</p>
</text>
<file name="gb-2011-12-2-r14-S2.XLS">
   <p>Click here for file</p>
</file>
</suppl>
<p>In contrast, proteins with a high level of constrained disorder exhibit dramatically different functional characteristics. Constrained disordered proteins are enriched in genes involved in ribosome biogenesis or function, RNA binding and protein chaperone activity (Figure <figr fid="F7">7</figr>; Table S2 in Additional file <supplr sid="S2">2</supplr>). Some of these functions have been previously associated with conserved disorder <abbrgrp>
<abbr bid="B14">14</abbr>
</abbrgrp>, but our analysis suggests they are even more specifically associated with regions that are under tight sequence constraint, which is not generally true of regions that have properties characteristic of 'classic' disorder.</p>
<p>Given the dichotomy in functions arising from the presence or lack of sequence constraint, we explored the positions of these regions with respect to predicted domains. We find that flexible disordered residues rarely reside inside structured domains, consistent with the idea that they would localize to loops to present highly flexible linear motifs to their signaling partners. Conversely, constrained disordered residues lie within domains significantly more frequently than flexible residues, though occurring well below the level of the genomic background (Figures S14 and S15 in Additional file <supplr sid="S1">1</supplr>). The particular domains in which constrained disorder residues are enriched confirmed the location of these regions within RNA-binding ribosomal proteins and protein chaperones (GroEL-like chaperone, ATPase, Translation protein SH3-like, AAA ATPase, core; Table S3 in Additional file <supplr sid="S2">2</supplr>).</p>
<p>The highly distinct functional and positional characteristics associated with these two classes of disorder suggest that they are very different phenomena. On the one hand, flexible disorder is closest to what is canonically understood as protein disorder, that is, these are structurally flexible, fast evolving sequences with involvement in signaling. A good example of flexible disorder is found in the serine-arginine protein kinase Sky1 (YMR216C), similar to human SRPK1, which regulates proteins involved in mRNA metabolism and cation homeostasis. The region containing residues 712-737, conserved for disorder across orthologs but not sequence, is located at the end of the kinase (Figure S16 in Additional file <supplr sid="S1">1</supplr>). This carboxy-terminal disordered loop interacts with the activation loop of the kinase <abbrgrp>
<abbr bid="B21">21</abbr>
</abbrgrp> and is likely involved in the regulation of kinase activity. Likewise, the corresponding region exhibits flexible disorder in many of the related cyclin-dependent kinases <abbrgrp>
<abbr bid="B22">22</abbr>
</abbrgrp>. For example, in Bur1, this region contains flexible disorder and also harbors multiple phosphosites and linear motifs, underlining its importance in signaling (Figure S17 in Additional file <supplr sid="S1">1</supplr>).</p>
<p>On the other hand, our results suggest that constrained disorder can often adopt a fixed conformation. As has been previously suggested, some disordered proteins are likely to undergo disorder-to-order transitions upon binding of their targets <abbrgrp>
<abbr bid="B3">3</abbr>
</abbrgrp>, and we speculate this is a hallmark of the constrained disorder class. In the case of ribosomal biogenesis and RNA-binding structural proteins, they become structured upon binding RNA. This imposes a high degree of local structural constraint on them, which results in elevated constraint on the actual amino acid sequence. For instance, in Rpl5 a region of constrained disorder can be observed immediately before an alpha helix that forms the carboxy-terminal end of the amino acid sequence (Figure S18 in Additional file <supplr sid="S1">1</supplr>). The role of this region was specifically investigated in <abbrgrp>
<abbr bid="B23">23</abbr>
</abbrgrp>, and they report strong evidence for a disorder-to-order transition of this region upon the binding of Rpl5 to 5S rRNA. We also found an enrichment for constrained disorder among protein chaperones, where disordered regions appear to be involved in the binding of client proteins. For example, the HSP90 heat shock protein (HSC82/HSP82) contains long regions of constrained disorder (Figure S19 in Additional file <supplr sid="S1">1</supplr>). In particular, the constrained disordered region from 590-600 is conserved throughout the bacterial kingdom, is localized at the inner surface of the barrel-shaped protein and has been directly implicated in the chaperone activity of this protein. It has been previously speculated that this disordered region may play a role in entropy transfer and the refolding of clients through a disorder-to-order transition <abbrgrp>
<abbr bid="B24">24</abbr>
</abbrgrp>. However, there is little direct experimental evidence about the precise role of disorder in chaperone function. We hypothesize that, in general, the tight sequence conservation of constrained disorder is required in regions that assume a structured conformation, whether that conformation is assumed only transiently, as in the case of HSP90, or more permanently, as in the case of Rpl5.</p>
</sec>
</sec>
<sec>
<st>
<p>Discussion</p>
</st>
<p>In this work, we show that protein disorder can be partitioned into three biophysically and biologically distinct phenomena. The first two, flexible and constrained disorder, capture different functional characteristics: flexible disorder appears to be strongly associated with signaling and regulation while constrained disorder is associated with chaperones and ribosomal proteins. Flexible disorder appears to be largely responsible for many of the characteristics traditionally associated with disordered regions. On the other hand, non-conserved disorder does not seem to have obvious functional hallmarks by our analysis. While we discovered these categories using a comparative genomics approach that exploits evolutionary signatures, they ultimately are likely to correspond to biophysically different phenomena. In a similar fashion, modern secondary structure prediction methods make use of evolutionary information in the form of sequence profiles, even though the properties they predict are biophysical.</p>
<p>Several classification schemes for protein disorder have been described in previous studies, including categorizations based on structural descriptions <abbrgrp>
<abbr bid="B3">3</abbr>
<abbr bid="B25">25</abbr>
</abbrgrp>, molecular function <abbrgrp>
<abbr bid="B26">26</abbr>
</abbrgrp>, or data-driven unsupervised partitions <abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp>. In particular, the functional characterization put forth in <abbrgrp>
<abbr bid="B26">26</abbr>
</abbrgrp> (Figure S24 in Additional file <supplr sid="S1">1</supplr>) has an interesting overlap with the flexible and constrained categories defined here. Tompa <abbrgrp>
<abbr bid="B26">26</abbr>
</abbrgrp> first distinguishes proteins whose disordered regions perform a purely mechanical function (for example, entropic chains) from those that have the capacity to bind other proteins or small molecules (recognition). A similar division is made by <abbrgrp>
<abbr bid="B25">25</abbr>
</abbrgrp> between disordered regions that can at least transiently fold ('folders') and regions that never fold ('unfolders'). There the authors claim that entropic chains are necessarily unfolders, while recognition regions are necessarily folding regions. The yeast nucleoporin <it>NUP2</it>, a canonical example of entropic chains, appears to contain long regions of flexible disorder. In fact, 22% of its residues are cases of flexible disorder (the background rate is 9%) while only 12% are cases of constrained disorder (the background rate is 7%). This is consistent with the fact that the role of such regions does not require strict residue conservation, and it is tempting to speculate that other entropic chains are also cases of flexible disorder.</p>
<p>Despite some evidence that flexible disordered regions as defined here may correspond to entropic chains, the previously defined category of recognition proteins (folders) appears to contain clear cases of both flexible and constrained disorder. In particular, the subcategory of 'display sites' seems to correspond to our notion of flexible disorder, given its enrichment for linear motifs and association with signaling proteins. These appear to be cases of a relatively short recognition motif contained in a longer disordered region <abbrgrp>
<abbr bid="B28">28</abbr>
</abbrgrp>, and it has been previously observed that, while functional recognition motifs are well conserved, the surrounding disordered region may evolve quickly <abbrgrp>
<abbr bid="B29">29</abbr>
</abbrgrp>. Thus, these regions appear to consist primarily of flexible disorder since only the motif is conserved while the surrounding disordered region is under less selective constraint and is presumably important in facilitating the promiscuous binding required for signaling proteins.</p>
<p>Another class of proteins associated with promiscuous protein binding, chaperone proteins, is clearly enriched for constrained disorder. While the importance of disordered regions in the functioning of chaperones is well established (for example, <abbrgrp>
<abbr bid="B30">30</abbr>
<abbr bid="B31">31</abbr>
</abbrgrp>), the role played by disordered regions in chaperones is still the subject of active investigation <abbrgrp>
<abbr bid="B32">32</abbr>
</abbrgrp>. There are a number of hypotheses regarding the roles of disorder in protein chaperones, including the idea that disordered chaperones may directly or indirectly stabilize client proteins due to their high hydrophilicity, or the notion that disordered chaperones may help in shielding unfolded proteins from interactions with other molecules, and the aforementioned entropy transfer hypothesis (see <abbrgrp>
<abbr bid="B32">32</abbr>
</abbrgrp> for a comprehensive review). Our study suggests that, regardless of the precise function of the disordered regions in chaperones, it differs from the role that disorder plays in signaling proteins.</p>
<p>Finally, the other major category of recognition proteins, 'permanent binding', appears to, at least in part, be populated by regions of constrained disorder. This is supported by the enrichment for ribosomal proteins that are known to fold upon binding other ribosomal proteins and rRNA. Again, we suspect that cases where disordered regions fold permanently upon binding other molecules will be enriched for constrained disorder due to increased selective pressure required to maintain a stable bond.</p>
<p>Another classification scheme for disordered regions was put forth in <abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp> based on an unsupervised, data-driven partitioning of 145 disordered proteins, which identified three 'flavors' of disorder. The group of proteins described as 'flavor V' is highly enriched for ribosomal proteins and resembles the enrichments of constrained disorder defined here, while 'flavor S' was highly enriched for protein binding functions similar to regions of flexible disorder. However, these categories only weakly resemble the flexible and constrained disorder defined here as evidenced by their apparently distinct amino acid distributions (compare Figure <figr fid="F1">1</figr> of <abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp> and Figure S25 in Additional file <supplr sid="S1">1</supplr>). These differences may stem from the fact that the previous classification scheme used an unsupervised algorithm and a limited set of proteins, and, most importantly, trained on whole proteins. In other words, it assumed that all disorder in one protein is of the same category, an assumption we are not making.</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>In this work, we show that protein disorder can be partitioned into three biophysically and biologically distinct phenomena. The first two, flexible ('classic') and constrained disorder, capture different functional characteristics. On the other hand, non-conserved disorder does not seem to have functional roles. Our results have wide-ranging consequences for the prediction of disordered regions and for the functional interpretation of disordered regions in cellular networks. Future experimental work may confirm the distinct biophysical properties of constrained and flexible disorder we are predicting here. Importantly, our analysis framework allows for much more detailed functional interpretations of disordered regions. Finally, our new categories of disorder will help in the refinement of disorder prediction algorithms.</p>
</sec>
<sec>
<st>
<p>Materials and methods</p>
</st>
<sec>
<st>
<p>Description of gene/protein level features and correlation analysis</p>
</st>
<p>Throughout this paper, correlations were measured using Pearson's correlation coefficient <abbrgrp>
<abbr bid="B33">33</abbr>
</abbrgrp> and calculated using Matlab's <it>corrcoef </it>function. Error bounds are the 95% confidence interval.</p>
<p>In the following section, we describe the data sets used throughout to characterize aspects of disorder. Several of these features were previously described in <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>.</p>
<sec>
<st>
<p>Genetic interaction degree</p>
</st>
<p>This was the same measure as the negative GI degree in <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>. Specifically, it is the number of negative interactions that each array gene has, where negative interactions are defined as those with a score &#949; &lt; -0.08 and <it>P </it>&lt; 0.05.</p>
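<p>The counting rule above can be sketched as follows. This is a minimal illustration, not the authors' code: the tuple layout of the interaction records and the gene names are hypothetical, and only the two thresholds are taken from the text.</p>

```python
from collections import Counter

# Thresholds quoted in the text: epsilon below -0.08 and P below 0.05.
EPSILON_CUTOFF = -0.08
P_CUTOFF = 0.05


def negative_gi_degree(interactions):
    """Count negative genetic interactions per array gene.

    `interactions` is an iterable of (array_gene, query_gene, epsilon, p_value)
    tuples; this record layout is an assumption made for illustration.
    """
    degree = Counter()
    for array_gene, _query_gene, epsilon, p_value in interactions:
        # Both conditions must hold for the interaction to count as negative.
        if EPSILON_CUTOFF > epsilon and P_CUTOFF > p_value:
            degree[array_gene] += 1
    return degree


# Hypothetical records, for illustration only.
example = [
    ("geneA", "geneX", -0.12, 0.01),  # counted
    ("geneA", "geneY", -0.05, 0.01),  # epsilon not negative enough
    ("geneA", "geneZ", -0.20, 0.20),  # P-value too high
    ("geneB", "geneX", -0.09, 0.04),  # counted
]
degrees = negative_gi_degree(example)
```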
</sec>
<sec>
<st>
<p>Protein disorder</p>
</st>
<p>Protein disorder was derived using the software Disopred2 <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp>. We define structured proteins to be those with less than 10% disorder and disordered proteins to be those with greater than 30% disorder, following <abbrgrp>
<abbr bid="B4">4</abbr>
</abbrgrp>.</p>
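<p>As a small sketch of the two thresholds above, the function below classifies a protein from a per-residue disorder string. The string encoding ('*' for a disordered residue, '.' for an ordered one) and the label 'intermediate' for proteins between the two cutoffs are assumptions made for illustration; they are not taken from Disopred2 output or from the text.</p>

```python
def classify_protein(disorder_string):
    """Classify a protein by its fraction of disordered residues.

    `disorder_string` uses a hypothetical encoding: '*' marks a residue
    predicted to be disordered and '.' marks an ordered residue.
    """
    if not disorder_string:
        raise ValueError("empty prediction string")
    fraction = disorder_string.count("*") / len(disorder_string)
    if 0.10 > fraction:      # less than 10% disorder
        return "structured"
    if fraction > 0.30:      # greater than 30% disorder
        return "disordered"
    return "intermediate"    # label assumed here; the text leaves this range unnamed
```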
</sec>
<sec>
<st>
<p>dN/dS Ratio</p>
</st>
<p>We computed the average dN/dS ratio for <it>S. cerevisiae </it>in comparison to three other yeast species (<it>Saccharomyces paradoxus</it>, <it>Saccharomyces bayanus </it>and <it>Saccharomyces mikatae</it>). Sequences were aligned using MUSCLE <abbrgrp>
<abbr bid="B34">34</abbr>
</abbrgrp> and dN/dS ratios were computed using PAML <abbrgrp>
<abbr bid="B35">35</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Expression level</p>
</st>
<p>The expression level of a gene, measured as the average number of mRNA copies of each transcript per cell, was taken from <abbrgrp>
<abbr bid="B36">36</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Half-life</p>
</st>
<p>The half-life of a gene was the half-life of its mRNA measured in minutes and reported in <abbrgrp>
<abbr bid="B37">37</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Phenotypic capacitance</p>
</st>
<p>The phenotypic capacitance reflects the variability in a panel of phenotypes induced by deletion of non-essential genes and was used directly from the Levy and Siegal study <abbrgrp>
<abbr bid="B38">38</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Multi-functionality</p>
</st>
<p>This is simply the number of GO process annotations for each gene, restricted to the functionally distinct set of GO terms described in <abbrgrp>
<abbr bid="B39">39</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Expression coherence score</p>
</st>
<p>This is the clustering coefficient calculated on the MEFIT <abbrgrp>
<abbr bid="B40">40</abbr>
</abbrgrp> combined network, where edges connect gene pairs with a score higher than 2 (approximately the 95th percentile). Let <it>E</it>(N<sub>i</sub>,N<sub>j</sub>) be 1 if there is an edge between N<sub>i </sub>and N<sub>j </sub>and zero otherwise. The clustering coefficient for a gene G with n neighbors {N<sub>i</sub>} is:</p>
<p>
<display-formula>
<graphic file="gb-2011-12-2-r14-i1.gif"/>
</display-formula>
</p>
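<p>For illustration, a clustering coefficient of this kind can be computed as sketched below. The sketch assumes the standard local clustering coefficient, that is, the number of edges among the n neighbors of G divided by the n(n - 1)/2 possible neighbor pairs; the normalization actually used is the one given in the formula above, and the edge-list input format is hypothetical.</p>

```python
from itertools import combinations


def clustering_coefficient(gene, edges):
    """Fraction of neighbor pairs of `gene` that are themselves connected.

    `edges` is an iterable of undirected (gene_a, gene_b) pairs, assumed to
    have already been thresholded (score higher than 2 in the text).
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    neighbors = adjacency.get(gene, set()) - {gene}
    n = len(neighbors)
    if 2 > n:
        # With fewer than two neighbors there are no pairs; return 0 by convention.
        return 0.0

    # Sum E(N_i, N_j) over all unordered pairs of neighbors.
    linked = sum(
        1 for a, b in combinations(sorted(neighbors), 2) if b in adjacency[a]
    )
    return 2.0 * linked / (n * (n - 1))


# Toy network: G has neighbors A, B and C, and only A-B are linked.
toy_edges = [("G", "A"), ("G", "B"), ("G", "C"), ("A", "B")]
toy_score = clustering_coefficient("G", toy_edges)
```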
</sec>
<sec>
<st>
<p>Linear motifs</p>
</st>
<p>Linear motifs were found using Scansite <abbrgrp>
<abbr bid="B41">41</abbr>
</abbrgrp> on the most stringent setting.</p>
</sec>
</sec>
<sec>
<st>
<p>Conserved disorder</p>
</st>
<sec>
<st>
<p>Defining conservation of disorder and sequence residues from the yeast clade</p>
</st>
<p>Each of 5,025 orthologous groups across 23 species in the yeast clade <abbrgrp>
<abbr bid="B42">42</abbr>
</abbrgrp> was multiple-aligned by MAFFT <abbrgrp>
<abbr bid="B43">43</abbr>
</abbrgrp> with default parameters. The amino acid conservation score (A) of each position in each alignment was calculated and binned as follows:</p>
<p>
<display-formula>
<graphic file="gb-2011-12-2-r14-i2.gif"/>
</display-formula>
</p>
<p>where <it>a</it>
<sub>
<it>i </it>
</sub>represents one of the 20 amino acid symbol indicator functions at an alignment position in the <it>k</it>
<sup>
<it>th </it>
</sup>protein sequence, and <it>N </it>stands for the total number of protein sequences aligned.</p>
<p>
<display-formula>
<graphic file="gb-2011-12-2-r14-i3.gif"/>
</display-formula>
</p>
<p>For the disorder conservation score (D), each alignment position was overlaid with the disorder symbol predicted by Disopred2 <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp> with default parameters and its conservation was calculated and binned as follows:</p>
<p>
<display-formula>
<graphic file="gb-2011-12-2-r14-i4.gif"/>
</display-formula>
</p>
<p>where <it>d </it>represents the disorder indicator function at an alignment position in the <it>k</it>
<sup>
<it>th </it>
</sup>protein sequence.</p>
<p>
<display-formula>
<graphic file="gb-2011-12-2-r14-i5.gif"/>
</display-formula>
</p>
<p>We only considered proteins for which at least ten orthologs were available, and residue positions where at least five of those orthologs were aligned. All orthologous sequence- and disorder-overlaid alignments are displayed with the Jalview applet <abbrgrp>
<abbr bid="B44">44</abbr>
</abbrgrp> and available at <abbrgrp>
<abbr bid="B45">45</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Structural conservation of disorder calculation</p>
</st>
<p>To calculate how disorder is structurally conserved despite changes at the amino acid level, we divided amino acids into a group associated with disorder and a group associated with structure (Table <tblr tid="T1">1</tblr>). This list was compiled based on <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>, where disorder-associated amino acids tend to be charged and hydrophilic while structure-associated amino acids tend to be neutral and hydrophobic. We considered all positions of orthologs of <it>S. cerevisiae </it>genes after alignment. If the amino acid differed between <it>S. cerevisiae </it>and an ortholog, we recorded whether the change was towards the disordered set of amino acids or the structured set of amino acids. Then we compared the amino acids in regions of conserved disorder, regions of non-conserved disorder and the background (all positions).</p>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>Description of the structural and disordered classes of amino acids</p></caption><tblbdy cols="2">
      <r>
         <c ca="left">
            <p>
               <b>Structural amino acids</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Disordered amino acids</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="2">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Cysteine</p>
         </c>
         <c ca="left">
            <p>Aspartic acid</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Tryptophan</p>
         </c>
         <c ca="left">
            <p>Methionine</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Tyrosine</p>
         </c>
         <c ca="left">
            <p>Lysine</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Isoleucine</p>
         </c>
         <c ca="left">
            <p>Arginine</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Phenylalanine</p>
         </c>
         <c ca="left">
            <p>Serine</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Valine</p>
         </c>
         <c ca="left">
            <p>Glutamine</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Leucine</p>
         </c>
         <c ca="left">
            <p>Proline</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Histidine</p>
         </c>
         <c ca="left">
            <p>Glutamic acid</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Threonine</p>
         </c>
         <c ca="left">
            <p>Alanine</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Asparagine</p>
         </c>
         <c ca="left">
            <p>Glycine</p>
         </c>
      </r>
   </tblbdy></tbl>
</sec>
</sec>
<sec>
<st>
<p>A systematic classification of disorder</p>
</st>
<sec>
<st>
<p>Definitions of constrained and flexible disorder</p>
</st>
<p>Conserved disorder: aligned positions that have D &#8805; 5, that is, are disordered in greater than or equal to 50% of aligned residues.</p>
<p>Flexible disorder: aligned positions that have D &#8805; 5 and A &lt; 5, that is, are disordered in greater than or equal to 50% of aligned residues but are conserved in less than 50% of aligned residues.</p>
<p>Constrained disorder: aligned positions that have D &#8805; 5 and A &#8805; 5, that is, are disordered in greater than or equal to 50% of aligned residues and conserved in greater than or equal to 50% of aligned residues.</p>
<p>Non-conserved disorder: aligned positions that have D &lt; 5, that is, are disordered in <it>S. cerevisiae </it>but are disordered in less than 50% of aligned residues.</p>
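<p>The four definitions above amount to a small decision rule, sketched here for illustration. The sketch assumes that D and A are the binned scores described earlier and that a flag recording whether the position is predicted to be disordered in <it>S. cerevisiae</it> is available; the function and argument names are hypothetical.</p>

```python
def classify_position(d_score, a_score, disordered_in_reference):
    """Assign an aligned position to a disorder class.

    d_score: binned disorder conservation score (D).
    a_score: binned amino acid conservation score (A).
    disordered_in_reference: True if the residue is predicted to be
        disordered in the reference species (S. cerevisiae in the text).
    Returns None for positions that fall into none of the classes.
    """
    if d_score >= 5:
        # Conserved disorder, split by sequence conservation.
        return "constrained" if a_score >= 5 else "flexible"
    if disordered_in_reference:
        # Disordered in the reference but in under half of the aligned residues.
        return "non-conserved"
    return None
```

<p>Under this rule, conserved disorder is simply the union of the flexible and constrained classes.</p>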
</sec>
<sec>
<st>
<p>Distribution of residues in two conservation spaces: phosphorylation and linear motifs</p>
</st>
<p>Phosphorylation sites of <it>S. cerevisiae</it>, <it>Schizosaccharomyces pombe</it> and <it>Candida albicans </it>were obtained from <abbrgrp>
<abbr bid="B46">46</abbr>
</abbrgrp> and a compilation of phosphosite datasets <abbrgrp>
<abbr bid="B46">46</abbr>
<abbr bid="B47">47</abbr>
<abbr bid="B48">48</abbr>
<abbr bid="B49">49</abbr>
<abbr bid="B50">50</abbr>
<abbr bid="B51">51</abbr>
</abbrgrp>. Linear motif sites were predicted by Scansite 2.0 on the <it>S. cerevisiae </it>data. Each distribution of feature-residue-odds-ratio (<it>O</it>
<sup>
<it>feature</it>
</sup>, termed as relative density in the main text and Figure <figr fid="F5">5a</figr>) is calculated in a similar way as the hub-odds-ratio:</p>
<p>
<display-formula>
<graphic file="gb-2011-12-2-r14-i6.gif"/>
</display-formula>
</p>
<p>where <it>F</it>
<sub>
<it>ij </it>
</sub>represents the number of feature residues (for example, phosphorylation sites) with <it>i</it>
<sup>
<it>th </it>
</sup>amino acid conservation score (A) and <it>j</it>
<sup>
<it>th </it>
</sup>disorder conservation score (D) in whole proteins of <it>S. cerevisiae</it>, <it>S. pombe</it> and <it>C. albicans </it>in the case of phosphorylation sites or <it>S. cerevisiae </it>in the case of linear motifs.</p>
<p>Each distribution of phosphorylation-site-odds-ratio (P) and linear-motif-odds-ratio (M) is displayed with the levelplot function in the lattice R package <abbrgrp>
<abbr bid="B52">52</abbr>
</abbrgrp>. Partial correlations of <it>O</it>
<sup>
<it>feature </it>
</sup>and A (or <it>O</it>
<sup>
<it>feature </it>
</sup>and D) were tested statistically with the pcor.test function <abbrgrp>
<abbr bid="B53">53</abbr>
</abbrgrp>, and plotted as residuals after controlling for the other variable by linear regression.</p>
</sec>
<sec>
<st>
<p>Distribution of two conserved residues in hubs: GI, protein-protein interaction and structural interaction network</p>
</st>
<p>The 50th or 90th percentile hubs of the GI and protein-protein interaction networks were defined as proteins with a degree greater than the 50th or 90th percentile of the respective degree distribution. Singlish- and multi-interface hubs were defined as described in <abbrgrp>
<abbr bid="B20">20</abbr>
</abbrgrp> with the structural interaction network recently updated with <it>i</it>Pfam corresponding to Pfam release 21.0 <abbrgrp>
<abbr bid="B54">54</abbr>
</abbrgrp>, 2,295 yeast Protein Data Bank files <abbrgrp>
<abbr bid="B55">55</abbr>
</abbrgrp> and 82,650 physical interactions in Biogrid 2.06 <abbrgrp>
<abbr bid="B56">56</abbr>
</abbrgrp>.</p>
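<p>The percentile-based hub definition can be sketched as follows. The nearest-rank percentile used here is an assumption, since the text does not state which percentile convention was applied, and the protein names and degrees are invented for illustration.</p>

```python
def percentile_threshold(degrees, percentile):
    """Nearest-rank percentile of a collection of degrees (integer percentile)."""
    ordered = sorted(degrees)
    if not ordered:
        raise ValueError("degree list is empty")
    # Ceiling of percentile * n / 100, computed with integer arithmetic.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def select_hubs(degree_by_protein, percentile):
    """Return proteins whose degree is strictly greater than the threshold."""
    threshold = percentile_threshold(degree_by_protein.values(), percentile)
    return {p for p, d in degree_by_protein.items() if d > threshold}


# Ten hypothetical proteins with degrees 1 to 10.
degree_by_protein = {"p%d" % i: i for i in range(1, 11)}
hubs_50 = select_hubs(degree_by_protein, 50)
hubs_90 = select_hubs(degree_by_protein, 90)
```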
</sec>
</sec>
<sec>
<st>
<p>Function of flexible versus constrained disorder</p>
</st>
<sec>
<st>
<p>GO enrichments</p>
</st>
<p>We identified GO term enrichments for each disorder type (flexible, constrained and non-conserved disorder) using the following method. The distribution of disorder type for each GO term was tested against the background distribution of that disorder type using the Wilcoxon rank sum test at <it>P</it>-value &lt; 0.05, where the <it>P</it>-value was adjusted for multiple hypothesis testing using the Benjamini-Hochberg false discovery rate correction. Terms enriched for either flexible or constrained disorder were considered enriched only if the distribution of ratios (Flexible/(Flexible + Constrained) or Constrained/(Flexible + Constrained), respectively) was significantly higher than the background for the term using a rank sum test with <it>P </it>&lt; 0.01. Thus, a term that was reported as enriched for flexible disorder was not also enriched for constrained disorder. Similarly, terms that initially were enriched for non-conserved disorder were tested to see if the ratio (Non-conserved disorder)/(Total disorder) was above the background of the term using a rank sum test with <it>P </it>&lt; 0.01. Enrichments for flexible and constrained disorder are contained in Additional file <supplr sid="S2">2</supplr>.</p>
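<p>The multiple-testing step in this procedure can be illustrated with a plain implementation of the Benjamini-Hochberg adjustment. This sketch covers only the correction of a list of raw <it>P</it>-values; the rank sum tests themselves are not reproduced, and the example values are invented.</p>

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if 0.05 > q]
```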
</sec>
<sec>
<st>
<p>Domain analysis</p>
</st>
<p>To define domains, we used the <it>domains.tab </it>file downloaded from the Saccharomyces Genome Database on 4 April 2010, which contains the results of an InterProScan using each <it>S. cerevisiae </it>protein sequence to query for domains/motifs from several databases. The file consists of 40,737 domains mapped onto the yeast proteome. To restrict our analysis to structural domains, we only considered 11,801 domains mapped using three methods: superfamily (SCOP database, 4,943 domains), HMMPfam (Pfam database, 4,422 domains) and Gene3D (CATH database, 2,336 domains). For each of the 3,680 genes with mapped domains and alignments, every position in the sequence was associated with two conservation scores: conservation in disorder (D) and conservation in amino acids (A) obtained from the sequence alignments of the yeast clade (see above). For a given point in the conservation grid (A, D), we counted the residues that overlap with at least one domain and the residues that did not overlap with any domain. We then computed the log odds ratio of these counts.</p>
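<p>One way to compute such a log odds ratio for a single grid cell is sketched below. The text does not spell out the reference against which the odds are compared, so the comparison with the odds over all residues, the base-2 logarithm and the argument names are all assumptions made for illustration.</p>

```python
import math


def domain_log_odds(in_domain, outside_domain, total_in_domain, total_outside_domain):
    """Log2 odds ratio of domain overlap for one (A, D) grid cell.

    in_domain / outside_domain: residue counts for the cell.
    total_in_domain / total_outside_domain: the same counts over all cells
    (an assumed background; the text does not name the reference).
    """
    counts = (in_domain, outside_domain, total_in_domain, total_outside_domain)
    if min(counts) == 0:
        raise ValueError("log odds ratio is undefined when a count is zero")
    cell_odds = in_domain / outside_domain
    background_odds = total_in_domain / total_outside_domain
    return math.log2(cell_odds / background_odds)


# Hypothetical counts: the cell is depleted for domain residues.
cell_score = domain_log_odds(10, 40, 500, 500)
```

<p>Negative values indicate a grid cell whose residues fall inside domains less often than expected from the background, and positive values indicate the opposite.</p>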
</sec>
<sec>
<st>
<p>Family analysis</p>
</st>
<p>For each domain, we computed the percentage of residues falling in each of the three categories: flexible disorder, constrained disorder and non-conserved disorder. We then compared the distributions of these percentages for all domains. We extracted the domains enriched in constrained disorder as opposed to flexible disorder by examining the ratio constrained/flexible (false discovery rate (FDR)-adjusted Wilcoxon <it>P</it>-value &lt; 0.05). The tests were performed with the function <it>wilcox.test </it>and the <it>P</it>-values were corrected for multiple testing with the function <it>p.adjust(method = 'fdr') </it>from the statistical programming environment R <abbrgrp>
<abbr bid="B52">52</abbr>
</abbrgrp>. The results of these enrichments are contained in Additional file <supplr sid="S2">2</supplr>.</p>
</sec>
<sec>
<st>
<p>Enrichment map</p>
</st>
<p>Enrichment maps were created using Cytoscape <abbrgrp>
<abbr bid="B57">57</abbr>
</abbrgrp> and the Enrichment Map plugin <abbrgrp>
<abbr bid="B58">58</abbr>
</abbrgrp>. The edges represent the value of the overlap coefficient (size of the intersection of both GO terms/size of the smaller GO term) with a cutoff at 0.3.</p>
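<p>The overlap coefficient and cutoff described above can be sketched as follows. This is not the Enrichment Map plugin's code; the treatment of the cutoff as inclusive and the GO term names and gene sets are assumptions made for illustration.</p>

```python
from itertools import combinations


def overlap_coefficient(genes_a, genes_b):
    """Size of the intersection divided by the size of the smaller set."""
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        return 0.0
    return len(a.intersection(b)) / min(len(a), len(b))


def enrichment_map_edges(term_to_genes, cutoff=0.3):
    """Return (term_1, term_2, score) for term pairs at or above the cutoff."""
    edges = []
    for t1, t2 in combinations(sorted(term_to_genes), 2):
        score = overlap_coefficient(term_to_genes[t1], term_to_genes[t2])
        if score >= cutoff:
            edges.append((t1, t2, score))
    return edges


# Hypothetical GO terms and gene sets.
terms = {
    "term_1": {"g1", "g2", "g3", "g4"},
    "term_2": {"g3", "g4", "g5"},
    "term_3": {"g6", "g7"},
}
edges = enrichment_map_edges(terms)
```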
</sec>
</sec>
</sec>
<sec>
<st>
<p>Abbreviations</p>
</st>
<p>GI: genetic interaction; GO: gene ontology; IDP: intrinsically disordered proteins.</p>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>JB, CM and PK conceived the project. JB, SH, MM and TK designed and implemented the analysis. JB, SH, MM, MC, BA, CB, GB, CM and PK wrote the paper. CM and PK helped to develop the approach and supervised the research. All authors read and approved the final manuscript.</p>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>The authors would like to thank Dr Yu Brandon Xia, Dr Ben Turk, Joshua Baller and Elizabeth Koch for their valuable insights and comments regarding this work. This project was in part supported by a grant from the Natural Sciences and Engineering Research Council (386671)(PMK), and a grant from the National Research Foundation of Korea (KRF-2009-352-C00140)(SH). JB and CLM are partially supported by funding from the University of Minnesota Biomedical Informatics and Computational Biology program, a seed grant from the Minnesota Supercomputing Institute, the National Institutes of Health (1R01HG005084-01A1) and the National Science Foundation (DBI 0953881). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>A theory of the structure and process of formation of antibodies.</p></title><aug><au><snm>Pauling</snm><fnm>L</fnm></au></aug><source>J Am Chem Soc</source><pubdate>1940</pubdate><volume>62</volume><fpage>2643</fpage><lpage>2657</lpage><xrefbib><pubid idtype="doi">10.1021/ja01867a018</pubid></xrefbib></bibl><bibl id="B2"><title><p>Prediction and functional analysis of native disorder in proteins from the three kingdoms of life.</p></title><aug><au><snm>Ward</snm><fnm>JJ</fnm></au><au><snm>Sodhi</snm><fnm>JS</fnm></au><au><snm>McGuffin</snm><fnm>LJ</fnm></au><au><snm>Buxton</snm><fnm>BF</fnm></au><au><snm>Jones</snm><fnm>DT</fnm></au></aug><source>J Mol Biol</source><pubdate>2004</pubdate><volume>337</volume><fpage>635</fpage><lpage>645</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jmb.2004.02.002</pubid><pubid idtype="pmpid" link="fulltext">15019783</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Intrinsically unstructured proteins and their functions.</p></title><aug><au><snm>Dyson</snm><fnm>HJ</fnm></au><au><snm>Wright</snm><fnm>PE</fnm></au></aug><source>Nat Rev Mol Cell Biol</source><pubdate>2005</pubdate><volume>6</volume><fpage>197</fpage><lpage>208</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nrm1589</pubid><pubid idtype="pmpid" link="fulltext">15738986</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Tight regulation of unstructured proteins: from transcript synthesis to protein degradation.</p></title><aug><au><snm>Gsponer</snm><fnm>J</fnm></au><au><snm>Futschik</snm><fnm>ME</fnm></au><au><snm>Teichmann</snm><fnm>SA</fnm></au><au><snm>Babu</snm><fnm>MM</fnm></au></aug><source>Science</source><pubdate>2008</pubdate><volume>322</volume><fpage>1365</fpage><lpage>1368</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1163581</pubid><pubid idtype="pmcid">2803065</pubid><pubid idtype="pmpid">19039133</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Intrinsic 
protein disorder and interaction promiscuity are widely associated with dosage sensitivity.</p></title><aug><au><snm>Vavouri</snm><fnm>T</fnm></au><au><snm>Semple</snm><fnm>JI</fnm></au><au><snm>Garcia-Verdugo</snm><fnm>R</fnm></au><au><snm>Lehner</snm><fnm>B</fnm></au></aug><source>Cell</source><pubdate>2009</pubdate><volume>138</volume><fpage>198</fpage><lpage>208</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.cell.2009.04.029</pubid><pubid idtype="pmpid" link="fulltext">19596244</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>The unfoldomics decade: an update on intrinsically disordered proteins.</p></title><aug><au><snm>Dunker</snm><fnm>AK</fnm></au><au><snm>Oldfield</snm><fnm>C</fnm></au><au><snm>Meng</snm><fnm>J</fnm></au><au><snm>Romero</snm><fnm>P</fnm></au><au><snm>Yang</snm><fnm>J</fnm></au><au><snm>Chen</snm><fnm>J</fnm></au><au><snm>Vacic</snm><fnm>V</fnm></au><au><snm>Obradovic</snm><fnm>Z</fnm></au><au><snm>Uversky</snm><fnm>V</fnm></au></aug><source>BMC Genomics</source><pubdate>2008</pubdate><volume>9</volume><issue>Suppl 2</issue><fpage>S1</fpage><xrefbib><pubid idtype="doi">10.1186/1471-2164-9-S2-S1</pubid></xrefbib></bibl><bibl id="B7"><title><p>Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms.</p></title><aug><au><snm>Romero</snm><fnm>PR</fnm></au><au><snm>Zaidi</snm><fnm>S</fnm></au><au><snm>Fang</snm><fnm>YY</fnm></au><au><snm>Uversky</snm><fnm>VN</fnm></au><au><snm>Radivojac</snm><fnm>P</fnm></au><au><snm>Oldfield</snm><fnm>CJ</fnm></au><au><snm>Cortese</snm><fnm>MS</fnm></au><au><snm>Sickmeier</snm><fnm>M</fnm></au><au><snm>LeGall</snm><fnm>T</fnm></au><au><snm>Obradovic</snm><fnm>Z</fnm></au><au><snm>Dunker</snm><fnm>AK</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2006</pubdate><volume>103</volume><fpage>8390</fpage><lpage>8395</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0507916103</pubid><pubid 
idtype="pmcid">1482503</pubid><pubid idtype="pmpid">16717195</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Intrinsic disorder and functional proteomics.</p></title><aug><au><snm>Radivojac</snm><fnm>P</fnm></au><au><snm>Iakoucheva</snm><fnm>LM</fnm></au><au><snm>Oldfield</snm><fnm>CJ</fnm></au><au><snm>Obradovic</snm><fnm>Z</fnm></au><au><snm>Uversky</snm><fnm>VN</fnm></au><au><snm>Dunker</snm><fnm>AK</fnm></au></aug><source>Biophys J</source><pubdate>2007</pubdate><volume>92</volume><fpage>1439</fpage><lpage>1456</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1529/biophysj.106.094045</pubid><pubid idtype="pmcid">1796814</pubid><pubid idtype="pmpid">17158572</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>The genetic landscape of a cell.</p></title><aug><au><snm>Costanzo</snm><fnm>M</fnm></au><au><snm>Baryshnikova</snm><fnm>A</fnm></au><au><snm>Bellay</snm><fnm>J</fnm></au><au><snm>Kim</snm><fnm>Y</fnm></au><au><snm>Spear</snm><fnm>ED</fnm></au><au><snm>Sevier</snm><fnm>CS</fnm></au><au><snm>Ding</snm><fnm>H</fnm></au><au><snm>Koh</snm><fnm>JL</fnm></au><au><snm>Toufighi</snm><fnm>K</fnm></au><au><snm>Mostafavi</snm><fnm>S</fnm></au><au><snm>Prinz</snm><fnm>J</fnm></au><au><snm>St Onge</snm><fnm>RP</fnm></au><au><snm>VanderSluis</snm><fnm>B</fnm></au><au><snm>Makhnevych</snm><fnm>T</fnm></au><au><snm>Vizeacoumar</snm><fnm>FJ</fnm></au><au><snm>Alizadeh</snm><fnm>S</fnm></au><au><snm>Bahr</snm><fnm>S</fnm></au><au><snm>Brost</snm><fnm>RL</fnm></au><au><snm>Chen</snm><fnm>Y</fnm></au><au><snm>Cokol</snm><fnm>M</fnm></au><au><snm>Deshpande</snm><fnm>R</fnm></au><au><snm>Li</snm><fnm>Z</fnm></au><au><snm>Lin</snm><fnm>Z</fnm></au><au><snm>Liang</snm><fnm>W</fnm></au><au><snm>Marback</snm><fnm>M</fnm></au><au><snm>Paw</snm><fnm>J</fnm></au><au><snm>San Luis</snm><fnm>B</fnm></au><au><snm>Shuteriqi</snm><fnm>E</fnm></au><au><snm>Tong</snm><fnm>AHY</fnm></au><au><snm>van 
Dyk</snm><fnm>N</fnm></au><etal/></aug><source>Science</source><pubdate>2010</pubdate><volume>327</volume><fpage>425</fpage><lpage>431</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1180823</pubid><pubid idtype="pmpid" link="fulltext">20093466</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Network hubs buffer environmental variation in <it>Saccharomyces cerevisiae</it>.</p></title><aug><au><snm>Levy</snm><fnm>SF</fnm></au><au><snm>Siegal</snm><fnm>ML</fnm></au></aug><source>PLoS Biol</source><pubdate>2008</pubdate><volume>6</volume><fpage>e264</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0060264</pubid><pubid idtype="pmpid">18986213</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>The chemical genomic portrait of yeast: uncovering a phenotype for all genes.</p></title><aug><au><snm>Hillenmeyer</snm><fnm>ME</fnm></au><au><snm>Fung</snm><fnm>E</fnm></au><au><snm>Wildenhain</snm><fnm>J</fnm></au><au><snm>Pierce</snm><fnm>SE</fnm></au><au><snm>Hoon</snm><fnm>S</fnm></au><au><snm>Lee</snm><fnm>W</fnm></au><au><snm>Proctor</snm><fnm>M</fnm></au><au><snm>St Onge</snm><fnm>RP</fnm></au><au><snm>Tyers</snm><fnm>M</fnm></au><au><snm>Koller</snm><fnm>D</fnm></au><au><snm>Altman</snm><fnm>RB</fnm></au><au><snm>Davis</snm><fnm>RW</fnm></au><au><snm>Nislow</snm><fnm>C</fnm></au><au><snm>Giaever</snm><fnm>G</fnm></au></aug><source>Science</source><pubdate>2008</pubdate><volume>320</volume><fpage>362</fpage><lpage>365</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1150021</pubid><pubid idtype="pmcid">2794835</pubid><pubid idtype="pmpid">18420932</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Integrated assessment of genomic correlates of protein evolutionary rate.</p></title><aug><au><snm>Xia</snm><fnm>Y</fnm></au><au><snm>Franzosa</snm><fnm>EA</fnm></au><au><snm>Gerstein</snm><fnm>MB</fnm></au></aug><source>PLoS Comput 
Biol</source><pubdate>2009</pubdate><volume>5</volume><fpage>e1000413</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pcbi.1000413</pubid><pubid idtype="pmcid">2688033</pubid><pubid idtype="pmpid">19521505</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions.</p></title><aug><au><snm>Chen</snm><fnm>JW</fnm></au><au><snm>Romero</snm><fnm>P</fnm></au><au><snm>Uversky</snm><fnm>VN</fnm></au><au><snm>Dunker</snm><fnm>AK</fnm></au></aug><source>J Proteome Res</source><pubdate>2006</pubdate><volume>5</volume><fpage>879</fpage><lpage>887</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/pr060048x</pubid><pubid idtype="pmcid">2543136</pubid><pubid idtype="pmpid">16602695</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>Conservation of intrinsic disorder in protein domains and families: II. Functions of conserved disorder.</p></title><aug><au><snm>Chen</snm><fnm>JW</fnm></au><au><snm>Romero</snm><fnm>P</fnm></au><au><snm>Uversky</snm><fnm>VN</fnm></au><au><snm>Dunker</snm><fnm>AK</fnm></au></aug><source>J Proteome Res</source><pubdate>2006</pubdate><volume>5</volume><fpage>888</fpage><lpage>898</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/pr060049p</pubid><pubid idtype="pmcid">2533134</pubid><pubid idtype="pmpid">16602696</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Assessment of disorder predictions in CASP8.</p></title><aug><au><snm>Noivirt-Brik</snm><fnm>O</fnm></au><au><snm>Prilusky</snm><fnm>J</fnm></au><au><snm>Sussman</snm><fnm>JL</fnm></au></aug><source>Proteins</source><pubdate>2009</pubdate><volume>77</volume><fpage>210</fpage><lpage>216</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/prot.22586</pubid><pubid idtype="pmpid" link="fulltext">19774619</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>Protein disorder prediction: implications for structural 
proteomics.</p></title><aug><au><snm>Linding</snm><fnm>R</fnm></au><au><snm>Jensen</snm><fnm>LJ</fnm></au><au><snm>Diella</snm><fnm>F</fnm></au><au><snm>Bork</snm><fnm>P</fnm></au><au><snm>Gibson</snm><fnm>TJ</fnm></au><au><snm>Russell</snm><fnm>RB</fnm></au></aug><source>Structure</source><pubdate>2003</pubdate><volume>11</volume><fpage>1453</fpage><lpage>1459</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.str.2003.10.002</pubid><pubid idtype="pmpid" link="fulltext">14604535</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>Protein secondary structure appears to be robust under <it>in silico </it>evolution while protein disorder appears not to be.</p></title><aug><au><snm>Schaefer</snm><fnm>C</fnm></au><au><snm>Schlessinger</snm><fnm>A</fnm></au><au><snm>Rost</snm><fnm>B</fnm></au></aug><source>Bioinformatics</source><pubdate>2010</pubdate><volume>26</volume><fpage>625</fpage><lpage>631</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btq012</pubid><pubid idtype="pmcid">2828120</pubid><pubid idtype="pmpid">20081223</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>Evidence for dynamically organized modularity in the yeast protein-protein interaction network.</p></title><aug><au><snm>Han</snm><fnm>JJ</fnm></au><au><snm>Bertin</snm><fnm>N</fnm></au><au><snm>Hao</snm><fnm>T</fnm></au><au><snm>Goldberg</snm><fnm>DS</fnm></au><au><snm>Berriz</snm><fnm>GF</fnm></au><au><snm>Zhang</snm><fnm>LV</fnm></au><au><snm>Dupuy</snm><fnm>D</fnm></au><au><snm>Walhout</snm><fnm>AJM</fnm></au><au><snm>Cusick</snm><fnm>ME</fnm></au><au><snm>Roth</snm><fnm>FP</fnm></au><au><snm>Vidal</snm><fnm>M</fnm></au></aug><source>Nature</source><pubdate>2004</pubdate><volume>430</volume><fpage>88</fpage><lpage>93</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature02555</pubid><pubid idtype="pmpid" link="fulltext">15190252</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>The importance of intrinsic disorder for protein 
phosphorylation.</p></title><aug><au><snm>Iakoucheva</snm><fnm>LM</fnm></au><au><snm>Radivojac</snm><fnm>P</fnm></au><au><snm>Brown</snm><fnm>CJ</fnm></au><au><snm>O&apos;Connor</snm><fnm>TR</fnm></au><au><snm>Sikes</snm><fnm>JG</fnm></au><au><snm>Obradovic</snm><fnm>Z</fnm></au><au><snm>Dunker</snm><fnm>AK</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2004</pubdate><volume>32</volume><fpage>1037</fpage><lpage>1049</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkh253</pubid><pubid idtype="pmcid">373391</pubid><pubid idtype="pmpid">14960716</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Relating three-dimensional structures to protein networks provides evolutionary insights.</p></title><aug><au><snm>Kim</snm><fnm>PM</fnm></au><au><snm>Lu</snm><fnm>LJ</fnm></au><au><snm>Xia</snm><fnm>Y</fnm></au><au><snm>Gerstein</snm><fnm>MB</fnm></au></aug><source>Science</source><pubdate>2006</pubdate><volume>314</volume><fpage>1938</fpage><lpage>1941</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1136174</pubid><pubid idtype="pmpid" link="fulltext">17185604</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>A sliding docking interaction is essential for sequential and processive phosphorylation of an SR protein by SRPK1.</p></title><aug><au><snm>Ngo</snm><fnm>JCK</fnm></au><au><snm>Giang</snm><fnm>K</fnm></au><au><snm>Chakrabarti</snm><fnm>S</fnm></au><au><snm>Ma</snm><fnm>C</fnm></au><au><snm>Huynh</snm><fnm>N</fnm></au><au><snm>Hagopian</snm><fnm>JC</fnm></au><au><snm>Dorrestein</snm><fnm>PC</fnm></au><au><snm>Fu</snm><fnm>X</fnm></au><au><snm>Adams</snm><fnm>JA</fnm></au><au><snm>Ghosh</snm><fnm>G</fnm></au></aug><source>Mol Cell</source><pubdate>2008</pubdate><volume>29</volume><fpage>563</fpage><lpage>576</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.molcel.2007.12.017</pubid><pubid idtype="pmcid">2852395</pubid><pubid idtype="pmpid">18342604</pubid></pubidlist></xrefbib></bibl><bibl 
id="B22"><title><p>Activation of the Bur1-Bur2 cyclin-dependent kinase complex by Cak1.</p></title><aug><au><snm>Yao</snm><fnm>S</fnm></au><au><snm>Prelich</snm><fnm>G</fnm></au></aug><source>Mol Cell Biol</source><pubdate>2002</pubdate><volume>22</volume><fpage>6750</fpage><lpage>6758</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1128/MCB.22.19.6750-6758.2002</pubid><pubid idtype="pmcid">134033</pubid><pubid idtype="pmpid">12215532</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Mutual induced fit binding of <it>Xenopus </it>ribosomal protein L5 to 5S rRNA.</p></title><aug><au><snm>DiNitto</snm><fnm>JP</fnm></au><au><snm>Huber</snm><fnm>PW</fnm></au></aug><source>J Mol Biol</source><pubdate>2003</pubdate><volume>330</volume><fpage>979</fpage><lpage>992</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0022-2836(03)00685-5</pubid><pubid idtype="pmpid" link="fulltext">12860121</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>The role of structural disorder in the function of RNA and protein chaperones.</p></title><aug><au><snm>Tompa</snm><fnm>P</fnm></au><au><snm>Csermely</snm><fnm>P</fnm></au></aug><source>FASEB J</source><pubdate>2004</pubdate><volume>18</volume><fpage>1169</fpage><lpage>1175</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1096/fj.04-1584rev</pubid><pubid idtype="pmpid" link="fulltext">15284216</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>Molecular simulations of protein disorder.</p></title><aug><au><snm>Rauscher</snm><fnm>S</fnm></au><au><snm>Pom&#232;s</snm><fnm>R</fnm></au></aug><source>Biochem Cell Biol</source><pubdate>2010</pubdate><volume>88</volume><fpage>269</fpage><lpage>290</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1139/O09-169</pubid><pubid idtype="pmpid" link="fulltext">20453929</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>The interplay between structure and function in intrinsically unstructured 
proteins.</p></title><aug><au><snm>Tompa</snm><fnm>P</fnm></au></aug><source>FEBS Lett</source><pubdate>2005</pubdate><volume>579</volume><fpage>3346</fpage><lpage>3354</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.febslet.2005.03.072</pubid><pubid idtype="pmpid" link="fulltext">15943980</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Flavors of protein disorder.</p></title><aug><au><snm>Vucetic</snm><fnm>S</fnm></au><au><snm>Brown</snm><fnm>CJ</fnm></au><au><snm>Dunker</snm><fnm>AK</fnm></au><au><snm>Obradovic</snm><fnm>Z</fnm></au></aug><source>Proteins</source><pubdate>2003</pubdate><volume>52</volume><fpage>573</fpage><lpage>584</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/prot.10437</pubid><pubid idtype="pmpid" link="fulltext">12910457</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>Linking folding and binding.</p></title><aug><au><snm>Wright</snm><fnm>PE</fnm></au><au><snm>Dyson</snm><fnm>HJ</fnm></au></aug><source>Curr Opin Struct Biol</source><pubdate>2009</pubdate><volume>19</volume><fpage>31</fpage><lpage>38</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.sbi.2008.12.003</pubid><pubid idtype="pmcid">2675572</pubid><pubid idtype="pmpid">19157855</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>Evolution of characterized phosphorylation sites in budding yeast.</p></title><aug><au><snm>Nguyen Ba</snm><fnm>AN</fnm></au><au><snm>Moses</snm><fnm>AM</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2010</pubdate><volume>27</volume><fpage>2027</fpage><lpage>2037</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msq090</pubid><pubid idtype="pmpid" link="fulltext">20368267</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Dissection of the ATP-induced conformational cycle of the molecular chaperone Hsp90.</p></title><aug><au><snm>Hessling</snm><fnm>M</fnm></au><au><snm>Richter</snm><fnm>K</fnm></au><au><snm>Buchner</snm><fnm>J</fnm></au></aug><source>Nat Struct Mol 
Biol</source><pubdate>2009</pubdate><volume>16</volume><fpage>287</fpage><lpage>293</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nsmb.1565</pubid><pubid idtype="pmpid" link="fulltext">19234467</pubid></pubidlist></xrefbib></bibl><bibl id="B31"><title><p>Hydrophilic residues 526 KNDAAD 531 in the flexible C-terminal region of the chaperonin GroEL are critical for substrate protein folding within the central cavity.</p></title><aug><au><snm>Machida</snm><fnm>K</fnm></au><au><snm>Kono-Okada</snm><fnm>A</fnm></au><au><snm>Hongo</snm><fnm>K</fnm></au><au><snm>Mizobata</snm><fnm>T</fnm></au><au><snm>Kawata</snm><fnm>Y</fnm></au></aug><source>J Biol Chem</source><pubdate>2008</pubdate><volume>283</volume><fpage>6886</fpage><lpage>6896</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1074/jbc.M708002200</pubid><pubid idtype="pmpid" link="fulltext">18184659</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Intrinsically disordered chaperones in plants and animals.</p></title><aug><au><snm>Tompa</snm><fnm>P</fnm></au><au><snm>Kovacs</snm><fnm>D</fnm></au></aug><source>Biochem Cell Biol</source><pubdate>2010</pubdate><volume>88</volume><fpage>167</fpage><lpage>174</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1139/O09-163</pubid><pubid idtype="pmpid" link="fulltext">20453919</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><aug><au><snm>Casella</snm><fnm>G</fnm></au><au><snm>Berger</snm><fnm>RL</fnm></au></aug><source>Statistical Inference</source><publisher>Thomson Learning</publisher><pubdate>2002</pubdate></bibl><bibl id="B34"><title><p>MUSCLE: a multiple sequence alignment method with reduced time and space complexity.</p></title><aug><au><snm>Edgar</snm><fnm>R</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2004</pubdate><volume>5</volume><fpage>113</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-5-113</pubid><pubid idtype="pmcid">517706</pubid><pubid 
idtype="pmpid">15318951</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>PAML 4: Phylogenetic Analysis by Maximum Likelihood.</p></title><aug><au><snm>Yang</snm><fnm>Z</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2007</pubdate><volume>24</volume><fpage>1586</fpage><lpage>1591</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msm088</pubid><pubid idtype="pmpid" link="fulltext">17483113</pubid></pubidlist></xrefbib></bibl><bibl id="B36"><title><p>Dissecting the regulatory circuitry of a eukaryotic genome.</p></title><aug><au><snm>Holstege</snm><fnm>FC</fnm></au><au><snm>Jennings</snm><fnm>EG</fnm></au><au><snm>Wyrick</snm><fnm>JJ</fnm></au><au><snm>Lee</snm><fnm>TI</fnm></au><au><snm>Hengartner</snm><fnm>CJ</fnm></au><au><snm>Green</snm><fnm>MR</fnm></au><au><snm>Golub</snm><fnm>TR</fnm></au><au><snm>Lander</snm><fnm>ES</fnm></au><au><snm>Young</snm><fnm>RA</fnm></au></aug><source>Cell</source><pubdate>1998</pubdate><volume>95</volume><fpage>717</fpage><lpage>728</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0092-8674(00)81641-4</pubid><pubid idtype="pmpid" link="fulltext">9845373</pubid></pubidlist></xrefbib></bibl><bibl id="B37"><title><p>Precision and functional specificity in mRNA decay.</p></title><aug><au><snm>Wang</snm><fnm>Y</fnm></au><au><snm>Liu</snm><fnm>CL</fnm></au><au><snm>Storey</snm><fnm>JD</fnm></au><au><snm>Tibshirani</snm><fnm>RJ</fnm></au><au><snm>Herschlag</snm><fnm>D</fnm></au><au><snm>Brown</snm><fnm>PO</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2002</pubdate><volume>99</volume><fpage>5860</fpage><lpage>5865</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.092538799</pubid><pubid idtype="pmcid">122867</pubid><pubid idtype="pmpid">11972065</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>Network hubs buffer environmental variation in <it>Saccharomyces 
cerevisiae</it>.</p></title><aug><au><snm>Levy</snm><fnm>SF</fnm></au><au><snm>Siegal</snm><fnm>ML</fnm></au></aug><source>PLoS Biol</source><pubdate>2008</pubdate><volume>6</volume><fpage>e264</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0060264</pubid><pubid idtype="pmcid">2577700</pubid><pubid idtype="pmpid" link="fulltext">18986213</pubid></pubidlist></xrefbib></bibl><bibl id="B39"><title><p>Finding function: evaluation methods for functional genomic data.</p></title><aug><au><snm>Myers</snm><fnm>C</fnm></au><au><snm>Barrett</snm><fnm>D</fnm></au><au><snm>Hibbs</snm><fnm>M</fnm></au><au><snm>Huttenhower</snm><fnm>C</fnm></au><au><snm>Troyanskaya</snm><fnm>O</fnm></au></aug><source>BMC Genomics</source><pubdate>2006</pubdate><volume>7</volume><fpage>187</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-7-187</pubid><pubid idtype="pmcid">1560386</pubid><pubid idtype="pmpid">16869964</pubid></pubidlist></xrefbib></bibl><bibl id="B40"><title><p>A scalable method for integration and functional analysis of multiple microarray datasets.</p></title><aug><au><snm>Huttenhower</snm><fnm>C</fnm></au><au><snm>Hibbs</snm><fnm>M</fnm></au><au><snm>Myers</snm><fnm>C</fnm></au><au><snm>Troyanskaya</snm><fnm>OG</fnm></au></aug><source>Bioinformatics</source><pubdate>2006</pubdate><volume>22</volume><fpage>2890</fpage><lpage>2897</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btl492</pubid><pubid idtype="pmpid" link="fulltext">17005538</pubid></pubidlist></xrefbib></bibl><bibl id="B41"><title><p>Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs.</p></title><aug><au><snm>Obenauer</snm><fnm>JC</fnm></au><au><snm>Cantley</snm><fnm>LC</fnm></au><au><snm>Yaffe</snm><fnm>MB</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2003</pubdate><volume>31</volume><fpage>3635</fpage><lpage>3641</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkg584</pubid><pubid 
idtype="pmcid">168990</pubid><pubid idtype="pmpid">12824383</pubid></pubidlist></xrefbib></bibl><bibl id="B42"><title><p>Orthogroup Repository Documentation: The Synergy Algorithm</p></title><url>http://www.broadinstitute.org/regev/orthogroups/documentation.html#synergy</url></bibl><bibl id="B43"><title><p>MAFFT version 5: improvement in accuracy of multiple sequence alignment.</p></title><aug><au><snm>Katoh</snm><fnm>K</fnm></au><au><snm>Kuma</snm><fnm>K</fnm></au><au><snm>Toh</snm><fnm>H</fnm></au><au><snm>Miyata</snm><fnm>T</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2005</pubdate><volume>33</volume><fpage>511</fpage><lpage>518</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gki198</pubid><pubid idtype="pmcid">548345</pubid><pubid idtype="pmpid">15661851</pubid></pubidlist></xrefbib></bibl><bibl id="B44"><title><p>Jalview Version 2 - a multiple sequence alignment editor and analysis workbench.</p></title><aug><au><snm>Waterhouse</snm><fnm>AM</fnm></au><au><snm>Procter</snm><fnm>JB</fnm></au><au><snm>Martin</snm><fnm>DMA</fnm></au><au><snm>Clamp</snm><fnm>M</fnm></au><au><snm>Barton</snm><fnm>GJ</fnm></au></aug><source>Bioinformatics</source><pubdate>2009</pubdate><volume>25</volume><fpage>1189</fpage><lpage>1191</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btp033</pubid><pubid idtype="pmcid">2672624</pubid><pubid idtype="pmpid">19151095</pubid></pubidlist></xrefbib></bibl><bibl id="B45"><title><p>Alignments of an ortholog group overlaid with Disopred2 prediction</p></title><url>http://kimlab1.ccbr.utoronto.ca/cgi-bin/disoQuery.pl</url></bibl><bibl id="B46"><title><p>Evolution of phosphoregulation: comparison of phosphorylation patterns across yeast 
species.</p></title><aug><au><snm>Beltrao</snm><fnm>P</fnm></au><au><snm>Trinidad</snm><fnm>JC</fnm></au><au><snm>Fiedler</snm><fnm>D</fnm></au><au><snm>Roguev</snm><fnm>A</fnm></au><au><snm>Lim</snm><fnm>WA</fnm></au><au><snm>Shokat</snm><fnm>KM</fnm></au><au><snm>Burlingame</snm><fnm>AL</fnm></au><au><snm>Krogan</snm><fnm>NJ</fnm></au></aug><source>PLoS Biol</source><pubdate>2009</pubdate><volume>7</volume><fpage>e1000134</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.1000134</pubid><pubid idtype="pmcid">2691599</pubid><pubid idtype="pmpid">19547744</pubid></pubidlist></xrefbib></bibl><bibl id="B47"><title><p>Quantitative phosphoproteomics applied to the yeast pheromone signaling pathway.</p></title><aug><au><snm>Gruhler</snm><fnm>A</fnm></au><au><snm>Olsen</snm><fnm>JV</fnm></au><au><snm>Mohammed</snm><fnm>S</fnm></au><au><snm>Mortensen</snm><fnm>P</fnm></au><au><snm>F&#230;rgeman</snm><fnm>NJ</fnm></au><au><snm>Mann</snm><fnm>M</fnm></au><au><snm>Jensen</snm><fnm>ON</fnm></au></aug><source>Mol Cell Proteomics</source><pubdate>2005</pubdate><volume>4</volume><fpage>310</fpage><lpage>327</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1074/mcp.M400219-MCP200</pubid><pubid idtype="pmpid" link="fulltext">15665377</pubid></pubidlist></xrefbib></bibl><bibl id="B48"><title><p>Large-scale phosphorylation analysis of &#945;-Factor-arrested <it>Saccharomyces cerevisiae</it>.</p></title><aug><au><snm>Li</snm><fnm>X</fnm></au><au><snm>Gerber</snm><fnm>SA</fnm></au><au><snm>Rudner</snm><fnm>AD</fnm></au><au><snm>Beausoleil</snm><fnm>SA</fnm></au><au><snm>Haas</snm><fnm>W</fnm></au><au><snm>Vill&#233;n</snm><fnm>J</fnm></au><au><snm>Elias</snm><fnm>JE</fnm></au><au><snm>Gygi</snm><fnm>SP</fnm></au></aug><source>J Proteome Res</source><pubdate>2007</pubdate><volume>6</volume><fpage>1190</fpage><lpage>1197</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1021/pr060559j</pubid><pubid idtype="pmpid" 
link="fulltext">17330950</pubid></pubidlist></xrefbib></bibl><bibl id="B49"><title><p>Analysis of phosphorylation sites on proteins from <it>Saccharomyces cerevisiae </it>by electron transfer dissociation (ETD) mass spectrometry.</p></title><aug><au><snm>Chi</snm><fnm>A</fnm></au><au><snm>Huttenhower</snm><fnm>C</fnm></au><au><snm>Geer</snm><fnm>LY</fnm></au><au><snm>Coon</snm><fnm>JJ</fnm></au><au><snm>Syka</snm><fnm>JEP</fnm></au><au><snm>Bai</snm><fnm>DL</fnm></au><au><snm>Shabanowitz</snm><fnm>J</fnm></au><au><snm>Burke</snm><fnm>DJ</fnm></au><au><snm>Troyanskaya</snm><fnm>OG</fnm></au><au><snm>Hunt</snm><fnm>DF</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2007</pubdate><volume>104</volume><fpage>2193</fpage><lpage>2198</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0607084104</pubid><pubid idtype="pmcid">1892997</pubid><pubid idtype="pmpid">17287358</pubid></pubidlist></xrefbib></bibl><bibl id="B50"><title><p>A multidimensional chromatography technology for in-depth phosphoproteome analysis.</p></title><aug><au><snm>Albuquerque</snm><fnm>CP</fnm></au><au><snm>Smolka</snm><fnm>MB</fnm></au><au><snm>Payne</snm><fnm>SH</fnm></au><au><snm>Bafna</snm><fnm>V</fnm></au><au><snm>Eng</snm><fnm>J</fnm></au><au><snm>Zhou</snm><fnm>H</fnm></au></aug><source>Mol Cell Proteomics</source><pubdate>2008</pubdate><volume>7</volume><fpage>1389</fpage><lpage>1396</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1074/mcp.M700468-MCP200</pubid><pubid idtype="pmcid">2493382</pubid><pubid idtype="pmpid">18407956</pubid></pubidlist></xrefbib></bibl><bibl id="B51"><title><p>Phosphoproteome analysis by mass spectrometry and its application to <it>Saccharomyces 
cerevisiae</it>.</p></title><aug><au><snm>Ficarro</snm><fnm>SB</fnm></au><au><snm>McCleland</snm><fnm>ML</fnm></au><au><snm>Stukenberg</snm><fnm>PT</fnm></au><au><snm>Burke</snm><fnm>DJ</fnm></au><au><snm>Ross</snm><fnm>MM</fnm></au><au><snm>Shabanowitz</snm><fnm>J</fnm></au><au><snm>Hunt</snm><fnm>DF</fnm></au><au><snm>White</snm><fnm>FM</fnm></au></aug><source>Nat Biotechnol</source><pubdate>2002</pubdate><volume>20</volume><fpage>301</fpage><lpage>305</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nbt0302-301</pubid><pubid idtype="pmpid" link="fulltext">11875433</pubid></pubidlist></xrefbib></bibl><bibl id="B52"><title><p>R: A Language and Environment for Statistical Computing</p></title><aug><au><cnm>R Development Core Team</cnm></au></aug><url>http://www.r-project.org/</url></bibl><bibl id="B53"><title><p>Partial Correlation</p></title><url>http://www.yilab.gatech.edu/pcor.html</url></bibl><bibl id="B54"><title><p><it>i</it>Pfam: the Protein Domain Interactions Database</p></title><url>http://ipfam.sanger.ac.uk/</url></bibl><bibl id="B55"><title><p>Protein Data Bank</p></title><url>http://www.pdb.org</url></bibl><bibl id="B56"><title><p>BioGRID</p></title><url>http://www.thebiogrid.org</url></bibl><bibl id="B57"><title><p>Cytoscape: a software environment for integrated models of biomolecular interaction networks.</p></title><aug><au><snm>Shannon</snm><fnm>P</fnm></au><au><snm>Markiel</snm><fnm>A</fnm></au><au><snm>Ozier</snm><fnm>O</fnm></au><au><snm>Baliga</snm><fnm>NS</fnm></au><au><snm>Wang</snm><fnm>JT</fnm></au><au><snm>Ramage</snm><fnm>D</fnm></au><au><snm>Amin</snm><fnm>N</fnm></au><au><snm>Schwikowski</snm><fnm>B</fnm></au><au><snm>Ideker</snm><fnm>T</fnm></au></aug><source>Genome Res</source><pubdate>2003</pubdate><volume>13</volume><fpage>2498</fpage><lpage>2504</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.1239303</pubid><pubid idtype="pmcid">403769</pubid><pubid idtype="pmpid">14597658</pubid></pubidlist></xrefbib></bibl><bibl 
id="B58"><title><p>Pathway analysis of dilated cardiomyopathy using global proteomic profiling and enrichment maps.</p></title><aug><au><snm>Isserlin</snm><fnm>R</fnm></au><au><snm>Merico</snm><fnm>D</fnm></au><au><snm>Alikhani-Koupaei</snm><fnm>R</fnm></au><au><snm>Gramolini</snm><fnm>A</fnm></au><au><snm>Bader</snm><fnm>GD</fnm></au><au><snm>Emili</snm><fnm>A</fnm></au></aug><source>Proteomics</source><pubdate>2010</pubdate><volume>10</volume><fpage>1316</fpage><lpage>1327</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/pmic.200900412</pubid><pubid idtype="pmcid">2879143</pubid><pubid idtype="pmpid">20127684</pubid></pubidlist></xrefbib></bibl><bibl id="B59"><title><p>Proteome survey reveals modularity of the yeast cell machinery.</p></title><aug><au><snm>Gavin</snm><fnm>A</fnm></au><au><snm>Aloy</snm><fnm>P</fnm></au><au><snm>Grandi</snm><fnm>P</fnm></au><au><snm>Krause</snm><fnm>R</fnm></au><au><snm>Boesche</snm><fnm>M</fnm></au><au><snm>Marzioch</snm><fnm>M</fnm></au><au><snm>Rau</snm><fnm>C</fnm></au><au><snm>Jensen</snm><fnm>LJ</fnm></au><au><snm>Bastuck</snm><fnm>S</fnm></au><au><snm>Dumpelfeld</snm><fnm>B</fnm></au><au><snm>Edelmann</snm><fnm>A</fnm></au><au><snm>Heurtier</snm><fnm>M</fnm></au><au><snm>Hoffman</snm><fnm>V</fnm></au><au><snm>Hoefert</snm><fnm>C</fnm></au><au><snm>Klein</snm><fnm>K</fnm></au><au><snm>Hudak</snm><fnm>M</fnm></au><au><snm>Michon</snm><fnm>A</fnm></au><au><snm>Schelder</snm><fnm>M</fnm></au><au><snm>Schirle</snm><fnm>M</fnm></au><au><snm>Remor</snm><fnm>M</fnm></au><au><snm>Rudi</snm><fnm>T</fnm></au><au><snm>Hooper</snm><fnm>S</fnm></au><au><snm>Bauer</snm><fnm>A</fnm></au><au><snm>Bouwmeester</snm><fnm>T</fnm></au><au><snm>Casari</snm><fnm>G</fnm></au><au><snm>Drewes</snm><fnm>G</fnm></au><au><snm>Neubauer</snm><fnm>G</fnm></au><au><snm>Rick</snm><fnm>JM</fnm></au><au><snm>Kuster</snm><fnm>B</fnm></au><au><snm>Bork</snm><fnm>P</fnm></au><etal/></aug><source>Nature</source><pubdate>2006</pubdate><volume>440</volume>
<fpage>631</fpage><lpage>636</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature04532</pubid><pubid idtype="pmpid" link="fulltext">16429126</pubid></pubidlist></xrefbib></bibl><bibl id="B60"><title><p>Global landscape of protein complexes in the yeast <it>Saccharomyces cerevisiae</it>.</p></title><aug><au><snm>Krogan</snm><fnm>NJ</fnm></au><au><snm>Cagney</snm><fnm>G</fnm></au><au><snm>Yu</snm><fnm>H</fnm></au><au><snm>Zhong</snm><fnm>G</fnm></au><au><snm>Guo</snm><fnm>X</fnm></au><au><snm>Ignatchenko</snm><fnm>A</fnm></au><au><snm>Li</snm><fnm>J</fnm></au><au><snm>Pu</snm><fnm>S</fnm></au><au><snm>Datta</snm><fnm>N</fnm></au><au><snm>Tikuisis</snm><fnm>AP</fnm></au><au><snm>Punna</snm><fnm>T</fnm></au><au><snm>Peregr&#237;n-Alvarez</snm><fnm>JM</fnm></au><au><snm>Shales</snm><fnm>M</fnm></au><au><snm>Zhang</snm><fnm>X</fnm></au><au><snm>Davey</snm><fnm>M</fnm></au><au><snm>Robinson</snm><fnm>MD</fnm></au><au><snm>Paccanaro</snm><fnm>A</fnm></au><au><snm>Bray</snm><fnm>JE</fnm></au><au><snm>Sheung</snm><fnm>A</fnm></au><au><snm>Beattie</snm><fnm>B</fnm></au><au><snm>Richards</snm><fnm>DP</fnm></au><au><snm>Canadien</snm><fnm>V</fnm></au><au><snm>Lalev</snm><fnm>A</fnm></au><au><snm>Mena</snm><fnm>F</fnm></au><au><snm>Wong</snm><fnm>P</fnm></au><au><snm>Starostine</snm><fnm>A</fnm></au><au><snm>Canete</snm><fnm>MM</fnm></au><au><snm>Vlasblom</snm><fnm>J</fnm></au><au><snm>Wu</snm><fnm>S</fnm></au><au><snm>Orsi</snm><fnm>C</fnm></au><etal/></aug><source>Nature</source><pubdate>2006</pubdate><volume>440</volume><fpage>637</fpage><lpage>643</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature04670</pubid><pubid idtype="pmpid" link="fulltext">16554755</pubid></pubidlist></xrefbib></bibl><bibl id="B61"><title><p>High-quality binary protein interaction map of the yeast interactome 
network.</p></title><aug><au><snm>Yu</snm><fnm>H</fnm></au><au><snm>Braun</snm><fnm>P</fnm></au><au><snm>Yildirim</snm><fnm>MA</fnm></au><au><snm>Lemmens</snm><fnm>I</fnm></au><au><snm>Venkatesan</snm><fnm>K</fnm></au><au><snm>Sahalie</snm><fnm>J</fnm></au><au><snm>Hirozane-Kishikawa</snm><fnm>T</fnm></au><au><snm>Gebreab</snm><fnm>F</fnm></au><au><snm>Li</snm><fnm>N</fnm></au><au><snm>Simonis</snm><fnm>N</fnm></au><au><snm>Hao</snm><fnm>T</fnm></au><au><snm>Rual</snm><fnm>J</fnm></au><au><snm>Dricot</snm><fnm>A</fnm></au><au><snm>Vazquez</snm><fnm>A</fnm></au><au><snm>Murray</snm><fnm>RR</fnm></au><au><snm>Simon</snm><fnm>C</fnm></au><au><snm>Tardivo</snm><fnm>L</fnm></au><au><snm>Tam</snm><fnm>S</fnm></au><au><snm>Svrzikapa</snm><fnm>N</fnm></au><au><snm>Fan</snm><fnm>C</fnm></au><au><snm>de Smet</snm><fnm>A</fnm></au><au><snm>Motyl</snm><fnm>A</fnm></au><au><snm>Hudson</snm><fnm>ME</fnm></au><au><snm>Park</snm><fnm>J</fnm></au><au><snm>Xin</snm><fnm>X</fnm></au><au><snm>Cusick</snm><fnm>ME</fnm></au><au><snm>Moore</snm><fnm>T</fnm></au><au><snm>Boone</snm><fnm>C</fnm></au><au><snm>Snyder</snm><fnm>M</fnm></au><au><snm>Roth</snm><fnm>FP</fnm></au><etal/></aug><source>Science</source><pubdate>2008</pubdate><volume>322</volume><fpage>104</fpage><lpage>110</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1158684</pubid><pubid idtype="pmcid">2746753</pubid><pubid idtype="pmpid">18719252</pubid></pubidlist></xrefbib></bibl><bibl id="B62"><title><p>The role of disorder in interaction networks: a structural analysis.</p></title><aug><au><snm>Kim</snm><fnm>PM</fnm></au><au><snm>Sboner</snm><fnm>A</fnm></au><au><snm>Xia</snm><fnm>Y</fnm></au><au><snm>Gerstein</snm><fnm>M</fnm></au></aug><source>Mol Syst Biol</source><pubdate>2008</pubdate><volume>4</volume><fpage>179</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/msb.2008.16</pubid><pubid idtype="pmcid">2290937</pubid><pubid idtype="pmpid">18364713</pubid></pubidlist></xrefbib></bibl></refgrp>
</bm></art>