<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2005-6-6-r55</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Software</dochead>
		<bibl>
			<title>
				<p>Refinement and prediction of protein prenylation motifs</p>
			</title>
			<aug>
				<au id="A1" ca="yes">
					<snm>Maurer-Stroh</snm>
					<fnm>Sebastian</fnm>
					<insr iid="I1"/>
					<email>stroh@imp.univie.ac.at</email>
				</au>
				<au id="A2">
					<snm>Eisenhaber</snm>
					<fnm>Frank</fnm>
					<insr iid="I1"/>
					<email>Frank.Eisenhaber@imp.univie.ac.at</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>IMP - Research Institute of Molecular Pathology, Dr. Bohr-Gasse 7, A-1030 Vienna, Austria</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2005</pubdate>
			<volume>6</volume>
			<issue>6</issue>
			<fpage>R55</fpage>
			<url>http://genomebiology.com/2005/6/6/R55</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">15960807</pubid><pubid idtype="doi">10.1186/gb-2005-6-6-r55</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>17</day>
					<month>1</month>
					<year>2005</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>22</day>
					<month>3</month>
					<year>2005</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>20</day>
					<month>4</month>
					<year>2005</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>27</day>
					<month>5</month>
					<year>2005</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>Maurer-Stroh and Eisenhaber; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<shorttitle>
			<p>Refinement and prediction of protein prenylation motifs</p>
		</shorttitle>
		<shortabs>
			<p>Three prenylation motif predictors are presented that allow discrimination between proteins that are unique substrates of farnesyltransferase (FT) and those that can be alternatively processed by geranylgeranyltransferase I (GGT1).</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<p>We refined the motifs for carboxy-terminal protein prenylation by analysis of known substrates for farnesyltransferase (FT), geranylgeranyltransferase I (GGT1) and geranylgeranyltransferase II (GGT2). In addition to the CaaX box for the first two enzymes, we identified a preceding linker region that appears constrained in physicochemical properties, requiring small or flexible, preferably hydrophilic, amino acids. Predictors were constructed on the basis of sequence and physical property profiles, including interpositional correlations, and are available as the Prenylation Prediction Suite (PrePS, <url>http://mendel.imp.univie.ac.at/sat/PrePS</url>), which also allows evaluation of evolutionary motif conservation. PrePS can predict partially overlapping substrate specificities, which is of medical importance for understanding the cellular action of FT inhibitors as anticancer and anti-parasite agents.</p>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Rationale</p>
			</st>
			<p>Prenylation refers to the posttranslational modification of proteins with isoprenyl anchors <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. These lipid moieties are typically involved in mediating not only protein-membrane but also protein-protein interactions. Three eukaryotic enzymes are known to catalyze the lipid transfer. The first two, farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1), recognize the so-called CaaX box in the carboxy termini of substrate proteins and attach farnesyl (15-carbon polyisoprene) or geranylgeranyl (20-carbon polyisoprene), respectively, to a required and spatially fixed cysteine in that motif. The third enzyme, geranylgeranyltransferase 2 (GGT2 or RabGGT) recognizes the complex <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> of Rab GTPase substrate proteins with a specific Rab escort protein (REP) to attach one or two geranylgeranyl anchors to cysteines in a more flexible but also carboxy-terminal motif.</p>
			<p>The CaaX box was initially understood to consist of a cysteine (C), followed by two aliphatic residues (aa) and a terminal residue (X) that would direct modification by either FT or GGT1, but newly found substrates and kinetic studies of mutated substrate peptides and enzyme inhibitors have shown that the motif recognized by the enzymes appears to be more flexible <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Furthermore, the determination of preference for FT or GGT1 is more complex and a function of the overall sequence context rather than specific amino acids at single positions. Whereas GGT2 appears to be specific to Rab GTPases as substrates, the recognition mechanism is not well understood. Overlapping substrate specificities between all three prenylating enzymes further complicate the understanding of the lipid modification process <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>.</p>
			<p>An unsolved problem so far is accounting for the complexity of the prenylation substrate recognition motifs in theoretical models in order to identify substrate proteins from their amino-acid sequence. No available method has been able to selectively assign the correct modifying enzyme, which determines the types and number of lipid anchors. The high probability of motifs similar to the small CaaX box occurring by chance is a general problem that has so far prohibited large-scale proteome analyses <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. We describe here a method that aims to model the substrate-enzyme interactions on the basis of refinement of the recognition motifs for each of the prenyltransferases. The Prenylation Prediction Suite (PrePS) selectively assigns the modifying enzyme to predicted substrate proteins and sensitively filters out false-positive predictions based on the general methodology that has already been applied successfully for the prediction of glycosylphosphatidylinositol (GPI) anchors <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, myristoylation <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and PTS1 peroxisomal targeting <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>.</p>
		</sec>
		<sec>
			<st>
				<p>Known substrates and their motif-compliant homologs as learning sets</p>
			</st>
			<p>The first task consists of collecting sequences that are known substrates for the respective enzymes. Typically, a good starting point is the Swiss-Prot database <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. However, given earlier experience with annotation inaccuracies <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, any annotated experimental evidence has to be confirmed by following up all the related literature sources. As newly available data can be missing in the Swiss-Prot annotation, the searches also have to be extended to non-Swiss-Prot proteins. In most cases, the annotations for prenylation in Swiss-Prot are assigned by similarity to only a few entries with experimental validation. A major concern is the annotation of the correct anchor type attached to FT and GGT1 substrates, which could previously be estimated only tentatively without experimental data. This includes several entries with overall sequence similarity to a verified prenylated protein but totally different carboxy-terminal motifs. Given that single mutations can abolish recognition or switch enzyme specificities <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> and that not all homologs of lipid-modified proteins necessarily have to share the same modification type or membrane attachment factor (MAF) <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, entries with annotations only by similarity should not be included in a learning set without critical consideration.</p>
			<p>Unfortunately, such justified concerns dramatically lower the amount of data in the learning set. However, because of earlier interest in developing peptide-based inhibitors of FT and GGT1 as anticancer treatments, the kinetics of both enzymes have already been measured for various tetrapeptide substrates that they modify with lipid anchors <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Hence, a protein homologous to a verified prenylated protein can be included in the learning set if its CaaX box has already been shown to interact productively with one of the prenyltransferases at least as a tetrapeptide.</p>
			<p>However, possession of a valid CaaX box might not be a sufficient selection criterion. Typically, short terminal sequence motifs are connected to the rest of the protein by a linker region that experiences only limited constraints on specific amino acids per position but often has a compositional bias towards small or hydrophilic amino acids in connecting sequence stretches <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. This property is found in a preliminary assembly of verified FT and GGT1 substrates and has been confirmed in the actual learning set for up to 11 residues upstream (amino-terminal) of the cysteine in the CaaX box (see below). Hence, learning-set sequences should also not violate the physicochemical properties constraining the sequence stretch amino-terminal to the CaaX box.</p>
			<p>Taking account of the considerations above, the following procedure has been applied to obtain conservative and reliable learning sets of FT and GGT1 substrates. First, search the literature for known prenylated proteins and valid tetrapeptides (see <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>). Second, run BLASTP <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> with an E-value threshold of 0.005, starting with known prenylated proteins, against the National Center for Biotechnology Information (NCBI) nonredundant database to find homologs, and cluster all collected sequences into groups of homologous proteins using the Markov cluster algorithm (MCL) <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Third, check the validity of all CaaX boxes against experimental evidence for at least tetrapeptides. Fourth, check compliance with the physical properties of the full motif (including linker) by applying a preliminary predictor that is based on corrected Swiss-Prot entries and built in a similar style to the one described here (penalizing deviations from the physical property landscape of the motif).</p>
			<p>This resulted in learning sets of 692 FT and 486 GGT1 substrates (see <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>). Among the FT substrates, 31 artificial constructs or mutations of naturally occurring sequences that have been shown to be processed by FT have also been included. Prenylation by GGT2 follows mechanistic requirements entirely different from those of FT and GGT1 and will be treated separately after the sections about CaaX prenylation.</p>
		</sec>
		<sec>
			<st>
				<p>Refinement of the CaaX box motif descriptions</p>
			</st>
			<p>Compositional analysis of residue frequencies at single motif positions reveals that major restrictions to specific amino acid types exist only for positions within the CaaX box (see sequence logos in Figure <figr fid="F1">1</figr>). The previously reported preferences for aliphatic residues at positions +1 and +2 (the aa in CaaX) were recovered, but there is a clear tendency for other residue types to also be allowed, especially at position +1 (the first a in CaaX). Correlation analysis of residue frequencies at single motif positions with amino-acid property scales <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp> can quantify the conservation of a physical property pattern (see Materials and methods). Although correlations higher than 0.6 can only be obtained for aliphatic property at position +2 (FT: 0.85, GGT1: 0.87), the average aliphatic property at position +1 within both FT and GGT1 learning sets still appears elevated when compared to an average calculated from the carboxy-termini of the nonredundant UniRef50 database <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> (see physical property profile in Figure <figr fid="F1">1</figr>). Similarly, there are correlations at position +2 and deviations from the UniRef50 average at position +1 for a property describing preference for extended conformations (see Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>). This appears to be best explained by the need to have the final peptide part in extended conformation rather than coiled or helical in order to fit into the binding pocket, as can be seen in the resolved structures of prenyltransferases with their substrate peptides <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
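			<p>As a rough illustration of the correlation analysis described above, the following Python sketch correlates the residue frequencies at one motif position with a property scale. The Pearson coefficient, the binary aliphatic indicator (a stand-in, not the ZVEL_ALI_1 scale) and all function names are assumptions made for this example. The input is simply the set of CaaX tetrapeptides quoted in this article, not the learning set.</p>

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical stand-in for an aliphatic property scale: 1 for I, L, V, else 0.
# This is NOT the ZVEL_ALI_1 scale used in the article.
TOY_ALIPHATIC = {aa: (1.0 if aa in "ILV" else 0.0) for aa in AMINO_ACIDS}


def position_frequencies(motifs, index):
    """Relative frequency of each of the 20 amino acids at one motif column."""
    counts = Counter(motif[index] for motif in motifs)
    total = len(motifs)
    return [counts.get(aa, 0) / total for aa in AMINO_ACIDS]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def property_correlation(motifs, index, scale):
    """Correlate residue frequencies at a column with a property scale."""
    freqs = position_frequencies(motifs, index)
    values = [scale[aa] for aa in AMINO_ACIDS]
    return pearson(freqs, values)


# CaaX boxes mentioned in the text; index 0 is the cysteine, index 2 is position +2.
CAAX_EXAMPLES = ["CVIL", "CVLL", "CAIL", "CCIL", "CVIA", "CKVL", "CVIF"]
corr_plus2 = property_correlation(CAAX_EXAMPLES, 2, TOY_ALIPHATIC)
print(round(corr_plus2, 2))
```

			<p>Index 2 addresses position +2 because the cysteine sits at index 0. The value printed for these seven tetrapeptides is not comparable to the correlations in Tables 1 and 2, which rest on the full learning sets and published property scales.</p>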
			<fig id="F1">
				<title>
					<p>Figure 1</p>
				</title>
				<caption>
					<p>Sequence logos [74] and physicochemical property profiles of FT and GGT1 substrates</p>
				</caption>
				<text>
					<p>Sequence logos [74] and physicochemical property profiles of FT and GGT1 substrates. Selected physical properties (hydrophilicity = KRIW790102; flexibility = KARP850103; size = CHOC760101; aliphatic = ZVEL_ALI_1; see Tables 1 and 2 for details) are calculated as averages over the nonredundant learning sets of FT and GGT1. The plotted lines correspond to the relative deviation of the respective properties from an average calculated over carboxy termini from the UniRef50 database [22].</p>
				</text>
				<graphic file="gb-2005-6-6-r55-1"/>
			</fig>
			<tbl id="T1">
				<title>
					<p>Table 1</p>
				</title>
				<caption>
					<p>Physical property terms in the FT scoring function</p>
				</caption>
				<tblbdy cols="4">
					<r>
						<c ca="left">
							<p>Property</p>
						</c>
						<c ca="left">
							<p>Position</p>
						</c>
						<c ca="left">
							<p>Rationale</p>
						</c>
						<c ca="left">
							<p>Explanation</p>
						</c>
					</r>
					<r>
						<c cspan="4">
							<hr/>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ARGP820103 [62]</p>
						</c>
						<c ca="left">
							<p>+3</p>
						</c>
						<c ca="left">
							<p>Corr = 0.7(nrLS)</p>
						</c>
						<c ca="left">
							<p>Membrane-buried preference, lipid contact when entering binding pocket</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>logPREN_CKQX_FT [15]</p>
						</c>
						<c ca="left">
							<p>+3</p>
						</c>
						<c ca="left">
							<p>Corr = -0.72(nrLS)</p>
						</c>
						<c ca="left">
							<p>Kinetic measurement, relative unprocessed FPP amounts with tetrapeptide CKQX</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>CHOC760101 [63]</p>
						</c>
						<c ca="left">
							<p>+1 to +3</p>
						</c>
						<c ca="left">
							<p>Fisher = 1.3</p>
						</c>
						<c ca="left">
							<p>Side chain volume</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ZVEL_CHARG [64]</p>
						</c>
						<c ca="left">
							<p>+1 to +3</p>
						</c>
						<c ca="left">
							<p>LS composition</p>
						</c>
						<c ca="left">
							<p>General charge penalty</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ZVEL_CHNEG [64]</p>
						</c>
						<c ca="left">
							<p>+1 to +3</p>
						</c>
						<c ca="left">
							<p>LS composition</p>
						</c>
						<c ca="left">
							<p>Special negative charge penalty</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>WERD780102 [65]</p>
						</c>
						<c ca="left">
							<p>+1 and +3</p>
						</c>
						<c ca="left">
							<p>Fisher = 1.51</p>
						</c>
						<c ca="left">
							<p>Hydrophobicity compensation for inside preference</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ZVEL_ALI_1 [64]</p>
						</c>
						<c ca="left">
							<p>+1 and +2</p>
						</c>
						<c ca="left">
							<p>+2: Corr = 0.85(prof)</p>
							<p>+1: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Amino-acid property: aliphatic</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>LIFS790102 [66]</p>
						</c>
						<c ca="left">
							<p>+1 and +2</p>
						</c>
						<c ca="left">
							<p>+2: Corr = 0.76(prof)</p>
							<p>+1: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Preference for extended conformations</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ZVEL_TINY_ [64]</p>
						</c>
						<c ca="left">
							<p>-1</p>
						</c>
						<c ca="left">
							<p>Corr = 0.68(prof)</p>
						</c>
						<c ca="left">
							<p>Size, bulkiness</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>MOBILITY_2 [21]</p>
						</c>
						<c ca="left">
							<p>-1</p>
						</c>
						<c ca="left">
							<p>Corr = 0.61(nrLS)</p>
						</c>
						<c ca="left">
							<p>Side chain mobility</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>VINM940101 [67]</p>
						</c>
						<c ca="left">
							<p>-11 to -1</p>
						</c>
						<c ca="left">
							<p>-2: Corr = 0.72(prof)</p>
							<p>-3: Corr = 0.75(prof)</p>
							<p>-4: Corr = 0.78(nrLS)</p>
							<p>-5: Corr = 0.82(nrLS)</p>
							<p>-6: Corr = 0.84(nrLS)</p>
							<p>-7: Corr = 0.79(nrLS)</p>
							<p>-8: Corr = 0.74(prof)</p>
							<p>-9: Corr = 0.82(nrLS)</p>
							<p>-10: Corr = 0.84(nrLS)</p>
							<p>-11: Corr = 0.79(nrLS)</p>
							<p>Rest: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Normalized flexibility average</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>KRIW790102 [68]</p>
						</c>
						<c ca="left">
							<p>-11 to -1</p>
						</c>
						<c ca="left">
							<p>-2: Corr = 0.76(prof)</p>
							<p>-6: Corr = 0.83(nrLS)</p>
							<p>-7: Corr = 0.83(nrLS)</p>
							<p>-8: Corr = 0.76(prof)</p>
							<p>Rest: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Fraction of site occupied with water</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>Buried helix (see Materials and methods)</p>
						</c>
						<c ca="left">
							<p>-20 to -1</p>
						</c>
						<c ca="left">
							<p>Remove false positives</p>
						</c>
						<c ca="left">
							<p>Helix with strongly hydrophobic sides folds back to protein core and reduces flexibility and accessibility of carboxy terminus</p>
						</c>
					</r>
				</tblbdy>
				<tblfn>
					<p>Corr, correlation; LS, learning set; nrLS, nonredundant learning set; prof, profile.</p>
				</tblfn>
			</tbl>
			<tbl id="T2">
				<title>
					<p>Table 2</p>
				</title>
				<caption>
					<p>Physical property terms in the GGT1 scoring function</p>
				</caption>
				<tblbdy cols="4">
					<r>
						<c ca="left">
							<p>Property</p>
						</c>
						<c ca="left">
							<p>Position</p>
						</c>
						<c ca="left">
							<p>Rationale</p>
						</c>
						<c ca="left">
							<p>Explanation</p>
						</c>
					</r>
					<r>
						<c cspan="4">
							<hr/>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ARGP820103 [62]</p>
						</c>
						<c ca="left">
							<p>+3</p>
						</c>
						<c ca="left">
							<p>Corr = 0.8(prof)</p>
						</c>
						<c ca="left">
							<p>Membrane-buried preference, lipid contact when entering binding pocket</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>LEVM760105 [69]</p>
						</c>
						<c ca="left">
							<p>+1 to +3</p>
						</c>
						<c ca="left">
							<p>Fisher = 1.36</p>
						</c>
						<c ca="left">
							<p>Size limitation (radius of gyration of side-chain)</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>YUTK870101 [70]</p>
						</c>
						<c ca="left">
							<p>+1 to +3</p>
						</c>
						<c ca="left">
							<p>Fisher = 1.38</p>
						</c>
						<c ca="left">
							<p>Hydrophobicity compensation (Unfolding Gibbs energy in water, pH7.0)</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ZVEL_CHARG [64]</p>
						</c>
						<c ca="left">
							<p>+1 to +3</p>
						</c>
						<c ca="left">
							<p>LS composition</p>
						</c>
						<c ca="left">
							<p>General charge penalty</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ZVEL_CHNEG [64]</p>
						</c>
						<c ca="left">
							<p>+1 to +3</p>
						</c>
						<c ca="left">
							<p>LS composition</p>
						</c>
						<c ca="left">
							<p>Special negative charge penalty</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>ZVEL_ALI_1 [64]</p>
						</c>
						<c ca="left">
							<p>+1 and +2</p>
						</c>
						<c ca="left">
							<p>+2: Corr = 0.87(prof)</p>
							<p>+1: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Amino-acid property: aliphatic</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>LIFS790102 [66]</p>
						</c>
						<c ca="left">
							<p>+1 and +2</p>
						</c>
						<c ca="left">
							<p>+2: Corr = 0.77(prof)</p>
							<p>+1: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Preference for extended conformations</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>FAUJ880101 [71]</p>
						</c>
						<c ca="left">
							<p>-1 and +2</p>
						</c>
						<c ca="left">
							<p>Fisher = 1.52</p>
						</c>
						<c ca="left">
							<p>Size, bulkiness (residues although 10 &#197; apart, face to same side of base pair)</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>FINA910103 [72]</p>
						</c>
						<c ca="left">
							<p>-1</p>
						</c>
						<c ca="left">
							<p>Corr = 0.75(prof)</p>
						</c>
						<c ca="left">
							<p>Helix termination (for example, K, S favored, D,E,L,I,V disfavored)</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>KARP850103 [73]</p>
						</c>
						<c ca="left">
							<p>-7 to -1</p>
						</c>
						<c ca="left">
							<p>-1: Corr = 0.69(prof)</p>
							<p>-2: Corr = 0.70(prof)</p>
							<p>-3: Corr = 0.71(prof)</p>
							<p>-4: Corr = 0.74(nrLS)</p>
							<p>-5: Corr = 0.75(prof)</p>
							<p>-6: Corr = 0.70(nrLS)</p>
							<p>-7: Corr = 0.78(nrLS)</p>
						</c>
						<c ca="left">
							<p>Flexibility (GGT1 lysine preference)</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>VINM940101 [67]</p>
						</c>
						<c ca="left">
							<p>-11 to -1</p>
						</c>
						<c ca="left">
							<p>-4: Corr = 0.72(prof)</p>
							<p>-5: Corr = 0.82(prof)</p>
							<p>-6: Corr = 0.84(nrLS)</p>
							<p>-7: Corr = 0.75(nrLS)</p>
							<p>-8: Corr = 0.77(nrLS)</p>
							<p>-9: Corr = 0.68(prof)</p>
							<p>-10: Corr = 0.86(prof)</p>
							<p>Rest: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Normalized flexibility average</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>KRIW790102 [68]</p>
						</c>
						<c ca="left">
							<p>-11 to -1</p>
						</c>
						<c ca="left">
							<p>-3: Corr = 0.70(prof)</p>
							<p>-4: Corr = 0.73(prof)</p>
							<p>-5: Corr = 0.84(prof)</p>
							<p>-6: Corr = 0.81(prof)</p>
							<p>-7: Corr = 0.83(nrLS)</p>
							<p>-8: Corr = 0.85(nrLS)</p>
							<p>-9: Corr = 0.76(prof)</p>
							<p>-10: Corr = 0.86(prof)</p>
							<p>Rest: continuing deviation from UniRef50 average</p>
						</c>
						<c ca="left">
							<p>Fraction of site occupied with water</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>Buried helix (see Materials and methods)</p>
						</c>
						<c ca="left">
							<p>-20 to -1</p>
						</c>
						<c ca="left">
							<p>Remove false positives</p>
						</c>
						<c ca="left">
							<p>Helix with strongly hydrophobic sides folds back to protein core and reduces flexibility and accessibility of carboxy terminus</p>
						</c>
					</r>
				</tblbdy>
			</tbl>
			<p>The major difference between FT and GGT1 substrates remains at position +3 (the X in CaaX). Whereas a broad variety of residues are allowed in motifs recognized by FT (including several substrates with leucine at +3), mainly leucine and methionine appear to be preferred by GGT1 in agreement with experimental evidence <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Interestingly, position +3 correlates (FT, 0.7; GGT1, 0.8) with a physical property that measures membrane-buried preference parameters (see Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>). This feature does not seem to be important to support membrane interaction at a later stage for the protein, as the three carboxy-terminal residues (-aaX) are often cleaved off in a further processing step after attachment of the anchor <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. However, hydrophobicity and volume of position +3 appear important for interaction with the binding pocket because of the rather lipophilic character of the latter (isoprenyl anchor on one side and hydrophobic residues on the others). The importance of position +3 for specificity between FT and GGT1 is further strengthened by differing conservation of residues in the binding pockets of the respective enzymes (Figure <figr fid="F2">2</figr>). Not surprisingly, the whole region of the binding pocket harboring the end of the prenylpyrophosphate (geranylgeranyl [C20] is one isoprene unit longer than farnesyl [C15]) and the X of the CaaX box (position +3) appear to comprise the major differences in residue conservation (Figure <figr fid="F2">2</figr>).</p>
			<fig id="F2">
				<title>
					<p>Figure 2</p>
				</title>
				<caption>
					<p>The two CaaX prenyltransferases</p>
				</caption>
				<text>
					<p>The two CaaX prenyltransferases. <b>(a) </b>Ribbon representations of FT (PDB 1D8D [75]) and GGT1 (PDB 1N4Q [76]); pink, alpha subunit; yellow, beta subunit. <b>(b) </b>The prenylpyrophosphates (green) and CaaX tetrapeptides (blue) inside the binding pockets with enzyme-specific conservation (conservation in FT or GGT1 minus conservation in joined FT+GGT1 alignment) mapped to binding-pocket surface. Increasing conservation difference is shaded from white to yellow to red. FPP, farnesyl-, GGPP, geranylgeranylpyrophosphate. The alignment of the sequences of these proteins is shown in Figure 6. Visualized with Swiss-Pdb Viewer [59].</p>
				</text>
				<graphic file="gb-2005-6-6-r55-2"/>
			</fig>
			<p>Using the Fisher criterion (see Materials and methods), interpositional correlations of residue sizes within positions +1, +2 and +3 (the carboxy-terminal three residues of the CaaX box that are buried in the binding pocket) from both FT and GGT1 substrates have been identified. Often, when a very large residue occurs at specific positions, neighboring residues compensate to obey the overall physicochemical constraints (for example, size limitation) in the binding pocket. Similarly, compensatory effects appear to exist regarding hydrophobicity between positions +1 and +3 in FT and between +1, +2 and +3 in GGT1 substrates (see Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>). Compensatory effects also seem responsible for the toleration of even large positively charged residues at positions +1 or +2, if the other residues are small enough to accommodate the whole peptide in the binding pocket. On the other hand, negative charges are apparently incompatible with the substrate recognition motif at these positions.</p>
		</sec>
		<sec>
			<st>
				<p>Extension of the CaaX prenylation motif by a flexible linker region</p>
			</st>
			<p>While the requirement for specific amino acids at single positions appears to be marginal outside of the CaaX box, physicochemical constraints that extend up to 11 residues amino-terminal from the modified cysteine can be found (Figure <figr fid="F1">1</figr>, Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>). At position -1 of the motif, there begins a pronounced tendency for residues with either small or flexible hydrophilic side chains. GGT1 especially appears to prefer amino acids like serine or lysine at this position. In general, GGT1 substrates have a higher number of lysines between positions -1 and -7 than FT substrates.</p>
			<p>The hydrophilic linker region with correlations over multiple positions to several hydrophobicity- and flexibility-related property scales might be required to allow accessibility of the carboxy terminus for the lipid-attaching enzymes. Indeed, in several resolved structures of <it>in vivo </it>prenylated GTPases, secondary structural elements such as helices that stabilize the fold of the protein are typically found only at the amino-terminal side of that linker region (beginning of helix at positions -12 (PDB identifier 1FTN), -13 (PDB 1MH1), -15 (PDB 1AM4), -12 (PDB 1A4R)). In the structure of a G protein gamma subunit, the linker region also appears to be extended and wrapped around the beta subunit in the heterotrimeric G protein signaling complex (PDB 1GG2). It needs to be emphasized that the linker region need not be in an unstructured conformation after the anchor has been attached (see also carboxy-terminal helix in structure PDB 1F5N of human 67 kDa guanylate binding protein 1 <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>), as folding back or lipid-mediated interaction with other proteins or membranes can also induce changes in the three-dimensional structure of the linker region. However, there appears to be a requirement for the ability to easily unfold/fold into flexible and more extended conformations that allow the carboxy terminus to be accessed and modified by the prenyltransferases. It is noteworthy that this length estimation of a flexible, hydrophilic linker is consistent with earlier findings in the GPI anchor <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, myristoylation <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> and PTS1 targeting <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> motifs. Hence, the actual motif length of substrates for CaaX prenylation appears longer than previously thought (total 15 residues = 4 CaaX + 11 linker).</p>
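			<p>The position numbering used in this article (cysteine at 0, remaining CaaX residues at +1 to +3, linker at -11 to -1) can be sketched in Python as follows. The function name and the example sequence are hypothetical and serve only to illustrate the 15-residue window.</p>

```python
LINKER_LENGTH = 11  # residues amino-terminal of the cysteine (positions -11 to -1)
CAAX_LENGTH = 4     # cysteine (position 0) plus positions +1 to +3


def motif_positions(sequence):
    """Map motif positions -11..+3 to residues of the carboxy terminus.

    Returns None when the sequence is shorter than the 15-residue window
    or when the fourth-to-last residue is not a cysteine.
    """
    seq = sequence.upper()
    window = LINKER_LENGTH + CAAX_LENGTH
    if window > len(seq):
        return None
    if seq[-CAAX_LENGTH] != "C":
        return None
    tail = seq[-window:]
    # tail[0] is position -11, tail[11] is the cysteine, tail[14] is position +3.
    return {i - LINKER_LENGTH: residue for i, residue in enumerate(tail)}


# Made-up sequence ending in the CVIL box quoted in the text.
EXAMPLE_SEQUENCE = "MAAAGSPKKSKSGSGKSCVIL"
positions = motif_positions(EXAMPLE_SEQUENCE)
if positions is not None:
    linker = "".join(positions[p] for p in range(-LINKER_LENGTH, 0))
    caax = "".join(positions[p] for p in range(0, CAAX_LENGTH))
    print(linker, caax)
```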
		</sec>
		<sec>
			<st>
				<p>Prediction function and validation</p>
			</st>
			<p>Following the approach already applied to the prediction of GPI and myristoyl anchors and PTS1-mediated targeting <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>, a scoring function measuring compliance with the prenylation motif has been constructed separately for each of the enzymes FT and GGT1 (see Materials and methods). In brief, the composite prediction function <it>S </it>consists of a term <it>S</it><sub>profile </sub>scoring a query sequence against the redundancy-corrected profile of the learning-set sequences and another term <it>S</it><sub>ppt </sub>that penalizes deviation from the physicochemical motif requirements.</p>
			<p><b><it>S </it></b>= <b><it>S</it></b><sub>profile </sub>+ <b><it>S</it></b><sub>ppt</sub></p>
			<p>The term <it>S</it><sub>profile </sub>distinguishes the three positions +1, +2 and +3 of the CaaX box as well as the linker region (-1 to -11). <it>S</it><sub>ppt </sub>comprises a sum of terms that are constructed from the physical property requirements for FT and GGT1 substrates that were outlined in the section describing the motif refinement (and listed in Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr> together with their rationale for inclusion in <it>S</it><sub>ppt</sub>).</p>
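			<p>The additive structure of the score can be illustrated with a minimal sketch. It only shows how region-wise profile contributions and physical-property terms would be summed; the function name, the dictionary keys and the example numbers are hypothetical and are not taken from the PrePS implementation.</p>

```python
from typing import Dict

# Regions distinguished by the profile term according to the text:
# the linker (-1 to -11) and the CaaX positions +1, +2 and +3.
PROFILE_REGIONS = ("linker", "+1", "+2", "+3")


def composite_score(profile_terms: Dict[str, float],
                    ppt_terms: Dict[str, float]) -> float:
    """Return S = S_profile + S_ppt for one query carboxy terminus.

    profile_terms: one contribution per region in PROFILE_REGIONS.
    ppt_terms: named physical-property terms (hypothetical names); each is
               expected to be zero or negative, i.e. a penalty.
    """
    missing = [region for region in PROFILE_REGIONS if region not in profile_terms]
    if missing:
        raise ValueError(f"missing profile regions: {missing}")
    s_profile = sum(profile_terms[region] for region in PROFILE_REGIONS)
    s_ppt = sum(ppt_terms.values())
    return s_profile + s_ppt


# Illustrative numbers only; they do not come from the paper.
example = composite_score(
    {"linker": 0.25, "+1": 1.0, "+2": 0.5, "+3": 2.0},
    {"linker_hydrophobicity": -0.75},
)
```

			<p>Keeping the contributions separate until the final sum mirrors the way the server reports interpretable per-region terms alongside the total score.</p>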
			<p>The threshold for a query protein to be a predicted farnesylation or geranylgeranylation target by FT or GGT1, respectively, is set to include all sequences in the learning set. Hence, the self-consistencies or upper bounds of sensitivities of the FT and GGT1 predictors are 100%. Additionally, the robustness of the method has been cross-validated in jackknife tests (see Materials and methods). In the cross-validation over the complete scoring function, the rates of finding known substrates after excluding them and their close homologs from the learning procedure (and, therefore, lower bounds for sensitivities) were 92.6% for FT and 98.6% for GGT1, respectively.</p>
			<p>As required for a good predictor <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, the scores are translated into probabilities of false-positive prediction. For this purpose, a sigmoidal function (analytically based on the extreme-value distribution) is fitted to the distribution of score values calculated from non-prenylatable proteins (see Materials and methods). The general probabilities of false-positive prediction (that complement the specificities to 100%) are estimated to be 0.11% for the FT and 0.02% for the GGT1 predictor, respectively.</p>
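			<p>As an illustration of how a score can be translated into a probability of false-positive prediction, the sketch below evaluates the upper tail of an extreme-value (Gumbel) distribution. The Gumbel form, the function name and the parameters <it>location</it> and <it>scale</it> are assumptions made for this example; in practice the parameters would be obtained by fitting to the scores of non-prenylatable proteins, and the exact functional form used by PrePS may differ.</p>

```python
import math


def false_positive_probability(score: float, location: float, scale: float) -> float:
    """Upper-tail probability P(S >= score) of a Gumbel distribution.

    location and scale are hypothetical fit parameters describing the score
    distribution of non-substrate sequences.
    """
    if not scale > 0.0:
        raise ValueError("scale must be positive")
    z = (score - location) / scale
    if -z > 700.0:
        # exp(-z) would overflow; the tail probability is 1 to machine precision.
        return 1.0
    # 1 - exp(-exp(-z)), written with expm1 to stay accurate for large scores.
    return -math.expm1(-math.exp(-z))
```

			<p>The resulting curve is sigmoidal in the score: it is close to 1 for scores far below the fitted location and decays towards 0 for high scores, which is the behavior needed to attach a false-positive estimate to each prediction.</p>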
			<sec>
				<st>
					<p>Capability to distinguish FT and GGT1 substrates</p>
				</st>
				<p>Previously, the assignment of CaaX box substrate proteins to either FT or GGT1 has been based mainly on the identity of the final residue in the motif (position +3) where FT allows several amino-acid types and GGT1 clearly prefers leucine <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B27">27</abbr></abbrgrp>. This view has not changed but it has become clear that several substrates with leucine at position +3 can also be modified (if only to a lesser extent) by FT and not only GGT1. For example, <it>in vitro </it>studies have shown that motifs like CVIL, CVLL, CAIL and CCIL (single-letter amino-acid code) are valid for FT as well <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. Mutation of the CVIA motif of yeast A-factor to CVIL results in geranylgeranylated as well as farnesylated proteins <it>in vivo </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Also, RhoB (with a CKVL motif) is known to be both farnesylated and geranylgeranylated <it>in vivo </it><abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Similarly, substrate proteins ending with phenylalanine, such as the CVIF of R-Ras2/TC21, are not specific to either enzyme and can be substrates of both FT and GGT1 <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
				<p>In the same way that FT can accept CaaX box motifs ending in leucine and phenylalanine, GGT1 appears to tolerate methionine at this position, which was previously thought to direct farnesylation. This has important consequences in the case of the oncoprotein K-Ras (in variants with CVIM and CIIM motifs) which becomes geranylgeranylated <it>in vivo </it>when farnesyltransferase is inhibited <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
				<p>As with our earlier predictors for myristoylation and PTS1 targeting, we even find some correlation of the prediction scores with experimentally measured substrate-enzyme affinities. Interestingly, the scores of the GGT1 predictor give better agreement with the experimental data when divided by 3, in agreement with a threefold lower <it>in vivo </it>activity of GGT1 compared to FT <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. To estimate the capability of the FT and GGT1 predictors to model the overlapping but distinct substrate specificities, we analyzed a set of heterogeneous substrate motifs that have been measured under the same experimental conditions for their affinities to either FT or GGT1 <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and we tried to correlate these experimental data with our prediction scores. The set of motifs (CVLS, CIIS, CIIC, CVLF, CVIM, CAIM, CAIV, CAII, CAIL, CVVL, CIIL, and CTIL) contains a large fraction of examples that have been previously shown to be cross-reactive between FT and GGT1 or where the assignment based on simple heuristics depending on hydrophobicity of the final residue fails. In Figure <figr fid="F3">3</figr>, we have plotted the difference of predicted FT and GGT1 scores against the difference of experimentally measured logarithmic affinities for FT and GGT1. A correlation of 0.74 indicates that the theoretical interaction model implemented in the prediction function at least semi-quantitatively resembles the relative substrate specificities between FT and GGT1.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Correlation between predicted and experimental FT/GGT1 substrate selectivity</p>
					</caption>
					<text>
						<p>Correlation between predicted and experimental FT/GGT1 substrate selectivity. The correlation of the difference between predicted FT and GGT1 scores with the difference of the experimentally measured logarithmic affinities for FT and GGT1 of the same substrates is plotted.</p>
					</text>
					<graphic file="gb-2005-6-6-r55-3"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Prediction of prenylation by GGT2</p>
				</st>
				<p>In contrast to FT and GGT1, substrate recognition by GGT2 depends less on strictly defined carboxy-terminal motifs than on the formation of a complex between the substrate and an escort protein <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. As illustrated in Figure <figr fid="F4">4</figr>, the substrate-escort protein complex then binds to GGT2 (consisting of the alpha and beta subunits typical of prenyltransferases), thereby positioning the flexible substrate carboxy terminus towards the site of modification. Typically, the carboxy-terminal arrangement of cysteines is -XXXCC, -XXCXC, -XXCCX, -XCCXX or -CCXXX and, if available, both cysteines in such a motif will be geranylgeranylated. Currently, only the prenylation of Rab GTPases <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> with the help of Rab escort proteins (REP; two copies in higher organisms, otherwise only one copy) is known for the enzyme GGT2 which is, therefore, also called Rab geranylgeranyltransferase. Reports of lipid modification of fungal casein kinase I apparently represent carboxy-terminal palmitoylation <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> rather than the earlier postulated GGT2 prenylation <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Determinants of GGT2 prenylation</p>
					</caption>
					<text>
						<p>Determinants of GGT2 prenylation. <b>(a) </b>Sequence logos [74] of Ras superfamily members around part of the Rab-REP interaction site (colored red in the otherwise yellow GTPase structure). <b>(b) </b>Structural model of the Rab-REP-GGT2 prenylation complex based on PDB entries 1LTX [77] and 1VG0 [4]. REP1 (green) has a prenyl-binding pocket which is proposed to be involved in the dual geranylgeranylation mechanism (bound geranylgeranyl is shown in green). However, the catalytic attachment to the substrate cysteines takes place in the center of the GGT2 alpha-beta complex (light and dark blue) where the prenylpyrophosphate that will be transferred is also bound (blue space-filling representation, zinc in red). The structure was visualized using Swiss-Pdb Viewer [59].</p>
					</text>
					<graphic file="gb-2005-6-6-r55-4"/>
				</fig>
				<p>Rab proteins are small GTPases (around 60 different ones have been identified in humans) <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> that share the general fold of the Ras superfamily as well as conserved residues in the nucleotide-binding site. Distinct motifs have been identified that are specific to the Ras, Rho, or Rab families <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. By virtue of contributing to the binding site of Rabs with their REP, the Rab-specific F3F4 motif can be indirectly used to distinguish possible GGT2 substrates within the Ras superfamily (see sequence logos in Figure <figr fid="F4">4</figr>). However, the REP interaction motif (Rab F3F4) alone could be too short (13 residues) to allow highly sensitive large-scale database scans with thresholds that recognize the learning set (100% self-consistency requires a bit score greater than 5). Interestingly, a search with the final predictor against NCBI's nonredundant database finds only 34 hits with the F3F4 region alone that do not represent Rab proteins or their folds. To avoid these false positives, a hit to the overall alignment of Rab proteins with HMMer <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> (E-value &lt; 0.1) is applied as an additional prediction criterion to simulate recognition of the correct fold of related sequences.</p>
				<p>Two alignments (F3F4 region and full length) were therefore constructed and after removal of entries with a maximal redundancy of 90% identity over the whole sequence length (117 of 179 entries annotated in Swiss-Prot remaining), hidden Markov models (HMMs) were created and calibrated. The choice of this methodology for the GGT2 prediction was strongly influenced by the fact that the HMMer <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> algorithm is well established in conservatively detecting fold homologies for globular domains at the sequence level. The final GGT2 prediction algorithm checks the carboxy termini for cysteines (at least one cysteine among the five last residues) and parses the HMMer outputs to combine the searches for final results. Estimates of false-positive prediction can be derived from the HMMer E-values.</p>
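				<p>The decision logic described above can be sketched as follows. The thresholds (at least one cysteine among the last five residues, F3F4 bit score greater than 5, full-length E-value below 0.1) are taken from the text; the function names, the argument layout and the assumption that the criteria are combined with a logical AND are illustrative only, and the HMMer searches themselves are assumed to have been run and parsed beforehand.</p>

```python
def has_cterminal_cysteine(sequence: str, window: int = 5) -> bool:
    """True if at least one cysteine occurs among the last `window` residues."""
    return "C" in sequence.upper()[-window:]


def predict_ggt2(sequence: str,
                 full_length_evalue: float,
                 f3f4_bit_score: float,
                 evalue_cutoff: float = 0.1,
                 bit_cutoff: float = 5.0) -> bool:
    """Combine the three criteria named in the text (assumed to be ANDed).

    full_length_evalue: E-value of the hit to the full-length Rab HMM.
    f3f4_bit_score:     bit score of the hit to the F3F4-region HMM.
    """
    return (has_cterminal_cysteine(sequence)
            and evalue_cutoff > full_length_evalue
            and f3f4_bit_score > bit_cutoff)
```

				<p>Separating the cysteine check from the HMM criteria keeps the purely sequence-based filter cheap, so that only candidates with a plausible carboxy terminus need to be searched against the two models.</p>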
			</sec>
		</sec>
		<sec>
			<st>
				<p>PrePS: Web interface and EvOluation</p>
			</st>
			<p>The three tools to predict lipid modification by FT, GGT1 and GGT2 are available as the Prenylation Prediction Suite (PrePS), which is accessible online <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. Users can submit their query sequences to all three predictors or to any selection of them. Details of the profile and physical property terms of the scoring function are provided and can also be used to check and rationalize whether and why certain query sequences or artificial constructs intended for membrane targeting might be less suitable prenylation targets. Additionally, an option is provided that allows the user to retrieve homologs of the query protein from NCBI's nonredundant database using BLASTP and automatically annotates them with their respective PrePS results. From the scores for the different predictors (left screenshot in Figure <figr fid="F5">5</figr>) as well as the alignment of the carboxy termini of homologous sequences (right screenshot in Figure <figr fid="F5">5</figr>), the evolutionary motif conservation can be evaluated (evOluation) and used for further rationalization of the biological importance of the predicted motif.</p>
			<fig id="F5">
				<title>
					<p>Figure 5</p>
				</title>
				<caption>
					<p>Screenshot of the output provided by the PrePS server [39]</p>
				</caption>
				<text>
					<p>Screenshot of the output provided by the PrePS server [39]. On the left is the prediction result for the query protein H-Ras (GenBank <ext-link ext-link-type="gen" ext-link-id="P01112">P01112</ext-link>) and the three prenylating enzymes. On the right are shown the carboxy-terminal alignment and PrePS predictions of homologs of the query protein for evaluation of evolutionary motif conservation. Note that H-Ras is predicted to be prenylated only by FT, whereas the homologs K-Ras and N-Ras can also be prenylated by GGT1.</p>
				</text>
				<graphic file="gb-2005-6-6-r55-5"/>
			</fig>
		</sec>
		<sec>
			<st>
				<p>Comparison with alternative methods</p>
			</st>
			<p>Until now, the only available tool to predict protein prenylation has been the Prosite <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> search with the pattern PS00294, which is also used in the PSORT II software <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. However, this method can neither predict prenylation by GGT2 nor can it distinguish between modifications by FT or GGT1 and, hence, the attached anchor type. During preparation of this paper, an excellent study by Beese, Casey and colleagues <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> has been published that tries to define rules for substrate selectivity by crystallographic analysis of FT and GGT1 complexed with eight cross-reactive substrates. These detailed descriptions of the binding-pocket interactions of a few selected substrate peptides are in good agreement with the motif characteristics identified in this work. While the information gathered from the structural analysis exceeds the capability of any other purely theoretical method to judge interaction for the specific resolved enzyme-substrate pairs, it is difficult to generalize an interaction model from such a small dataset only on the basis of amino-acid constraints at single motif positions. Hence, applying these rules to a more restrictive Prosite-style pattern fails to identify around 30% of substrates experimentally verified in tetrapeptide interaction assays. When taking a closer look at known substrates that are not recognized by the rules of Beese, Casey and colleagues <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> it becomes apparent that this is mainly due to only a few factors. 
These are the exclusion of leucine at position +3 for alternative FT substrates (known example CKVL of RhoB), the exclusion of phenylalanine at position +3 for alternative FT substrates (known example CVIF of R-Ras2/TC21), the exclusion of glutamine at position +2 for FT substrates (known example serine/threonine kinase 11 or LKB1 with the motif CKQQ) and the exclusion of methionine at position +3 for alternative GGT1 substrates (known example CVIM of K-Ras). In addition, the rules of Beese, Casey and colleagues <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> assign isoleucine and valine at position +3 to GGT1 but not FT substrates. However, these two amino acids were shown to be valid for both FT and GGT1, with at least comparable affinities <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
			<p>The inadequacy of the Beese, Casey and colleagues <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> motif in finding true-positive examples could be counteracted by loosening the motif description, as is already the case in the original Prosite entry PS00294, which nevertheless fails to predict known substrates with glutamine (LKB1) or proline (hepatitis delta antigen) at position +2. However, any reduction in motif stringency concomitantly results in a dramatic increase in the number of false-positive predictions. Table <tblr tid="T3">3</tblr> compares typical prediction parameters for the different methods, if applicable. Neither the old nor an adjusted Prosite pattern can compete with the performance of PrePS in finding true substrates while, at the same time, only having a minimal number of false positives. The short Prosite patterns also do not take into account the linker region preceding the CaaX box, which is not defined by clear amino-acid type preferences but rather by general physicochemical property restrictions. The answers of Prosite-style predictions are only binary (yes/no), whereas PrePS gives continuous scores that can be split into interpretable motif-region contributions and that are shown to correlate with experimentally measured relative substrate affinities for FT or GGT1, respectively. Furthermore, only PrePS includes prediction of prenylation by GGT2 and provides an evaluation of evolutionary conservation of the prenylation motif among homologs of the query sequence.</p>
			<tbl id="T3">
				<title>
					<p>Table 3</p>
				</title>
				<caption>
					<p>Comparison of prediction performances</p>
				</caption>
				<tblbdy cols="7">
					<r>
						<c>
							<p/>
						</c>
						<c cspan="3" ca="center">
							<p>FT</p>
						</c>
						<c cspan="3" ca="center">
							<p>GGT1</p>
						</c>
					</r>
					<r>
						<c>
							<p/>
						</c>
						<c cspan="3">
							<hr/>
						</c>
						<c cspan="3">
							<hr/>
						</c>
					</r>
					<r>
						<c>
							<p/>
						</c>
						<c ca="center">
							<p>Prosite PS00294</p>
						</c>
						<c ca="center">
							<p>Beese, Casey and colleagues' rules</p>
						</c>
						<c ca="center">
							<p>PrePS FT</p>
						</c>
						<c ca="center">
							<p>Prosite PS00294</p>
						</c>
						<c ca="center">
							<p>Beese, Casey and colleagues' rules</p>
						</c>
						<c ca="center">
							<p>PrePS GGT1</p>
						</c>
					</r>
					<r>
						<c cspan="7">
							<hr/>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>Sensitivity I</p>
						</c>
						<c ca="center">
							<p>85%*</p>
						</c>
						<c ca="center">
							<p>
								<it>72%</it>
							</p>
						</c>
						<c ca="center">
							<p>
								<b>100%</b>
							</p>
						</c>
						<c ca="center">
							<p>95%*</p>
						</c>
						<c ca="center">
							<p>
								<it>67%</it>
							</p>
						</c>
						<c ca="center">
							<p>
								<b>100%</b>
							</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>Sensitivity II</p>
						</c>
						<c ca="center">
							<p>NA</p>
						</c>
						<c ca="center">
							<p>NA</p>
						</c>
						<c ca="center">
							<p>
								<b>92.6/97.9%</b>
								<sup>&#8224;</sup>
							</p>
						</c>
						<c ca="center">
							<p>NA</p>
						</c>
						<c ca="center">
							<p>NA</p>
						</c>
						<c ca="center">
							<p>
								<b>98.6%</b>
							</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>Probability of false positive prediction (POFP) for -CXXX motifs (GenBank sequences)</p>
						</c>
						<c ca="center">
							<p>
								<it>17.1%*</it>
							</p>
						</c>
						<c ca="center">
							<p>9.9%</p>
						</c>
						<c ca="center">
							<p>
								<b>6.3%</b>
							</p>
						</c>
						<c ca="center">
							<p>
								<it>17.1%*</it>
							</p>
						</c>
						<c ca="center">
							<p>10.0%</p>
						</c>
						<c ca="center">
							<p>
								<b>1.2%</b>
							</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>POFP -CXXX 'cytoplasmic'<sup>&#8225;</sup></p>
						</c>
						<c ca="center">
							<p>
								<it>18.2%*</it>
							</p>
						</c>
						<c ca="center">
							<p>8.9%</p>
						</c>
						<c ca="center">
							<p>
								<b>5.1%</b>
							</p>
						</c>
						<c ca="center">
							<p>
								<it>18.2%*</it>
							</p>
						</c>
						<c ca="center">
							<p>8.6%</p>
						</c>
						<c ca="center">
							<p>
								<b>1.4%</b>
							</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>POFP -CXXX 'nuclear'<sup>&#8225;</sup></p>
						</c>
						<c ca="center">
							<p>
								<it>13.9%*</it>
							</p>
						</c>
						<c ca="center">
							<p>10.5%</p>
						</c>
						<c ca="center">
							<p>
								<b>5.5%</b>
							</p>
						</c>
						<c ca="center">
							<p>
								<it>13.9%*</it>
							</p>
						</c>
						<c ca="center">
							<p>9.6%</p>
						</c>
						<c ca="center">
							<p>
								<b>1.1%</b>
							</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>POFP -CXXX 'membrane'<sup>&#8225;</sup></p>
						</c>
						<c ca="center">
							<p>
								<it>17.5%*</it>
							</p>
						</c>
						<c ca="center">
							<p>10.3%</p>
						</c>
						<c ca="center">
							<p>
								<b>3.8%</b>
							</p>
						</c>
						<c ca="center">
							<p>
								<it>17.5%*</it>
							</p>
						</c>
						<c ca="center">
							<p>12.0%</p>
						</c>
						<c ca="center">
							<p>
								<b>0.8%</b>
							</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>POFP -CXXX 'extracellular'<sup>&#8225;</sup></p>
						</c>
						<c ca="center">
							<p>
								<it>8.6%*</it>
							</p>
						</c>
						<c ca="center">
							<p>7.9%</p>
						</c>
						<c ca="center">
							<p>
								<b>3.3%</b>
							</p>
						</c>
						<c ca="center">
							<p>8.6%*</p>
						</c>
						<c ca="center">
							<p>
								<it>9.0%</it>
							</p>
						</c>
						<c ca="center">
							<p>
								<b>0.2%</b>
							</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>Overall probability of false positive prediction (GenBank sequences, assuming 1.7% with -CXXX)</p>
						</c>
						<c ca="center">
							<p>
								<it>0.29%*</it>
							</p>
						</c>
						<c ca="center">
							<p>0.16%</p>
						</c>
						<c ca="center">
							<p>
								<b>0.11%</b>
							</p>
						</c>
						<c ca="center">
							<p>
								<it>0.29%*</it>
							</p>
						</c>
						<c ca="center">
							<p>0.17%</p>
						</c>
						<c ca="center">
							<p>
								<b>0.02%</b>
							</p>
						</c>
					</r>
				</tblbdy>
				<tblfn>
					<p>*Prosite pattern PS00294 does not distinguish between prenylation by FT and GGT1.</p>
					<p><sup>&#8224;</sup>Sensitivity rises to 97.9% when the exceptional motif CRPQ of hepatitis delta antigen is removed. <sup>&#8225;</sup>For details see Materials and methods. Sensitivity I is the rate of finding known substrates from described learning set = self-consistency. Sensitivity II is the rate of finding known substrates after their exclusion (including homologs) from the learning set = cross-validation (see Materials and methods). Probabilities of false-positive predictions (POFP) complement the specificities to 100% (Specificity = 100 - POFP). The first listed POFP estimates the rates of false positives among query proteins that have a canonical -CXXX motif (which corresponds to 1.7% of all sequences). Below are estimations of POFPs for subsets of Swiss-Prot proteins that differ in their annotated subcellular localization (see Materials and methods). The final POFP is the estimate for false-positive predictions for all sequences (for example, when analyzing complete proteomes or large databases), independent of existence of a -CXXX motif. Formatting signifies: best (bold), intermediate (plain text), worst (italic) performance.</p>
				</tblfn>
			</tbl>
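			<p>The relation between the first and the last POFP rows of Table <tblr tid="T3">3</tblr> can be checked with a short calculation: sequences without a -CXXX terminus cannot be predicted, so the overall rate is the -CXXX rate scaled by the fraction of sequences that carry such a motif (1.7%). The function name below is hypothetical, and the sketch assumes that this simple product is how the overall values were obtained. The Prosite and PrePS columns are reproduced after rounding; the remaining columns agree to within about 0.01 percentage points, presumably because the published 1.7% is itself rounded.</p>

```python
# Fraction of database sequences ending in a canonical -CXXX motif (from Table 3).
CXXX_FRACTION = 0.017


def overall_pofp(pofp_cxxx_percent: float,
                 cxxx_fraction: float = CXXX_FRACTION) -> float:
    """Scale a POFP (in percent) measured on -CXXX sequences to all sequences."""
    return pofp_cxxx_percent * cxxx_fraction


# -CXXX POFP values from the GenBank row of Table 3.
prosite_overall = round(overall_pofp(17.1), 2)
preps_ft_overall = round(overall_pofp(6.3), 2)
preps_ggt1_overall = round(overall_pofp(1.2), 2)
```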
		</sec>
		<sec>
			<st>
				<p>Medical implications and prediction examples</p>
			</st>
			<p>Farnesyltransferase inhibitors (FTIs) have been developed to prevent prenylation of oncogenic Ras proteins and are currently undergoing phase II and III clinical trials <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. While FTIs have been suggested also to target parasitic diseases <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B43">43</abbr></abbrgrp>, their efficacy as cancer treatments has been found to be inconsistent across different cancer types. This could be due to the alternative prenylation of oncogenic proteins by GGT1 under FT inhibition, such as K-Ras, in contrast to the total inhibition of prenylation for unique FT substrates, such as H-Ras <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B44">44</abbr></abbrgrp>. Identifying these two types of substrate behavior is critical for understanding FTI action as well as identifying their real cellular targets <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>. One of the applications of PrePS is in the distinction of substrates that are specific to FT (FTI target) or GGT1 or that are modified by both (less affected by FTIs).</p>
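			<p>The distinction drawn here amounts to a simple classification over the two CaaX predictors, sketched below. The function and the category labels are hypothetical; the two boolean inputs stand for whether the FT and GGT1 predictors, respectively, score a query above threshold.</p>

```python
def classify_substrate(ft_predicted: bool, ggt1_predicted: bool) -> str:
    """Map the two CaaX predictions onto the substrate classes named in the text."""
    if ft_predicted and ggt1_predicted:
        return "FT and GGT1 (expected to be less affected by FTIs)"
    if ft_predicted:
        return "FT only (candidate FTI target)"
    if ggt1_predicted:
        return "GGT1 only"
    return "no CaaX prenylation predicted"
```

			<p>With the predictions shown in Figure <figr fid="F5">5</figr>, H-Ras would fall into the FT-only class, whereas K-Ras and N-Ras would fall into the class modified by both enzymes.</p>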
			<p>We would like to mention here one example prediction of PrePS for a protein that would be a candidate for a previously unknown FTI target. The human nucleosome assembly protein I-like protein <abbrgrp><abbr bid="B47">47</abbr></abbrgrp> (NAP1-like (GenBank:<ext-link ext-link-type="gen" ext-link-id="NP_004528">NP_004528</ext-link>)) has a CKQQ farnesylation motif that is further retained in mouse, rat, frog, fish, fungi and plants, as predicted by PrePS. This taxonomically widespread evolutionary conservation suggests that the lipid anchor is relevant for the function of this protein, which is part of a family involved in transcriptional activation and chromatin formation, including histone binding <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> and nucleocytoplasmic shuttling <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. The lack of the ability to be alternatively prenylated by GGT1 and, hence, being a unique FT substrate and putative FTI target, is also conserved in the other organisms, possibly pointing to the importance of the specific farnesyl anchor length. It should be noted that this protein is predicted neither by the Prosite pattern PS00294 nor by the pattern derived from the rules of a few substrate-enzyme structures <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, but there exist other experimentally verified examples where the same CaaX box motif CKQQ has been shown to be farnesylated (yeast Pex19p <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> and human serine/threonine kinase 11 <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>).</p>
			<p>While this paper was in preparation, farnesylation of the NAP1-like protein has been suggested experimentally through a special tagging and purification technique <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>, giving support to the PrePS prediction. The same analysis, however, also suggests farnesylation of annexin A2 (GenBank accession number <ext-link ext-link-type="gen" ext-link-id="P07355">P07355</ext-link>) terminating in a CGGDD motif, which is not at all predicted by PrePS as it is mechanistically unlikely to be processed by farnesyltransferase. Another rather surprising prediction resulting from the tagging experiment is the farnesylation of Rab21 (Q9UL25), which has a double cysteine motif followed by three additional residues (CCSSG) which, at least formally, resembles a CaaX box. Rab proteins with CaaX boxes such as Rab5 (CCSN), Rab8 (CVLL/CSLL), Rab11 (CQNI) and Rab13 (CSLG) are usually modified by GGT2 <it>in vivo </it><abbrgrp><abbr bid="B6">6</abbr><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr></abbrgrp> but Rab8 and Rab11 were shown also to be modified by GGT1 and FT <it>in vitro </it><abbrgrp><abbr bid="B6">6</abbr><abbr bid="B55">55</abbr></abbrgrp>. PrePS predicts Rab21 to be geranylgeranylated by GGT2, but its score falls only narrowly below the prediction threshold for farnesylation. The evOluation shows that the Rab21 orthologs in <it>Xenopus </it>(GenBank <ext-link ext-link-type="gen" ext-link-id="AAH60498.1">AAH60498.1</ext-link>) and <it>Drosophila </it>(AAH60498.1) share the double cysteines but their motif is different and shorter by one residue, pointing to a higher importance of the conservation of the cysteine doublet than the rest of the motif. The evOluation, furthermore, shows that Rab5 is the most closely related prenylated Rab-family member. 
Interestingly, both cysteines in the CCSN CaaX box motif of Rab5 were shown not only to be geranylgeranylated by GGT2 <it>in vivo </it>but are also required for proper localization and function of the GTPase <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>. Hence, a similar scenario for the two cysteines of the Rab21 prenylation motif cannot be excluded.</p>
			<p>A complete analysis of large-scale predictions of prenylated proteins ranked by functional importance as estimated by evolutionary motif conservation and medical implications will be published in a follow-up work.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Correlation of positional amino-acid frequencies with physical property scales</p>
				</st>
				<p>We identified physicochemical requirements for each motif position by correlating 20-dimensional vectors filled with the positional frequencies of occurrence of the 20 amino-acid types in the carboxy-terminally aligned learning set with a library of over 650 amino-acid physical properties <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. This has been done over the largest subset of the learning set from which redundancy of greater than 40% identity in the last 30 positions has been removed (nonredundant learning set = nrLS) and over positional vectors filled with frequencies derived from the profile (= prof) that has been corrected for redundancy with the position-specific independent counts (PSIC) method <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Such correlations have been estimated previously <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> to be significant for confidence levels &#945; = 0.0025 and &#945; = 0.001 if the values are greater than 0.62 and 0.7, respectively.</p>
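				<p>A minimal sketch of this correlation step is given below. It computes the Pearson correlation between a positional frequency vector and one property scale, both given as mappings over the 20 amino-acid types, and compares the result with the two thresholds quoted above. The function names are hypothetical, the use of the Pearson coefficient is an assumption (the text does not name the coefficient), and applying the thresholds to the absolute value, so that anticorrelations are also reported, is likewise an assumption.</p>

```python
import math
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pearson(frequencies: Dict[str, float], scale: Dict[str, float]) -> float:
    """Pearson correlation between positional frequencies and a property scale."""
    xs = [frequencies[aa] for aa in AMINO_ACIDS]
    ys = [scale[aa] for aa in AMINO_ACIDS]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def significance(r: float) -> str:
    """Label a correlation using the thresholds 0.62 and 0.7 from the text."""
    if abs(r) > 0.7:
        return "alpha = 0.001"
    if abs(r) > 0.62:
        return "alpha = 0.0025"
    return "not significant"
```

				<p>Running this for every motif position against every scale in the property library yields a position-by-property table from which the strongest, significant entries can be selected.</p>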
			</sec>
			<sec>
				<st>
					<p>Fisher criterion to find interpositional correlations</p>
				</st>
				<p>The Fisher ratio <it>F </it>of the sum of variances of single positions to the variance over multiple positions is calculated for pairs and triplets of positions, allowing gaps of up to two residues between pairs.</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i1.gif"/>
				</p>
				<p>The <it>F </it>values for probabilities <it>p </it>= 0.05 are taken as thresholds for evaluating significance of interdependence of physical properties between motif positions. These are 1.288 for the FT and 1.355 for the GGT1 nonredundant learning set, respectively.</p>
			</sec>
			<sec>
				<st>
					<p>Enzyme-specific binding pocket residue conservation</p>
				</st>
				<p>To identify regions in the FT and GGT1 binding pockets that are characteristic of the respective enzyme, we analyzed the pattern of residue conservation in the vicinity of the CaaX substrate peptide and the prenylpyrophosphate (Figures <figr fid="F2">2</figr> and <figr fid="F5">5</figr>). Alignments were created with FT and GGT1 beta subunits of a diverse subset of organisms (see legend of Figure 6) using ClustalX <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. The conservation of alignment positions was measured using al2co <abbrgrp><abbr bid="B58">58</abbr></abbrgrp> with the Henikoff-weighted variance-based options. The enzyme-specific conservation was calculated as conservation values in the FT or GGT1 alignments minus conservation values in a joint FT+GGT1 alignment (Figure <figr fid="F6">6</figr>) and then mapped to the binding pocket surface using a customized script in Swiss-Pdb Viewer <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. Color transitions from white to yellow to red represent increasing conservation difference, where red signifies the highest enzyme-specific conservation. White parts are not characteristically different between FT and GGT1, but might also be well conserved between the two enzymes. As the observed differences in the binding pocket conservation relate to features of the enzyme and not the substrates, this information could not be taken directly into account in our scoring scheme. Indirectly, however, the resulting image mirrors the requirement for conservation also for the substrate peptides and, thereby, confirms the relative importance of motif positions as estimated by the Shannon entropy (see below).</p>
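				<p>The subtraction step can be sketched as follows. The function name is hypothetical, and the sketch assumes that the per-position conservation values of the single-enzyme alignment and of the joint alignment have already been computed (for example with al2co) and mapped onto a common list of binding-pocket residues.</p>

```python
from typing import List


def enzyme_specific_conservation(single_enzyme: List[float],
                                 joint: List[float]) -> List[float]:
    """Per-residue conservation in one enzyme's alignment minus the joint value.

    Large positive values mark positions that are conserved within one enzyme
    family but not across FT and GGT1 together.
    """
    if len(single_enzyme) != len(joint):
        raise ValueError("both inputs must cover the same residues")
    return [own - both for own, both in zip(single_enzyme, joint)]


# Illustrative values only: the first residue is enzyme-specific, the second
# is equally conserved in both alignments.
example_difference = enzyme_specific_conservation([1.0, 0.5], [0.25, 0.5])
```

				<p>Residues with a difference near zero correspond to the white surface regions described above, which may still be well conserved in both enzymes.</p>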
				<fig id="F6">
					<title>
						<p>Figure 6</p>
					</title>
					<caption>
						<p>Alignment of FT and GGT1 beta subunits (FTb, GGT1b) in the regions of binding-pocket residues (marked with arrow) using ClustalX [57]</p>
					</caption>
					<text>
						<p>Alignment of FT and GGT1 beta subunits (FTb, GGT1b) in the regions of binding-pocket residues (marked with arrow) using ClustalX [57]. Residue ranges shown above and below correspond to the numbering in the PDB structures of rat FT beta (PDB 1D8D) and GGT1 beta (PDB 1N4Q), respectively. Accession numbers are as follows (GenBank unless indicated otherwise): Hs (<it>Homo sapiens</it>) FTb, NP_002019; GGT1b, NP_005014; Mm (<it>Mus musculus</it>) NP_666039; NP_766215; Rn (<it>Rattus norvegicus</it>) PDB 1D8D; 1N4Q; Tn (<it>Tetraodon nigroviridis</it>) CAG09215; CAF904630; Dm (<it>Drosophila melanogaster</it>) NP_650540; NP_525100; Ag (<it>Anopheles gambiae</it>) XP_321357; XP_317045; Ce (<it>Caenorhabditis elegans</it>) NP_506580; NP_496848; At (<it>Arabidopsis thaliana</it>) NP_198844; NP_181487; Sp (<it>Schizosaccharomyces pombe</it>) NP_594251; NP_594142; Sc (<it>Saccharomyces cerevisiae</it>) P22007; NP_011360. Standard ClustalX coloring (according to conserved amino acid type).</p>
					</text>
					<graphic file="gb-2005-6-6-r55-6"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Prediction score function</p>
				</st>
				<p>Construction and calculation of the scoring function essentially follow the methodologies <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B16">16</abbr><abbr bid="B60">60</abbr></abbrgrp> applied to the prediction of GPI anchors <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, myristoylation <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and PTS1 peroxisomal targeting <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>; they are summarized briefly below, with emphasis on additions and problem-specific variations.</p>
				<p>The composite score function consists of a profile term and a physical property term.</p>
				<p><b><it>S </it></b>= <b><it>S</it></b><sub>profile </sub>+ <b><it>S</it></b><sub>ppt </sub>&#160;&#160;&#160; (2)</p>
			</sec>
			<sec>
				<st>
					<p>The profile term <it>S</it><sub>profile</sub></p>
				</st>
				<p>The profile matrix is calculated using the PSIC algorithm to account for redundancy in the learning set. The frequency of occurrence <it>f</it>(<it>a</it>,<it>i</it>) of an amino acid type <it>a </it>at a given alignment position <it>i </it>is down-weighted proportionally to the number of other positions that are identical in sequences sharing <it>a </it>at <it>i</it>, resulting in subscores <it>S</it><sub>PSIC</sub>(<it>a</it>,<it>i</it>) representing the natural logarithms of these redundancy-corrected frequencies. These are summed over the respective regions (+3, +2, +1 and -1 to -11)</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i2.gif"/>
				</p>
				<p>before they enter <it>S</it><sub>profile</sub>:</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i3.gif"/>
				</p>
				<p><it>&#945;</it><sub>profile </sub>is a normalization factor to compensate for differing lengths of the regions and is derived from:</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i4.gif"/>
				</p>
				<p><it>C</it><sub>region </sub>is a relative weighting of motif regions. We propose a rational basis for its approximate computation. <it>C</it><sub>region </sub>is derived from the average Shannon entropies of the regions in the alignment of the nonredundant learning set sequences. Shannon's information-theoretic entropy has previously been investigated as a measure of sequence variability at alignment positions and of residue conservation <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>. We calculate the average of the exponential of the negative Shannon entropy as the conservation measure</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i5.gif"/>
				</p>
				<p>where <it>n</it><sub><it>a</it>,<it>i </it></sub>is the number of occurrences of amino-acid type <it>a </it>at the investigated position <it>i </it>of the region and <it>N </it>the total number of sequences in the alignment. Only amino-acid types that occur at least once (<it>n</it><sub><it>a</it>,<it>i </it></sub>&#8805; 1) are included in the sum. In order to avoid overly precise values, the final values are rounded up to the last decimal that showed variation in the conservation measure when comparing the small learning set based on Swiss-Prot annotation with the larger learning set described in this work. The position of the absolutely conserved cysteine in the CaaX box would have the maximum conservation value of 1. Table <tblr tid="T4">4</tblr> lists <it>C</it><sub>region </sub>for the regions +1, +2, +3 and -11 to -1.</p>
				<tbl id="T4">
					<title>
						<p>Table 4</p>
					</title>
					<caption>
						<p>Relative weightings of motif positions in profile term</p>
					</caption>
					<tblbdy cols="3">
						<r>
							<c ca="left">
								<p>Position(s)</p>
							</c>
							<c ca="left">
								<p>FT</p>
							</c>
							<c ca="left">
								<p>GGT1</p>
							</c>
						</r>
						<r>
							<c cspan="3">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>-11 to -1</p>
							</c>
							<c ca="left">
								<p>0.07</p>
							</c>
							<c ca="left">
								<p>0.07</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>+1</p>
							</c>
							<c ca="left">
								<p>0.15</p>
							</c>
							<c ca="left">
								<p>0.14</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>+2</p>
							</c>
							<c ca="left">
								<p>0.29</p>
							</c>
							<c ca="left">
								<p>0.41</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>+3</p>
							</c>
							<c ca="left">
								<p>0.17</p>
							</c>
							<c ca="left">
								<p>0.42</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
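				<p>As an illustration of the conservation measure described above, the following Python sketch averages the exponential of the negative Shannon entropy over the columns of a region. It assumes natural logarithms (so that a fully conserved column yields exactly 1), equal-length gap-free sequences and no sequence weighting; the function names and the toy alignment are hypothetical, and the rounding step described in the text is omitted.</p>

```python
import math
from collections import Counter


def column_conservation(column):
    """exp(-H) for one alignment column, with H the Shannon entropy in nats.
    Only residue types that actually occur contribute to the sum."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return math.exp(-entropy)


def region_conservation(sequences, positions):
    """Average exp(-H) over the given column indices of the alignment."""
    values = [column_conservation([seq[i] for seq in sequences]) for i in positions]
    return sum(values) / len(values)


# Toy alignment of four CaaX boxes (illustrative only).
toy = ["CVLS", "CAIM", "CVIQ", "CTLL"]
cysteine = region_conservation(toy, [0])   # fully conserved column
last = region_conservation(toy, [3])       # four different residues
print(cysteine, last)
```

				<p>Under these assumptions the measure ranges from 1 for an invariant column down to 1/20 for a column in which all 20 amino-acid types are equally frequent.</p>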
			</sec>
			<sec>
				<st>
					<p>The physical property term <it>S</it><sub>ppt</sub></p>
				</st>
				<p>The primary role of the physical property term is to exclude query sequences that do not fit into the determined physical property profile of the motif (see Figure <figr fid="F1">1</figr>). Hence, it does not add positive scores for compliance but only penalizes deviations from the physical property requirements of the motif. These requirements were identified as described above (listed in Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>, including the rationale for inclusion in <it>S</it><sub>ppt</sub>), and their biophysical meaning was discussed earlier.</p>
				<p>As an example, a penalty for a physical property <it>P </it>of the query being greater than the average physical property <it>P</it><sub><it>j </it></sub>over the nonredundant learning set (with <it>&#963;</it><sub><it>j </it></sub>being the corresponding standard deviation of a Gauss-like distribution) can be defined as follows:</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i6.gif"/>
				</p>
				<p>The natural logarithms of these single terms (taken so as to be comparable to the profile score) enter <it>S</it><sub>ppt </sub>as a sum, each with a term-specific weighting factor <it>&#945;</it><sub><it>j </it></sub>that reflects the varying importance of the single terms.</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i7.gif"/>
				</p>
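				<p>The general shape of such a one-sided penalty can be sketched in Python as follows. This is only one plausible Gaussian-shaped form, assumed here for illustration; the exact expressions are those given in the equations above and may differ in detail. The function names, the tuple layout and the example numbers are hypothetical.</p>

```python
def log_penalty(value, mean, sigma):
    """Natural logarithm of an assumed Gaussian-shaped penalty term.
    Returns 0.0 (no penalty) unless the query value exceeds the
    learning-set average; compliance is never rewarded."""
    if value > mean:
        return -((value - mean) ** 2) / (2.0 * sigma ** 2)
    return 0.0


def s_ppt(terms):
    """Weighted sum of log-penalties.
    Each term is a tuple (weight, query value, average, standard deviation)."""
    return sum(w * log_penalty(v, m, s) for w, v, m, s in terms)


# Two hypothetical terms: the first deviates by two standard deviations,
# the second lies below the average and is therefore not penalized.
example = [(1.0, 3.0, 2.0, 0.5), (2.0, 1.0, 2.0, 0.5)]
print(s_ppt(example))
```

				<p>Because every term is at most zero, a sum built in this way can only lower the composite score, which matches the stated role of the physical property term.</p>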
				<p>As part of <it>S</it><sub>ppt</sub>, the FT and GGT1 score functions also include a penalty for query proteins when the carboxy terminus appears inaccessible as a result of structural constraints. For example, helices with strongly hydrophobic sides are either buried in the structure or fold back to the protein core and reduce the flexibility and accessibility of the carboxy terminus. We recognize these if hydrophobic residues (LIVMFYW) appear in patterns like i,i+3,i+6 (hXXhXXh) or i,i+3,i+7 (hXXhXXXh) or i,i+4,i+7 (hXXXhXXh) within 20 residues preceding the cysteine of the CaaX box.</p>
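				<p>The pattern search described above can be sketched with regular expressions in Python. The sketch assumes that the query ends with the CaaX box, so that the cysteine is the fourth-last residue, and it only reports whether any of the three patterns occurs; how a hit is converted into a penalty is not shown. The function and variable names are illustrative.</p>

```python
import re

H = "[LIVMFYW]"  # hydrophobic residue class used in the text
PATTERNS = [
    re.compile(H + ".." + H + ".." + H),    # i, i+3, i+6  (hXXhXXh)
    re.compile(H + ".." + H + "..." + H),   # i, i+3, i+7  (hXXhXXXh)
    re.compile(H + "..." + H + ".." + H),   # i, i+4, i+7  (hXXXhXXh)
]


def has_hydrophobic_helix_pattern(sequence, window=20):
    """True if one of the patterns occurs within `window` residues
    preceding the CaaX cysteine (assumed to be the fourth-last residue)."""
    cys_index = len(sequence) - 4
    if not cys_index > 0:
        return False  # nothing precedes the cysteine
    upstream = sequence[max(0, cys_index - window):cys_index]
    return any(p.search(upstream) for p in PATTERNS)


print(has_hydrophobic_helix_pattern("AAAAALAALAALAAAACVLS"))
print(has_hydrophobic_helix_pattern("GGGGSGGGSGGGKKKKCVLS"))
```

				<p>The first toy sequence contains leucines at spacings of three residues upstream of the cysteine and is flagged; the second contains no hydrophobic residues in the window and is not.</p>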
			</sec>
			<sec>
				<st>
					<p>Cross-validation (jackknife) tests</p>
				</st>
				<p>We have performed three jackknife tests to validate the robustness of our predictors. First, cross-validation over the whole scoring function (<it>S</it><sub>profile </sub>+ <it>S</it><sub>ppt </sub>recalculated after removal of the entry to be validated from the learning set) resulted in sensitivities of 99.4% for FT and 99.8% for GGT1. This indicates that the learning sets are large enough for the method not to suffer from the removal of individual entries and, hence, that it should also be able to find valid motifs not yet included in the learning set. At the same time, we acknowledge that the similarity of sequences within homologous groups in the learning set influences the sensitivity estimated above. However, we emphasize that the values calculated for the profile matrix vary even when only one sequence from a group of homologs is left out, since the method used for profile extraction <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> takes such redundancy into account with both sequence- and position-specific weightings.</p>
				<p>To address the role of homologous sequences remaining in the learning set for jackknife tests, we have extended the cross-validation to the more stringent case where not only the sequence to be predicted is excluded from the learning procedure but also its close homologs (estimated by a threshold of 40% sequence identity over the 30 carboxy-terminal residues). For the GGT1 predictor, the sensitivity of 98.6% is very close to that of the first jackknife test. For the FT predictor, a sensitivity of 92.6% is obtained in this test, which is lower than in the other jackknife procedures. The decrease can be attributed almost exclusively to the large group of highly similar hepatitis delta antigen sequences that all share the uncommon motif CRPQ, in which both arginine at position +1 and proline at +2 are rather exceptional amino acids. Hence, the CRPQ sequences remain below the prediction threshold in the jackknife test if no example of this motif is present in the learning set. When the CRPQ group is left out of the cross-validation, the FT predictor is calculated to have a sensitivity of 97.9%.</p>
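				<p>The homolog-exclusion step of this stricter jackknife test can be sketched in Python as follows. The sketch assumes an ungapped comparison of the carboxy-terminal residues aligned at the carboxy terminus and treats sequences at or above the threshold as close homologs; how identity was actually computed in the original procedure is not specified here, and all names and sequences are hypothetical.</p>

```python
def cterm_identity(seq_a, seq_b, length=30):
    """Fraction of identical residues in an ungapped comparison of the
    last `length` residues, aligned at the carboxy terminus."""
    a, b = seq_a[-length:], seq_b[-length:]
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    a, b = a[-n:], b[-n:]
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def learning_set_without_homologs(query, learning_set, threshold=0.40):
    """Remove the query itself and every sequence whose carboxy-terminal
    identity to the query reaches the threshold."""
    return [
        seq for seq in learning_set
        if seq != query and threshold > cterm_identity(query, seq)
    ]


query = "A" * 26 + "CVLS"
learning = [query, "A" * 26 + "CVLL", "G" * 26 + "CAIM"]
reduced = learning_set_without_homologs(query, learning)
print(reduced)
```

				<p>In this toy example only the unrelated third sequence remains; the profile and physical property parameters would then be recalculated from the reduced set before scoring the query.</p>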
				<p>A third cross-validation test aims to elucidate whether the 39 parameters (13 terms with weightings, averages and variances, see Equations 7 and 8) introduced through the physical property terms in <it>S</it><sub>ppt </sub>are overfitting the learning data. To exclude bias through similar sequences in homologous groups, the test was executed only over the learning set after removing redundancy (see section on correlation analysis above). <it>S</it><sub>ppt </sub>alone is recalculated after removal of the entry to be validated from the parameter calculation procedure. The obtained sensitivities of 100% for FT and 99.8% for GGT1 indicate that the parameterizations of physical property terms in <it>S</it><sub>ppt </sub>are not overfitting the learning data.</p>
			</sec>
			<sec>
				<st>
					<p>Probability of false-positive prediction</p>
				</st>
				<p>To estimate probabilities of false-positive prediction, a set of sequences had to be defined whose carboxy-terminal amino-acid pattern should not be subject to selection for a valid CaaX prenylation motif. This is fulfilled for sequences in the NCBI nonredundant database lacking a cysteine at the fourth position from the carboxy terminus. Hence, any compliance with motif restrictions of such sequences apart from the fourth last position could only be attributed to random or at least prenylation-independent appearance. Following earlier experience with GPI- and myristoyl-anchor prediction <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>, a polynomial extreme-value distribution function has been fitted to the scores obtained for the described sequence set when ignoring the requirement of a CaaX box cysteine in the procedure. The probability of obtaining a score <it>S </it>greater than a threshold score <it>S</it><sub><it>th </it></sub>can be formulated as follows:</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i8.gif"/>
				</p>
				<p>
					<graphic file="gb-2005-6-6-r55-i9.gif"/>
				</p>
				<p>A polynomial of the sixth degree was used to improve the residual fit. Polynomials of degree higher than six gave no further relative improvement of the residual. The probabilities of false-positive prediction for scores at the prediction threshold are extrapolated to 6.3% for FT and 1.2% for GGT1 for sequences that contain a cysteine at the fourth-last position (canonical CaaX box). Given that a cysteine at this position is rare in databases (1.7%), the independent general probabilities of false-positive prediction by FT and GGT1 for all protein sequences are as low as 0.11% and 0.02%, respectively. This corresponds to specificities of 99.89% and 99.98%.</p>
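				<p>The arithmetic behind the general false-positive probabilities can be retraced with a few lines of Python. The sketch simply multiplies the conditional rate for sequences with a cysteine at the fourth-last position by the database frequency of such sequences, using the rounded percentages quoted above; the function name is illustrative.</p>

```python
def overall_false_positive_rate(rate_given_cys, cys_frequency):
    """General false-positive probability for arbitrary sequences:
    rate among sequences with a fourth-last cysteine times the
    frequency of such sequences in the database."""
    return rate_given_cys * cys_frequency


cys_frequency = 0.017                                        # 1.7%
ft = overall_false_positive_rate(0.063, cys_frequency)       # 6.3% for FT
ggt1 = overall_false_positive_rate(0.012, cys_frequency)     # 1.2% for GGT1

for name, rate in (("FT", ft), ("GGT1", ggt1)):
    print(f"{name}: false-positive rate {rate:.2%}, specificity {1 - rate:.2%}")
```

				<p>With these inputs the products are approximately 0.107% and 0.020%, which round to the values of 0.11% and 0.02% given in the text.</p>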
				<p>Since the subcellular context of a protein can be relevant to judging the likelihood of <it>in vivo </it>prenylation when a corresponding motif has been predicted, we also tried to estimate probabilities of false-positive prediction for differently localized subsets of proteins. These subsets were retrieved from the Swiss-Prot database <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and assigned according to the following keywords in the 'Subcellular Location' comment lines: 'cytoplasmic' (24,284), 'nuclear' (9,800), 'membrane' (24,823) and 'extracellular' (509). Parentheses indicate the number of unambiguously annotated examples per subset. Again, only proteins lacking a -CXXX motif were taken into account, to ensure prenylation-independent selection of carboxy-terminal amino-acid residues. Although there appear to be some subcellular localization-specific fluctuations of the probabilities of false-positive prediction (partly due to limited or differing subset sizes), the relative advantages among the methods evaluated seem to remain stable (see Table <tblr tid="T3">3</tblr>).</p>
			</sec>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We thank Boehringer Ingelheim for continuous support. This project has been partly funded by the Fonds zur F&#246;rderung der wissenschaftlichen Forschung &#214;sterreichs (FWF grant P15037), the Austrian National Bank (&#214;sterreichische Nationalbank) and by the bioinformatics contract study 2002-2004 for the Bundesministerium f&#252;r Wirtschaft und Arbeit.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Protein prenyltransferases.</p>
				</title>
				<aug>
					<au>
						<snm>Casey</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Seabra</snm>
						<fnm>MC</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1996</pubdate>
				<volume>271</volume>
				<fpage>5289</fpage>
				<lpage>5292</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.271.10.5289</pubid>
						<pubid idtype="pmpid" link="fulltext">8621375</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Protein prenyltransferases.</p>
				</title>
				<aug>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Washietl</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2003</pubdate>
				<volume>4</volume>
				<fpage>212</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">154572</pubid>
						<pubid idtype="pmpid" link="fulltext">12702202</pubid>
						<pubid idtype="doi">10.1186/gb-2003-4-4-212</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Protein prenylation: a pivotal posttranslational process.</p>
				</title>
				<aug>
					<au>
						<snm>Roskoski</snm>
						<fnm>R</fnm>
						<suf>Jr</suf>
					</au>
				</aug>
				<source>Biochem Biophys Res Commun</source>
				<pubdate>2003</pubdate>
				<volume>303</volume>
				<fpage>1</fpage>
				<lpage>7</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0006-291X(03)00323-1</pubid>
						<pubid idtype="pmpid" link="fulltext">12646157</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Structure of the Rab7:REP-1 complex: insights into the mechanism of Rab prenylation and choroideremia disease.</p>
				</title>
				<aug>
					<au>
						<snm>Rak</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Pylypenko</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Niculae</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Pyatkov</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Goody</snm>
						<fnm>RS</fnm>
					</au>
					<au>
						<snm>Alexandrov</snm>
						<fnm>K</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2004</pubdate>
				<volume>117</volume>
				<fpage>749</fpage>
				<lpage>760</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.cell.2004.05.017</pubid>
						<pubid idtype="pmpid" link="fulltext">15186776</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Substrate characterization of the Saccharomyces cerevisiae protein farnesyltransferase and type-I protein geranylgeranyltransferase.</p>
				</title>
				<aug>
					<au>
						<snm>Caplin</snm>
						<fnm>BE</fnm>
					</au>
					<au>
						<snm>Hettich</snm>
						<fnm>LA</fnm>
					</au>
					<au>
						<snm>Marshall</snm>
						<fnm>MS</fnm>
					</au>
				</aug>
				<source>Biochim Biophys Acta</source>
				<pubdate>1994</pubdate>
				<volume>1205</volume>
				<fpage>39</fpage>
				<lpage>48</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8142482</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Prenylation of Rab8 GTPase by type I and type II geranylgeranyl transferases.</p>
				</title>
				<aug>
					<au>
						<snm>Wilson</snm>
						<fnm>AL</fnm>
					</au>
					<au>
						<snm>Erdman</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Castellano</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Maltese</snm>
						<fnm>WA</fnm>
					</au>
				</aug>
				<source>Biochem J</source>
				<pubdate>1998</pubdate>
				<volume>333</volume>
				<fpage>497</fpage>
				<lpage>504</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9677305</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Prediction of post-translational modifications from amino acid sequence: problems, pitfalls, methodological hints.</p>
				</title>
				<aug>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Bioinformatics and Genomes: Current Perspectives. 5.1</source>
				<publisher>Wymondham, UK: Horizon Scientific Press</publisher>
				<editor>Andrade MM</editor>
				<pubdate>2003</pubdate>
				<fpage>81</fpage>
				<lpage>105</lpage>
			</bibl>
			<bibl id="B8">
				<title>
					<p>Prediction of potential GPI-modification sites in proprotein sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1999</pubdate>
				<volume>292</volume>
				<fpage>741</fpage>
				<lpage>758</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.1999.3069</pubid>
						<pubid idtype="pmpid" link="fulltext">10497036</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>N-terminal N-myristoylation of proteins: prediction of substrate proteins from amino acid sequence.</p>
				</title>
				<aug>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2002</pubdate>
				<volume>317</volume>
				<fpage>541</fpage>
				<lpage>557</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.2002.5426</pubid>
						<pubid idtype="pmpid" link="fulltext">11955008</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Prediction of peroxisomal targeting signal 1 containing proteins from amino acid sequence.</p>
				</title>
				<aug>
					<au>
						<snm>Neuberger</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Hartig</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2003</pubdate>
				<volume>328</volume>
				<fpage>581</fpage>
				<lpage>592</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0022-2836(03)00319-X</pubid>
						<pubid idtype="pmpid" link="fulltext">12706718</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.</p>
				</title>
				<aug>
					<au>
						<snm>Boeckmann</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bairoch</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Apweiler</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Blatter</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Estreicher</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Gasteiger</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Michoud</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>O'Donovan</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Phan</snm>
						<fnm>I</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>365</fpage>
				<lpage>370</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">165542</pubid>
						<pubid idtype="pmpid" link="fulltext">12520024</pubid>
						<pubid idtype="doi">10.1093/nar/gkg095</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>N-terminal N-myristoylation of proteins: refinement of the sequence motif and its taxon-specific differences.</p>
				</title>
				<aug>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2002</pubdate>
				<volume>317</volume>
				<fpage>523</fpage>
				<lpage>540</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.2002.5425</pubid>
						<pubid idtype="pmpid" link="fulltext">11955007</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Role of the carboxyterminal residue in peptide binding to protein farnesyltransferase and protein geranylgeranyltransferase.</p>
				</title>
				<aug>
					<au>
						<snm>Roskoski</snm>
						<fnm>R</fnm>
						<suf>Jr</suf>
					</au>
					<au>
						<snm>Ritchie</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Arch Biochem Biophys</source>
				<pubdate>1998</pubdate>
				<volume>356</volume>
				<fpage>167</fpage>
				<lpage>176</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/abbi.1998.0768</pubid>
						<pubid idtype="pmpid" link="fulltext">9705207</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>MYRbase: analysis of genome-wide glycine myristoylation enlarges the functional spectrum of eukaryotic myristoylated proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Gouda</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Novatchkova</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Schleiffer</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Schneider</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Sirota</snm>
						<fnm>FL</fnm>
					</au>
					<au>
						<snm>Wildpaner</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hayashi</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2004</pubdate>
				<volume>5</volume>
				<fpage>R21</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">395771</pubid>
						<pubid idtype="pmpid" link="fulltext">15003124</pubid>
						<pubid idtype="doi">10.1186/gb-2004-5-3-r21</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Investigation of S-farnesyl transferase substrate specificity with combinatorial tetrapeptide libraries.</p>
				</title>
				<aug>
					<au>
						<snm>Boutin</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Marande</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Petit</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Loynel</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Desmet</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Canet</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Fauchere</snm>
						<fnm>JL</fnm>
					</au>
				</aug>
				<source>Cell Signal</source>
				<pubdate>1999</pubdate>
				<volume>11</volume>
				<fpage>59</fpage>
				<lpage>69</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0898-6568(98)00032-1</pubid>
						<pubid idtype="pmpid" link="fulltext">10206346</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Prediction of sequence signals for lipid post-translational modifications: insights from case studies.</p>
				</title>
				<aug>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Neuberger</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<source>Proteomics</source>
				<pubdate>2004</pubdate>
				<volume>4</volume>
				<fpage>1614</fpage>
				<lpage>1625</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/pmic.200300781</pubid>
						<pubid idtype="pmpid" link="fulltext">15174131</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Valid CaaX tetrapeptides at the PrePS site</p>
				</title>
				<url>http://mendel.imp.univie.ac.at/sat/PrePS/tetraTable.html</url>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Madden</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Schaffer</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>3389</fpage>
				<lpage>3402</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">146917</pubid>
						<pubid idtype="pmpid" link="fulltext">9254694</pubid>
						<pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>An efficient algorithm for large-scale detection of protein families.</p>
				</title>
				<aug>
					<au>
						<snm>Enright</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Van Dongen</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ouzounis</snm>
						<fnm>CA</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2002</pubdate>
				<volume>30</volume>
				<fpage>1575</fpage>
				<lpage>1584</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">101833</pubid>
						<pubid idtype="pmpid" link="fulltext">11917018</pubid>
						<pubid idtype="doi">10.1093/nar/30.7.1575</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Tomii</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Kanehisa</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Protein Eng</source>
				<pubdate>1996</pubdate>
				<volume>9</volume>
				<fpage>27</fpage>
				<lpage>36</lpage>
				<xrefbib>
					<pubid idtype="pmpid">9053899</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Sequence properties of GPI-anchored proteins near the omega-site: constraints for the polypeptide binding site of the putative transamidase.</p>
				</title>
				<aug>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Protein Eng</source>
				<pubdate>1998</pubdate>
				<volume>11</volume>
				<fpage>1155</fpage>
				<lpage>1161</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/protein/11.12.1155</pubid>
						<pubid idtype="pmpid" link="fulltext">9930665</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>UniProt: the Universal Protein knowledgebase.</p>
				</title>
				<aug>
					<au>
						<snm>Apweiler</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Bairoch</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>CH</fnm>
					</au>
					<au>
						<snm>Barker</snm>
						<fnm>WC</fnm>
					</au>
					<au>
						<snm>Boeckmann</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Ferro</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Gasteiger</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Lopez</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Magrane</snm>
						<fnm>M</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<issue>32 Database</issue>
				<fpage>D115</fpage>
				<lpage>D119</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308865</pubid>
						<pubid idtype="pmpid" link="fulltext">14681372</pubid>
						<pubid idtype="doi">10.1093/nar/gkh131</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Crystallographic analysis of CaaX prenyltransferases complexed with substrates defines rules of protein substrate selectivity.</p>
				</title>
				<aug>
					<au>
						<snm>Reid</snm>
						<fnm>TS</fnm>
					</au>
					<au>
						<snm>Terry</snm>
						<fnm>KL</fnm>
					</au>
					<au>
						<snm>Casey</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Beese</snm>
						<fnm>LS</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2004</pubdate>
				<volume>343</volume>
				<fpage>417</fpage>
				<lpage>433</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.jmb.2004.08.056</pubid>
						<pubid idtype="pmpid" link="fulltext">15451670</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Protein prenyltransferases: anchor size, pseudogenes and parasites.</p>
				</title>
				<aug>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Washietl</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Biol Chem</source>
				<pubdate>2003</pubdate>
				<volume>384</volume>
				<fpage>977</fpage>
				<lpage>989</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1515/BC.2003.110</pubid>
						<pubid idtype="pmpid">12956414</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Structure of human guanylate-binding protein 1 representing a unique class of GTP-binding proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Prakash</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Praefcke</snm>
						<fnm>GJ</fnm>
					</au>
					<au>
						<snm>Renault</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Wittinghofer</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Herrmann</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2000</pubdate>
				<volume>403</volume>
				<fpage>567</fpage>
				<lpage>571</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35000617</pubid>
						<pubid idtype="pmpid" link="fulltext">10676968</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Motif refinement of the peroxisomal targeting signal 1 and evaluation of taxon-specific differences.</p>
				</title>
				<aug>
					<au>
						<snm>Neuberger</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Hartig</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2003</pubdate>
				<volume>328</volume>
				<fpage>567</fpage>
				<lpage>579</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0022-2836(03)00318-8</pubid>
						<pubid idtype="pmpid" link="fulltext">12706717</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Sequence dependence of protein isoprenylation.</p>
				</title>
				<aug>
					<au>
						<snm>Moores</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Schaber</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Mosser</snm>
						<fnm>SD</fnm>
					</au>
					<au>
						<snm>Rands</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>O'Hara</snm>
						<fnm>MB</fnm>
					</au>
					<au>
						<snm>Garsky</snm>
						<fnm>VM</fnm>
					</au>
					<au>
						<snm>Marshall</snm>
						<fnm>MS</fnm>
					</au>
					<au>
						<snm>Pompliano</snm>
						<fnm>DL</fnm>
					</au>
					<au>
						<snm>Gibbs</snm>
						<fnm>JB</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1991</pubdate>
				<volume>266</volume>
				<fpage>14603</fpage>
				<lpage>14610</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">1860864</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Chromatographic assay and peptide substrate characterization of partially purified farnesyl- and geranylgeranyltransferases from rat brain cytosol.</p>
				</title>
				<aug>
					<au>
						<snm>Boutin</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Marande</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Goussard</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Loynel</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Canet</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Fauchere</snm>
						<fnm>JL</fnm>
					</au>
				</aug>
				<source>Arch Biochem Biophys</source>
				<pubdate>1998</pubdate>
				<volume>354</volume>
				<fpage>83</fpage>
				<lpage>94</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/abbi.1998.0678</pubid>
						<pubid idtype="pmpid" link="fulltext">9633601</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Consequences of altered isoprenylation targets on a-factor export and bioactivity.</p>
				</title>
				<aug>
					<au>
						<snm>Caldwell</snm>
						<fnm>GA</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>SH</fnm>
					</au>
					<au>
						<snm>Naider</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Becker</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1994</pubdate>
				<volume>91</volume>
				<fpage>1275</fpage>
				<lpage>1279</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">43140</pubid>
						<pubid idtype="pmpid" link="fulltext">8108401</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Actin' up: RhoB in cancer and apoptosis.</p>
				</title>
				<aug>
					<au>
						<snm>Prendergast</snm>
						<fnm>GC</fnm>
					</au>
				</aug>
				<source>Nat Rev Cancer</source>
				<pubdate>2001</pubdate>
				<volume>1</volume>
				<fpage>162</fpage>
				<lpage>168</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35101096</pubid>
						<pubid idtype="pmpid" link="fulltext">11905808</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Farnesyltransferase inhibitors are inhibitors of Ras but not R-Ras2/TC21 transformation.</p>
				</title>
				<aug>
					<au>
						<snm>Carboni</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Yan</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Cox</snm>
						<fnm>AD</fnm>
					</au>
					<au>
						<snm>Bustelo</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Graham</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Lynch</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Weinmann</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Seizinger</snm>
						<fnm>BR</fnm>
					</au>
					<au>
						<snm>Der</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Barbacid</snm>
						<fnm>M</fnm>
					</au>
					<etal/>
				</aug>
				<source>Oncogene</source>
				<pubdate>1995</pubdate>
				<volume>10</volume>
				<fpage>1905</fpage>
				<lpage>1913</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7761092</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>Direct demonstration of geranylgeranylation and farnesylation of Ki-Ras in vivo.</p>
				</title>
				<aug>
					<au>
						<snm>Rowell</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Kowalczyk</snm>
						<fnm>JJ</fnm>
					</au>
					<au>
						<snm>Lewis</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Garcia</snm>
						<fnm>AM</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1997</pubdate>
				<volume>272</volume>
				<fpage>14093</fpage>
				<lpage>14097</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.272.22.14093</pubid>
						<pubid idtype="pmpid" link="fulltext">9162034</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Prenylation of Rab GTPases: molecular mechanisms and involvement in genetic disease.</p>
				</title>
				<aug>
					<au>
						<snm>Pereira-Leal</snm>
						<fnm>JB</fnm>
					</au>
					<au>
						<snm>Hume</snm>
						<fnm>AN</fnm>
					</au>
					<au>
						<snm>Seabra</snm>
						<fnm>MC</fnm>
					</au>
				</aug>
				<source>FEBS Lett</source>
				<pubdate>2001</pubdate>
				<volume>498</volume>
				<fpage>197</fpage>
				<lpage>200</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0014-5793(01)02483-8</pubid>
						<pubid idtype="pmpid" link="fulltext">11412856</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>The yeast DHHC cysteine-rich domain protein Akr1p is a palmitoyl transferase.</p>
				</title>
				<aug>
					<au>
						<snm>Roth</snm>
						<fnm>AF</fnm>
					</au>
					<au>
						<snm>Feng</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Davis</snm>
						<fnm>NG</fnm>
					</au>
				</aug>
				<source>J Cell Biol</source>
				<pubdate>2002</pubdate>
				<volume>159</volume>
				<fpage>23</fpage>
				<lpage>28</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1083/jcb.200206120</pubid>
						<pubid idtype="pmpid" link="fulltext">12370247</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>A prenylation motif is required for plasma membrane localization and biochemical function of casein kinase I in budding yeast.</p>
				</title>
				<aug>
					<au>
						<snm>Vancura</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Sessler</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Leichus</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Kuret</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1994</pubdate>
				<volume>269</volume>
				<fpage>19271</fpage>
				<lpage>19278</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8034689</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Evolution of the Rab family of small GTP-binding proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Pereira-Leal</snm>
						<fnm>JB</fnm>
					</au>
					<au>
						<snm>Seabra</snm>
						<fnm>MC</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2001</pubdate>
				<volume>313</volume>
				<fpage>889</fpage>
				<lpage>901</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.2001.5072</pubid>
						<pubid idtype="pmpid" link="fulltext">11697911</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>The mammalian Rab family of small GTPases: definition of family and subfamily sequence motifs suggests a mechanism for functional specificity in the Ras superfamily.</p>
				</title>
				<aug>
					<au>
						<snm>Pereira-Leal</snm>
						<fnm>JB</fnm>
					</au>
					<au>
						<snm>Seabra</snm>
						<fnm>MC</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>2000</pubdate>
				<volume>301</volume>
				<fpage>1077</fpage>
				<lpage>1087</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.2000.4010</pubid>
						<pubid idtype="pmpid" link="fulltext">10966806</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Profile hidden Markov models.</p>
				</title>
				<aug>
					<au>
						<snm>Eddy</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>1998</pubdate>
				<volume>14</volume>
				<fpage>755</fpage>
				<lpage>763</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/14.9.755</pubid>
						<pubid idtype="pmpid" link="fulltext">9918945</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>PrePS - Prenylation Prediction Suite</p>
				</title>
				<url>http://mendel.imp.univie.ac.at/sat/PrePS</url>
			</bibl>
			<bibl id="B40">
				<title>
					<p>The PROSITE database, its status in 2002.</p>
				</title>
				<aug>
					<au>
						<snm>Falquet</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Pagni</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Bucher</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Hulo</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Sigrist</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Hofmann</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Bairoch</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2002</pubdate>
				<volume>30</volume>
				<fpage>235</fpage>
				<lpage>238</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">99105</pubid>
						<pubid idtype="pmpid" link="fulltext">11752303</pubid>
						<pubid idtype="doi">10.1093/nar/30.1.235</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Better prediction of protein cellular localization sites with the k nearest neighbors classifier.</p>
				</title>
				<aug>
					<au>
						<snm>Horton</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Nakai</snm>
						<fnm>K</fnm>
					</au>
				</aug>
				<source>Proc Int Conf Intell Syst Mol Biol</source>
				<pubdate>1997</pubdate>
				<volume>5</volume>
				<fpage>147</fpage>
				<lpage>152</lpage>
				<xrefbib>
					<pubid idtype="pmpid">9322029</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Farnesyltransferase inhibitors as anticancer agents: critical crossroads.</p>
				</title>
				<aug>
					<au>
						<snm>Doll</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Kirschmeier</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Bishop</snm>
						<fnm>WR</fnm>
					</au>
				</aug>
				<source>Curr Opin Drug Discov Devel</source>
				<pubdate>2004</pubdate>
				<volume>7</volume>
				<fpage>478</fpage>
				<lpage>486</lpage>
				<xrefbib>
					<pubid idtype="pmpid">15338957</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Protein farnesyl and N-myristoyl transferases: piggy-back medicinal chemistry targets for the development of antitrypanosomatid and antimalarial therapeutics.</p>
				</title>
				<aug>
					<au>
						<snm>Gelb</snm>
						<fnm>MH</fnm>
					</au>
					<au>
						<snm>Van Voorhis</snm>
						<fnm>WC</fnm>
					</au>
					<au>
						<snm>Buckner</snm>
						<fnm>FS</fnm>
					</au>
					<au>
						<snm>Yokoyama</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Eastman</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Carpenter</snm>
						<fnm>EP</fnm>
					</au>
					<au>
						<snm>Panethymitaki</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Brown</snm>
						<fnm>KA</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>DF</fnm>
					</au>
				</aug>
				<source>Mol Biochem Parasitol</source>
				<pubdate>2003</pubdate>
				<volume>126</volume>
				<fpage>155</fpage>
				<lpage>163</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0166-6851(02)00282-7</pubid>
						<pubid idtype="pmpid" link="fulltext">12615314</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Farnesyltransferase and geranylgeranyltransferase I inhibitors and cancer therapy: lessons from mechanism and bench-to-bedside translational studies.</p>
				</title>
				<aug>
					<au>
						<snm>Sebti</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Hamilton</snm>
						<fnm>AD</fnm>
					</au>
				</aug>
				<source>Oncogene</source>
				<pubdate>2000</pubdate>
				<volume>19</volume>
				<fpage>6584</fpage>
				<lpage>6593</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/sj.onc.1204146</pubid>
						<pubid idtype="pmpid" link="fulltext">11426643</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<title>
					<p>Protein farnesylation in mammalian cells: effects of farnesyltransferase inhibitors on cancer cells.</p>
				</title>
				<aug>
					<au>
						<snm>Tamanoi</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Gau</snm>
						<fnm>CL</fnm>
					</au>
					<au>
						<snm>Jiang</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Edamatsu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Kato-Stankiewicz</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Cell Mol Life Sci</source>
				<pubdate>2001</pubdate>
				<volume>58</volume>
				<fpage>1636</fpage>
				<lpage>1649</lpage>
				<xrefbib>
					<pubid idtype="pmpid">11706990</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Opinion: Searching for the elusive targets of farnesyltransferase inhibitors.</p>
				</title>
				<aug>
					<au>
						<snm>Sebti</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Der</snm>
						<fnm>CJ</fnm>
					</au>
				</aug>
				<source>Nat Rev Cancer</source>
				<pubdate>2003</pubdate>
				<volume>3</volume>
				<fpage>945</fpage>
				<lpage>951</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nrc1234</pubid>
						<pubid idtype="pmpid" link="fulltext">14737124</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>Molecular characterization of hNRP, a cDNA encoding a human nucleosome-assembly-protein-I-related gene product involved in the induction of cell proliferation.</p>
				</title>
				<aug>
					<au>
						<snm>Simon</snm>
						<fnm>HU</fnm>
					</au>
					<au>
						<snm>Mills</snm>
						<fnm>GB</fnm>
					</au>
					<au>
						<snm>Kozlowski</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hogg</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Branch</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Ishimi</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Siminovitch</snm>
						<fnm>KA</fnm>
					</au>
				</aug>
				<source>Biochem J</source>
				<pubdate>1994</pubdate>
				<volume>297</volume>
				<fpage>389</fpage>
				<lpage>397</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8297347</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>Preferential binding of the histone (H3-H4)2 tetramer by NAP1 is mediated by the amino-terminal histone tails.</p>
				</title>
				<aug>
					<au>
						<snm>McBryant</snm>
						<fnm>SJ</fnm>
					</au>
					<au>
						<snm>Park</snm>
						<fnm>YJ</fnm>
					</au>
					<au>
						<snm>Abernathy</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Laybourn</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Nyborg</snm>
						<fnm>JK</fnm>
					</au>
					<au>
						<snm>Luger</snm>
						<fnm>K</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2003</pubdate>
				<volume>278</volume>
				<fpage>44574</fpage>
				<lpage>44583</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M305636200</pubid>
						<pubid idtype="pmpid" link="fulltext">12928440</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<title>
					<p>Involvement of nucleocytoplasmic shuttling of yeast Nap1 in mitotic progression.</p>
				</title>
				<aug>
					<au>
						<snm>Miyaji-Yamaguchi</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kato</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Nakano</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Akashi</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Kikuchi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Nagata</snm>
						<fnm>K</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>2003</pubdate>
				<volume>23</volume>
				<fpage>6672</fpage>
				<lpage>6684</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">193709</pubid>
						<pubid idtype="pmpid" link="fulltext">12944491</pubid>
						<pubid idtype="doi">10.1128/MCB.23.18.6672-6684.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B50">
				<title>
					<p>Pex19p, a farnesylated protein essential for peroxisome biogenesis.</p>
				</title>
				<aug>
					<au>
						<snm>Gotte</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Girzalsky</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Linkert</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Baumgart</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Kammerer</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Kunau</snm>
						<fnm>WH</fnm>
					</au>
					<au>
						<snm>Erdmann</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Mol Cell Biol</source>
				<pubdate>1998</pubdate>
				<volume>18</volume>
				<fpage>616</fpage>
				<lpage>628</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">121529</pubid>
						<pubid idtype="pmpid" link="fulltext">9418908</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B51">
				<title>
					<p>Phosphorylation of the protein kinase mutated in Peutz-Jeghers cancer syndrome, LKB1/STK11, at Ser431 by p90(RSK) and cAMP-dependent protein kinase, but not its farnesylation at Cys(433), is essential for LKB1 to suppress cell growth.</p>
				</title>
				<aug>
					<au>
						<snm>Sapkota</snm>
						<fnm>GP</fnm>
					</au>
					<au>
						<snm>Kieloch</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Lizcano</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Lain</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Arthur</snm>
						<fnm>JS</fnm>
					</au>
					<au>
						<snm>Williams</snm>
						<fnm>MR</fnm>
					</au>
					<au>
						<snm>Morrice</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Deak</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Alessi</snm>
						<fnm>DR</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2001</pubdate>
				<volume>276</volume>
				<fpage>19469</fpage>
				<lpage>19482</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M009953200</pubid>
						<pubid idtype="pmpid" link="fulltext">11297520</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B52">
				<title>
					<p>A tagging-via-substrate technology for detection and proteomics of farnesylated proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Kho</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>SC</fnm>
					</au>
					<au>
						<snm>Jiang</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Barma</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Kwon</snm>
						<fnm>SW</fnm>
					</au>
					<au>
						<snm>Cheng</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Jaunbergs</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Weinbaum</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Tamanoi</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Falck</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2004</pubdate>
				<volume>101</volume>
				<fpage>12479</fpage>
				<lpage>12484</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">515085</pubid>
						<pubid idtype="pmpid" link="fulltext">15308774</pubid>
						<pubid idtype="doi">10.1073/pnas.0403413101</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B53">
				<title>
					<p>Rab11a is modified in vivo by isoprenoid geranylgeranyl.</p>
				</title>
				<aug>
					<au>
						<snm>Gromov</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Celis</snm>
						<fnm>JE</fnm>
					</au>
				</aug>
				<source>Electrophoresis</source>
				<pubdate>1998</pubdate>
				<volume>19</volume>
				<fpage>1803</fpage>
				<lpage>1807</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/elps.1150191043</pubid>
						<pubid idtype="pmpid">9719562</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B54">
				<title>
					<p>Membrane targeting of Rab GTPases is influenced by the prenylation motif.</p>
				</title>
				<aug>
					<au>
						<snm>Gomes</snm>
						<fnm>AQ</fnm>
					</au>
					<au>
						<snm>Ali</snm>
						<fnm>BR</fnm>
					</au>
					<au>
						<snm>Ramalho</snm>
						<fnm>JS</fnm>
					</au>
					<au>
						<snm>Godfrey</snm>
						<fnm>RF</fnm>
					</au>
					<au>
						<snm>Barral</snm>
						<fnm>DC</fnm>
					</au>
					<au>
						<snm>Hume</snm>
						<fnm>AN</fnm>
					</au>
					<au>
						<snm>Seabra</snm>
						<fnm>MC</fnm>
					</au>
				</aug>
				<source>Mol Biol Cell</source>
				<pubdate>2003</pubdate>
				<volume>14</volume>
				<fpage>1882</fpage>
				<lpage>1899</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">165084</pubid>
						<pubid idtype="pmpid" link="fulltext">12802062</pubid>
						<pubid idtype="doi">10.1091/mbc.E02-10-0639</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B55">
				<title>
					<p>Isoprenylation of Rab proteins possessing a C-terminal CaaX motif.</p>
				</title>
				<aug>
					<au>
						<snm>Joberty</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Tavitian</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Zahraoui</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>FEBS Lett</source>
				<pubdate>1993</pubdate>
				<volume>330</volume>
				<fpage>323</fpage>
				<lpage>328</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0014-5793(93)80897-4</pubid>
						<pubid idtype="pmpid" link="fulltext">8375503</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B56">
				<title>
					<p>PSIC: profile extraction from sequence alignments with position-specific counts of independent observations.</p>
				</title>
				<aug>
					<au>
						<snm>Sunyaev</snm>
						<fnm>SR</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Rodchenkov</snm>
						<fnm>IV</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Tumanyan</snm>
						<fnm>VG</fnm>
					</au>
					<au>
						<snm>Kuznetsov</snm>
						<fnm>EN</fnm>
					</au>
				</aug>
				<source>Protein Eng</source>
				<pubdate>1999</pubdate>
				<volume>12</volume>
				<fpage>387</fpage>
				<lpage>394</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/protein/12.5.387</pubid>
						<pubid idtype="pmpid" link="fulltext">10360979</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B57">
				<title>
					<p>The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.</p>
				</title>
				<aug>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Plewniak</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Jeanmougin</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>4876</fpage>
				<lpage>4882</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">147148</pubid>
						<pubid idtype="pmpid" link="fulltext">9396791</pubid>
						<pubid idtype="doi">10.1093/nar/25.24.4876</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B58">
				<title>
					<p>AL2CO: calculation of positional conservation in a protein sequence alignment.</p>
				</title>
				<aug>
					<au>
						<snm>Pei</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Grishin</snm>
						<fnm>NV</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>700</fpage>
				<lpage>712</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/17.8.700</pubid>
						<pubid idtype="pmpid" link="fulltext">11524371</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B59">
				<title>
					<p>SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling.</p>
				</title>
				<aug>
					<au>
						<snm>Guex</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Peitsch</snm>
						<fnm>MC</fnm>
					</au>
				</aug>
				<source>Electrophoresis</source>
				<pubdate>1997</pubdate>
				<volume>18</volume>
				<fpage>2714</fpage>
				<lpage>2723</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/elps.1150181505</pubid>
						<pubid idtype="pmpid">9504803</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B60">
				<title>
					<p>Prediction of lipid posttranslational modifications and localization signals from protein sequences: big-Pi, NMT and PTS1.</p>
				</title>
				<aug>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Eisenhaber</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Kubina</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Maurer-Stroh</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Neuberger</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Schneider</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Wildpaner</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>3631</fpage>
				<lpage>3634</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">168944</pubid>
						<pubid idtype="pmpid" link="fulltext">12824382</pubid>
						<pubid idtype="doi">10.1093/nar/gkg537</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B61">
				<title>
					<p>Scoring residue conservation.</p>
				</title>
				<aug>
					<au>
						<snm>Valdar</snm>
						<fnm>WS</fnm>
					</au>
				</aug>
				<source>Proteins</source>
				<pubdate>2002</pubdate>
				<volume>48</volume>
				<fpage>227</fpage>
				<lpage>241</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/prot.10146</pubid>
						<pubid idtype="pmpid" link="fulltext">12112692</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B62">
				<title>
					<p>Structural prediction of membrane-bound proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Argos</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Rao</snm>
						<fnm>JK</fnm>
					</au>
					<au>
						<snm>Hargrave</snm>
						<fnm>PA</fnm>
					</au>
				</aug>
				<source>Eur J Biochem</source>
				<pubdate>1982</pubdate>
				<volume>128</volume>
				<fpage>565</fpage>
				<lpage>575</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7151796</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B63">
				<title>
					<p>The nature of the accessible and buried surfaces in proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Chothia</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1976</pubdate>
				<volume>105</volume>
				<fpage>1</fpage>
				<lpage>12</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0022-2836(76)90191-1</pubid>
						<pubid idtype="pmpid">994183</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B64">
				<title>
					<p>Prediction of protein secondary structure and active sites using the alignment of homologous sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Zvelebil</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Barton</snm>
						<fnm>GJ</fnm>
					</au>
					<au>
						<snm>Taylor</snm>
						<fnm>WR</fnm>
					</au>
					<au>
						<snm>Sternberg</snm>
						<fnm>MJ</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1987</pubdate>
				<volume>195</volume>
				<fpage>957</fpage>
				<lpage>961</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0022-2836(87)90501-8</pubid>
						<pubid idtype="pmpid">3656439</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B65">
				<title>
					<p>Influence of water on protein structure. An analysis of the preferences of amino acid residues for the inside or outside and for specific conformations in a protein molecule.</p>
				</title>
				<aug>
					<au>
						<snm>Wertz</snm>
						<fnm>DH</fnm>
					</au>
					<au>
						<snm>Scheraga</snm>
						<fnm>HA</fnm>
					</au>
				</aug>
				<source>Macromolecules</source>
				<pubdate>1978</pubdate>
				<volume>11</volume>
				<fpage>9</fpage>
				<lpage>15</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1021/ma60061a002</pubid>
						<pubid idtype="pmpid">621952</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B66">
				<title>
					<p>Antiparallel and parallel beta-strands differ in amino acid residue preferences.</p>
				</title>
				<aug>
					<au>
						<snm>Lifson</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Sander</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1979</pubdate>
				<volume>282</volume>
				<fpage>109</fpage>
				<lpage>111</lpage>
				<xrefbib>
					<pubid idtype="pmpid">503185</pubid>
				</xrefbi