Table 2

Properties of some human SET-domain proteins

| Family or protein | Chromosomal location | Gene size (kb) | Number of coding exons | Protein size (amino acids) | Domains common to the family in addition to the SET domain | Domains unique to particular members | GenBank accession number |
|---|---|---|---|---|---|---|---|
| SUV39 family | | | | | Pre-SET (9 Cys, 3 Zn), post-SET (CXCX4C) | | |
| SUV39H1 | Xp11.23 | 12.4 | 6 | 412 | | 4 Cys, chromo | 4507321 |
| SUV39H2 | 10p13 | 24 | 6 | 477 (Mm)* | | 4 Cys, chromo | 9956936 |
| G9a | 6p21.33 | 17.3 | 28 | 1,210 | | E/KR-rich, NRSF-binding, ankyrin repeats | 18375637 |
| GLP1 (EuHMT1) | 9q34.3 | 120 | 25 | 1,267 | | Same as G9a | 20372683 |
| ESET (SETDB1) | 1q21.2 | 37 | 21 | 1,291 | | Tudor, MBD | 505110 |
| CLLL8 (SETDB2) | 13q14.2 | 40 | 14 | 719 | | MBD | 13994282 |
| SET1 family | | | | | Post-SET (CXCX4C) | | |
| MLL1 (HRX, ALL1) | 11q23.3 | 86 | 36 | 3,969 | | AT hook, Bromo PHD, CXXC | 1170364 |
| HRX2 (MLL4) | 19q13.12 | 20 | 37 | 2,715 | | Same as above | 12643900 |
| ALR (MLL2) | 12q13.12 | 34 | 54 | 5,262 | | PHD, ring finger | 2358285 |
| MLL3 | 7q36.1 | 299 | 58 | 4,911 | | PHD, ring finger | 21427632 |
| SET1 (ASH2) | 16p11.2 | 26 | 18 | 1,707 | | RRM, poly-S/E/P | 6683126 |
| SET1L | 12q24.31 | 14 | 11 | 1,092 (Mm)* | | RRM, poly-S/E/P | 23468263 |
| SET2 family | | | | | Pre-SET (7-9 Cys); post-SET (CXCX4C) | | |
| WHSC1 (NSD2) | 4q16.3 | 79 | 21 | 1,365 | | PWWP, PHD, HMG, ring finger | 6683809 |
| WHSCL1 (NSD3) | 8p12 | 73 | 23 | 1,437 | | PWWP, PHD, ring finger | 13699811 |
| NSD1 | 5q35.3 | 160 | 23 | 2,696 | | PWWP, PHD, ring finger | 19923586 |
| HIF1 (HYPB) | 3p21.31 | 106 | 19 | 2,061 | | WW | 12697196 |
| ASH1 | 1q22 | 184 | 27 | 2,969 | | AT hook, bromo, BAH, PHD | 7739725 |
| RIZ family | | | | | | | |
| RIZ (PRDM2) | 1p36.21 | 86 | 9 | 1,719 | | C2H2 zinc finger | 9955379 |
| BLIMP1 (PRDM1) | 6q21 | 19 | 6 | 789 | | C2H2 zinc finger | 3493158 |
| SMYD family | | | | | Post-SET (CXCX2C) | | |
| SMYD3 | 1q44 | 758 | 12 | 428 | | Zf-MYND | 30913569 |
| SMYD1 | 2p11.2 | 43 | 9 | 490 | | Zf-MYND | 38093643 |
| EZ family | | | | | Pre-SET (~15 Cys) | | |
| EZH1 | 17q21.2 | 26 | 19 | 747 | | 2 SANT | 3334182 |
| EZH2 | 7q36.1 | 40 | 19 | 746 | | 2 SANT | 3334180 |
| SUV4-20 family | | | | | Post-SET (CXCX2C) | | |
| SUV4-20H1 | 11q13.2 | 57 | 9 | 876 | | | 50659081 |
| SUV4-20H2 | 19q13.42 | 8 | 8 | 462 | | | 31543168 |
| Others | | | | | | | |
| SET7/9 | 4q31.1 | 45 | 8 | 366 | | MORN | 25091213 |
| SET8 (PR-SET7) | 12q24.31 | 26 | 8 | 393 | | | 25091219 |


The seven families of SET-domain proteins are classified according to the sequences surrounding their SET domain. *Complete human SUV39H2 and SET1L cDNAs are not available in current databases, but partial cDNA and genomic sequences corresponding to the mouse sequences (Mm) are present. For the pre-SET and post-SET domains, the number and (if known) the arrangement of cysteines in the domain are given. Domain abbreviations and definitions: ankyrin repeats, tandemly repeated modules of about 33 amino acids; AT hook, DNA-binding motif with a preference for A/T-rich regions; BAH, bromo-adjacent homology domain; bromo, bromodomain, which can interact specifically with acetylated lysines; chromo, chromatin organization modifier domain; CXXC, domain with two cysteines separated by two amino acids; E/KR-rich, glutamate- or lysine/arginine-rich domains; HMG, high mobility group domain; MBD, methyl-CpG-binding domain; MORN, membrane occupation and recognition nexus repeat; NRSF-binding, binds neuron-restrictive silencing factor/repressor element 1 silencing transcription factor; PHD, folds into an interleaved type of Zn finger chelating two Zn ions; poly-S/E/P, runs of serine, glutamate or proline; PWWP, domain including a conserved Pro-Trp-Trp-Pro motif; RRM, RNA recognition motif; SANT, DNA-binding domain that specifically recognizes the sequence YAAC(G/T)G; Tudor, domain of unknown function present in several RNA-binding proteins; WW, contains two highly conserved tryptophans and binds proline-rich peptide motifs; Zf-MYND, 'myeloid, Nervy, DEAF-1' domain consisting of a cluster of cysteine and histidine residues.

Dillon et al. Genome Biology 2005 6:227   doi:10.1186/gb-2005-6-8-227