Christine Vogel1*, Carlo Berzuini2,3, Matthew Bashton1, Julian Gough4 and Sarah A. Teichmann1*
1MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
2MRC Biostatistics Unit, Institute of Public Health, Cambridge CB2 2SR, UK
3Dipartimento di Informatica e Sistemistica, University of Pavia, 27100 Pavia, Italy
4Genome Exploration Research Group, RIKEN Genomic Sciences Centre, W121 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan & Department of Structural Biology, Fairchild bldg, D109, Stanford, CA 94305-5126, U.S.A.
*Corresponding authors: cvogel {at} mrc-lmb.cam.ac.uk, sat {at} mrc-lmb.cam.ac.uk
| | | Number | Sequences % | Domain architectures % | Domain combinations % |
|---|---|---|---|---|---|
| Two-domain combinations | | | | | |
| All | | 9,398 | 44 | 72 | |
| | in PDB | 616 | 30 | 31 | 63 |
| | not in PDB | 8,782 | 18 | 55 | 37 |
| Supra | | 2,368 | 40 | 54 | |
| | in PDB | 491 | 29 | 31 | 62 |
| | not in PDB | 1,877 | | | 28 |
| Over-represented | | 1,203 | 38 | 47 | 84 |
| | in PDB | 456 | 29 | 31 | 61 |
| | not in PDB | 747 | 11 | 25 | 23 |
| Top 200 most duplicated | | 200 | 28 | 27 | |
| | in PDB | 161 | | | 48 |
| | not in PDB | 39 | | | 8 |
| Top 200 most versatile | | 200 | 12 | 30 | |
| | in PDB | 99 | | | 31 |
| | not in PDB | 101 | | | 8 |
| Three-domain combinations | | | | | |
| All | | 4,323 | 12 | 30 | |
| | in PDB | 217 | 7 | 10 | |
| | not in PDB | 4,106 | 7 | 24 | |
| Supra | | 935 | 10 | 22 | |
| | in PDB | 150 | 6 | 10 | |
| | not in PDB | 785 | 5 | 15 | |
| Over-represented | | 166 | 3 | 9 | |
| | in PDB | 37 | 2 | 5 | |
| | not in PDB | 129 | 1 | 6 | |
| Top 200 most duplicated | | 200 | | | |
| | in PDB | 107 | 6 | 10 | |
| | not in PDB | 93 | 3 | 7 | |
| Top 200 most versatile | | 200 | | | |
| | in PDB | 45 | 4 | 8 | |
| | not in PDB | 155 | 2 | 8 | |