Supra-domains - evolutionary units larger than single protein domains

Christine Vogel1,*, Carlo Berzuini2, Matthew Bashton1, Julian Gough3, Sarah A. Teichmann1

1MRC Laboratory of Molecular Biology, Cambridge, CB2 2QH, UK; 2MRC Biostatistics Unit, Cambridge, CB2 2QH, UK; 3Genome Exploration Research Group, RIKEN Genomic Sciences Centre, W121 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan, and Department of Structural Biology, Fairchild bldg, D109, Stanford, CA 94305-5126, USA.

* To whom correspondence should be addressed: cvogel[at]


Domains are the evolutionary units of which proteins are composed. Most proteins consist of several domains, and we examined the extent to which domain combinations are conserved across different multi-domain proteins in the three kingdoms of life. We call domain combinations that occur with different partner domains supra-domains; an example is shown here.
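The definition above can be illustrated with a minimal sketch. It is not the procedure used in this work: it assumes each protein is given as an ordered list of domain identifiers (N- to C-terminus), and it assumes that "different partner domains" means at least two distinct flanking neighbours, counted separately on the N- and C-terminal side. The function name `find_supra_domains`, the `min_partners` threshold and the single-letter domain names are all hypothetical.

```python
from collections import defaultdict


def find_supra_domains(architectures, size=2, min_partners=2):
    """Find adjacent domain combinations that occur with different partners.

    architectures: dict mapping a protein name to its ordered list of
    domain identifiers (N- to C-terminus). Hypothetical input format.
    size: number of adjacent domains in a combination (2 or 3).
    min_partners: assumed threshold for "different partner domains".
    """
    partners = defaultdict(set)    # combination -> {("N" or "C", domain)}
    occurrence = defaultdict(int)  # combination -> number of proteins

    for domains in architectures.values():
        seen_in_protein = set()
        for i in range(len(domains) - size + 1):
            combo = tuple(domains[i:i + size])
            seen_in_protein.add(combo)
            if i > 0:  # neighbour on the N-terminal side
                partners[combo].add(("N", domains[i - 1]))
            if i + size < len(domains):  # neighbour on the C-terminal side
                partners[combo].add(("C", domains[i + size]))
        for combo in seen_in_protein:
            occurrence[combo] += 1

    return {
        combo: {"partners": flank, "occurrence": occurrence[combo]}
        for combo, flank in partners.items()
        if len(flank) >= min_partners
    }


# Toy architectures with made-up domain names.
proteins = {
    "p1": ["A", "B", "C"],
    "p2": ["D", "A", "B"],
    "p3": ["A", "B"],
    "p4": ["B", "C"],
}
supra = find_supra_domains(proteins)
for combo, info in supra.items():
    print(combo, len(info["partners"]), info["occurrence"])
```

In this toy input only the combination A-B is reported: it is flanked by D on the N-terminal side in one protein and by C on the C-terminal side in another. The number of distinct partners is a rough stand-in for versatility, and the protein count for occurrence.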

We characterised these supra-domains in 131 genomes with respect to

Background information

... on the genomes, domains and protein families used in this work is provided here.

Supplementary Figures and Tables

Data and Results

The 5-digit identifiers, occurrence and versatility of domain combinations (supra-domains and over-represented supra-domains) are provided, as well as information on whether they have

Two-domain combinations | Three-domain combinations |
all 2368 supra-domains | all 935 supra-domains | (Key)
- top 200 most versatile supra-domains | - top 200 most versatile supra-domains | ... with many N- and C-terminal partners. (Key)
- top 200 most duplicated supra-domains | - top 200 most duplicated supra-domains | ... with many sequences. (Key)
1203 over-represented | 166 over-represented | ... significant P-values (duplets: E: < 0.0081, B: < 0.008, A: < 0.006; triplets: A: none, B, E: < 0.07) (Key)
raw results | raw results | ... for supra-domains with R2 and P-values. (Key)
230 supra-domains over-represented in all three kingdoms of life | | ... significant P-values (duplets: E: < 0.0081, B: < 0.008, A: < 0.006; triplets: A: none, B, E: < 0.07) (Key)

Discontinuous domains as detected so far.

