The Protein Origins of Biological Complexity

The aim of our research is to make a significant contribution to understanding two things: first, the nature of the changes in protein repertoires that have produced major transitions in biology, namely those between kingdoms, between phyla, and even between species whose complexity differs greatly; and second, the molecular mechanisms that, during the course of evolution, have produced these repertoire changes.

This area of research is of interest because the structures and processes that form an organism's anatomy and physiology are largely determined by the proteins encoded in its genes and the regulation of the expression of these genes.

Thus, understanding organisms themselves, and why they differ, requires determining the nature of the protein repertoires implicit in their genomes and discovering how differences in repertoires are related to the different properties of the organisms.

Current research is being carried out in two areas:

Structural Genomics

Before the genome projects established the number of genes that actually occur in different organisms, it was commonly assumed that the number would increase with the complexity of the organism. The discovery that this is not the case was a surprise: Caenorhabditis elegans has 20,000 genes, rice has 51,000 and humans have 23,000. In subsequent work, Christine Vogel and I showed that a small subset of protein families have frequencies that do correlate with an organism’s complexity and that these families play the major role in its formation.

The aims of our current projects are to understand the nature of these family expansions, to determine when they occurred and, if possible, to determine how they have increased complexity.


Principles that Govern the Hierarchical Structure of Proteins

Proteins have a hierarchical structure. The fundamental structural and evolutionary units are the domains of which they are formed. Small proteins are formed by a single domain. Combinations of domains, and/or their tandem duplications, can form protein chains that have between two and 200 domains. Protein chains, which may have one or more domains, can associate to form oligomeric proteins whose molecular weights can be more than 1,000-fold greater than those of one-domain proteins.
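The three levels described above (domains, chains built from domains, and oligomers built from chains) can be pictured as a nested data structure. The following is only a minimal sketch in Python: the class names `Domain`, `Chain` and `Oligomer`, the `family` label and the example families "A" and "B" are hypothetical names chosen for illustration and do not correspond to any real protein or database.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Domain:
    """Fundamental structural and evolutionary unit (hypothetical model)."""
    family: str  # illustrative family label, not a real classification


@dataclass
class Chain:
    """A protein chain: an ordered series of one or more domains."""
    domains: List[Domain]

    def __post_init__(self) -> None:
        if not self.domains:
            raise ValueError("a chain needs at least one domain")

    @property
    def is_single_domain(self) -> bool:
        return len(self.domains) == 1

    def has_tandem_duplication(self) -> bool:
        # True if two adjacent domains belong to the same family.
        return any(
            a.family == b.family
            for a, b in zip(self.domains, self.domains[1:])
        )


@dataclass
class Oligomer:
    """An oligomeric protein: an association of one or more chains."""
    chains: List[Chain]

    def domain_count(self) -> int:
        return sum(len(chain.domains) for chain in self.chains)


# Illustrative values only.
small = Chain([Domain("A")])                            # one-domain protein
multi = Chain([Domain("A"), Domain("B"), Domain("B")])  # combination plus tandem repeat
assembly = Oligomer([small, multi, multi])              # three associated chains
```

In this sketch, `small.is_single_domain` is true, `multi.has_tandem_duplication()` is true because two adjacent domains share family "B", and `assembly.domain_count()` sums the domains over all chains in the assembly.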

Future research will aim to understand the physical and chemical principles that govern this hierarchy of structural forms.
