Mapping Protein Family Interactions:
Intra- and Intermolecular Interaction Repertoires are Distinct
 
 

Jong Park1, Michael Lappe1 and Sarah A. Teichmann2

1European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
2Department of Biochemistry & Molecular Biology, University College London, Darwin Bldg., Gower St., London WC1E 6BT, UK.







In Figure 1, the structural families are labelled with the identification numbers used in version 1.48 of the Structural Classification of Proteins (SCOP) database.
The family names corresponding to these identification numbers can be downloaded here.

The protein domain interactions we determined are used in the PartsList Database at Yale.

1. Domain interactions in PDB

According to our three-dimensional criteria, the following SCOP domains were deemed to interact:
between chains
within chains
between chains - identical chains (oligomers)
between chains - nonidentical chains belonging to the same superfamily

Some domains are adjacent but do not interact in three dimensions.

File format: sequence id1| superfamily/ies| sequence id2| superfamily/ies
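As an illustration of the file format above, the following is a minimal Python sketch of how one record might be read. It assumes that the four fields are separated by "|" and that, where a field lists several superfamilies, they are separated by whitespace; the second point is an assumption, not something stated here. The names MchRecord and parse_mch_line and the example identifiers are hypothetical.

```python
from typing import List, NamedTuple


class MchRecord(NamedTuple):
    """One interaction record: two sequence ids, each with its superfamilies."""
    seq_id1: str
    superfamilies1: List[str]
    seq_id2: str
    superfamilies2: List[str]


def parse_mch_line(line: str) -> MchRecord:
    """Split one 'id1| sfs| id2| sfs' record into its four fields."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(fields)}")
    seq_id1, sfs1, seq_id2, sfs2 = fields
    # Assumption: several superfamilies in one field are whitespace-separated.
    return MchRecord(seq_id1, sfs1.split(), seq_id2, sfs2.split())


# Hypothetical record; identifiers and superfamily numbers are illustrative only.
record = parse_mch_line("d1aaaa_|    1.1.1|    d1aaab_| 2.1.1 3.1.1")
```

If the actual files use a different separator inside the superfamily fields, only the two split() calls on those fields would need to change.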

2. Family interactions in PDB

The files in section 1 above can be reduced to superfamily-level interactions by keeping only the superfamily fields.
The file format is: superfamily1| all the superfamilies it is seen to interact with.

Superfamilies that interact between chains.
Superfamilies that interact within chains.
Superfamilies that are adjacent but do not interact.
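The reduction from domain-level records to family-level records could be sketched as follows. This is a hedged Python illustration, not the procedure actually used to generate the files: it assumes "|"-separated fields, whitespace-separated superfamilies within a field, and that every superfamily on one side of a record is paired with every superfamily on the other side. The function names and the example records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def superfamily_map(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect, for each superfamily, the superfamilies it is seen with."""
    partners: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _, left, _, right = fields
        # Assumption: pair every superfamily on the left with every one on the right.
        for sf_a in left.split():
            for sf_b in right.split():
                partners[sf_a].add(sf_b)
                partners[sf_b].add(sf_a)  # interactions are treated as symmetric
    return dict(partners)


def format_family_line(superfamily: str, partners: Set[str]) -> str:
    """Render one 'superfamily1| partner partner ...' line."""
    return f"{superfamily}| {' '.join(sorted(partners))}"


# Hypothetical domain-level records, for illustration only.
example = [
    "d1aaaa_| 1.1.1| d1aaab_| 2.1.1",
    "d1bbba_| 1.1.1| d1bbbb_| 3.1.1",
    "d1ccca_| 2.1.1| d1cccb_| 2.1.1",
]
family_map = superfamily_map(example)
lines_out = [format_family_line(sf, family_map[sf]) for sf in sorted(family_map)]
```

A superfamily that interacts with itself, as in the third example record, appears in its own partner list.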
 

3. Genes that interact in Yeast

a) Interaction files:

From MIPS in February 2000:
genetic_interact.txt
physical_interact.txt

From Ito et al., 2000: http://www.pnas.org/cgi/content/full/97/3/1143/DC1

A plain list of interacting protein pairs as published in Uetz et al., Nature 403: 623-627 (2000).

Our final nonredundant interaction file with consistent gene names.
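One way such a nonredundant file could be assembled from several sources is sketched below in Python. This is an assumption-laden illustration rather than the procedure we followed: it assumes each source is a list of name pairs, that an alias table mapping alternative gene names to one consistent name is available, and that an interaction A-B is the same as B-A. The function name, the alias table, and all gene names shown are hypothetical placeholders.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Pair = Tuple[str, str]


def merge_interactions(
    sources: Iterable[Iterable[Pair]], aliases: Dict[str, str]
) -> Set[FrozenSet[str]]:
    """Merge interaction lists into one set of unordered, consistently named pairs."""
    merged: Set[FrozenSet[str]] = set()
    for pairs in sources:
        for name_a, name_b in pairs:
            # Normalise case, then map any alias onto its consistent name.
            key_a, key_b = name_a.upper(), name_b.upper()
            gene_a = aliases.get(key_a, key_a)
            gene_b = aliases.get(key_b, key_b)
            # A frozenset ignores order; a self-interaction gives a one-element set.
            merged.add(frozenset((gene_a, gene_b)))
    return merged


# Hypothetical placeholder names, for illustration only.
aliases = {"NAME_A": "ORF_A", "NAME_B": "ORF_B"}
source_one = [("NAME_A", "NAME_B")]
source_two = [("ORF_B", "ORF_A"), ("orf_a", "ORF_C")]
merged = merge_interactions([source_one, source_two], aliases)
```

In this example the first pair of each source collapses to a single entry, so two distinct interactions remain.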

b) Structural assignments to yeast proteins:

The assignments are given in the following format:
(sequence id)-(sequence length) (sequence region) (scop sequence name)-(sequence length) (sequence region) (expectation value with which found)
The file is pdbisl_merge_assign_wo_error
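For readers who want to process the assignment file, the following Python sketch shows one possible way to read a line in the format described above. It assumes the five items are separated by whitespace, that each region is written as start-end, and that the expectation value is a number Python can parse; none of these details is confirmed here. Because yeast identifiers can themselves contain a hyphen, the length is split off at the last hyphen. All names and the example line are hypothetical.

```python
from typing import NamedTuple, Tuple


class Assignment(NamedTuple):
    seq_id: str
    seq_length: int
    seq_region: Tuple[int, int]
    scop_name: str
    scop_length: int
    scop_region: Tuple[int, int]
    e_value: float


def _split_name_length(token: str) -> Tuple[str, int]:
    """Split 'name-length' at the LAST hyphen, since names may contain hyphens."""
    name, length = token.rsplit("-", 1)
    return name, int(length)


def _parse_region(token: str) -> Tuple[int, int]:
    """Assumption: a region is written as 'start-end'."""
    start, end = token.split("-")
    return int(start), int(end)


def parse_assignment(line: str) -> Assignment:
    """Parse one whitespace-separated assignment line into typed fields."""
    query, query_region, target, target_region, e_value = line.split()
    seq_id, seq_length = _split_name_length(query)
    scop_name, scop_length = _split_name_length(target)
    return Assignment(
        seq_id,
        seq_length,
        _parse_region(query_region),
        scop_name,
        scop_length,
        _parse_region(target_region),
        float(e_value),
    )


# Hypothetical line; the identifiers and numbers are illustrative only.
hit = parse_assignment("YXX001W-A-440 45-210 d1aaaa_-170 3-168 1e-25")
```

A line with a different number of items raises a ValueError at the unpacking step, which makes format mismatches visible early.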
 

c) Mch-format files for interactions between genes and their superfamily assignments

Format: sequence id1| superfamily/ies| sequence id2| superfamily/ies

Intramolecular pairs: within_orf.mch

Intermolecular pairs:
Pairs of sequences completely matched by single domains
Pairs of sequences completely matched where one sequence has more than one domain
Pairs of sequences where one is completely matched and one has no match - single domain proteins or multidomain proteins

4. Family interactions in Yeast

See the format description in section 2 above.

Intramolecular: within_orf.sfvers

Intermolecular: one_domain_complete_pairwise_100.sfvers