Jong Park1, Michael Lappe1 and Sarah A. Teichmann2
1European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
2Department of Biochemistry & Molecular Biology, University College London, Darwin Bldg., Gower St., London WC1E 6BT, UK.
In Figure 1, the structural families have the identification numbers used in the Structural Classification of Proteins (SCOP) database, version 1.48.
The family names that belong with these identification numbers can be downloaded here.
The protein domain interactions that we determined are used as part of the PartsList Database at Yale.
1. Domain interactions in PDB
According to our three-dimensional criteria, the following SCOP domains were deemed to interact:
between chains
within chains
between chains - identical chains (oligomers)
between chains - nonidentical chains belonging to the same superfamily
Some domains are adjacent but do not interact in three dimensions.
File format: sequence id1| superfamily/ies| sequence id2| superfamily/ies
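As a minimal sketch of reading this format, the Python below splits one line into its four pipe-separated fields. It assumes that several superfamilies in one field are separated by whitespace and that a line may carry one trailing "|"; neither detail is stated above, and the identifiers in the usage line are made up for illustration.

```python
from typing import List, NamedTuple


class DomainPair(NamedTuple):
    seq_id1: str
    superfamilies1: List[str]
    seq_id2: str
    superfamilies2: List[str]


def parse_pair_line(line: str) -> DomainPair:
    """Parse 'sequence id1| superfamily/ies| sequence id2| superfamily/ies'."""
    fields = [f.strip() for f in line.rstrip("\n").split("|")]
    # Assumption: tolerate a single trailing '|' at the end of the line.
    if len(fields) == 5 and fields[-1] == "":
        fields.pop()
    if len(fields) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(fields)}")
    id1, sf1, id2, sf2 = fields
    # Assumption: multiple superfamilies are whitespace-separated.
    return DomainPair(id1, sf1.split(), id2, sf2.split())


# Hypothetical example line, not taken from the real files.
pair = parse_pair_line("d1abca_| 1.1.1| d1abcb_| 3.2.1 3.4.5\n")
```

Splitting on whitespace also handles an empty superfamily field, which then becomes an empty list.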
2. Family interactions in PDB
The files in 1. above can be parsed for superfamilies only.
The file format is: superfamily1| all the superfamilies it is seen to interact with.
Superfamilies that interact between chains.
Superfamilies that interact within chains.
Superfamilies that are adjacent but do not interact.
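The reduction from domain pairs to superfamily partners could be sketched as follows. This is not the script used to produce the files; it assumes whitespace-separated superfamilies within a field and, for simplicity, treats every superfamily on one side of a pair as interacting with every superfamily on the other side. The function names and the sample lines are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def superfamily_partners(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect, for each superfamily, the superfamilies it is paired with."""
    partners: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Fields 1 and 3 hold the superfamilies of the two sequences.
        for a in fields[1].split():
            for b in fields[3].split():
                partners[a].add(b)
                partners[b].add(a)  # interactions are symmetric
    return dict(partners)


def format_partners(partners: Dict[str, Set[str]]) -> List[str]:
    """Render as 'superfamily1| partner partner ...' lines, sorted."""
    return [f"{sf}| {' '.join(sorted(partners[sf]))}" for sf in sorted(partners)]


# Hypothetical input lines in the pair format of section 1.
sample = ["a| 1.1| b| 2.2", "c| 1.1| d| 3.3"]
sf_lines = format_partners(superfamily_partners(sample))
```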
3. Genes that interact in Yeast
a) Interaction files:
From MIPS in February 2000:
genetic_interact.txt
physical_interact.txt
From Ito et al., 2000: http://www.pnas.org/cgi/content/full/97/3/1143/DC1
A plain list of interacting protein pairs as published in Uetz et al., Nature 403: 623-627 (2000).
Our final nonredundant interaction file with consistent gene names.
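One way to build such a nonredundant list is sketched below; the actual procedure is not described here, so this is an assumption. Each name is mapped to a single systematic name through an alias table, each pair is stored in sorted order so that A-B and B-A count once, and the union over all sources is taken. The alias table and all gene names are hypothetical placeholders.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def merge_interactions(
    sources: Iterable[Iterable[Pair]], aliases: Dict[str, str]
) -> List[Pair]:
    """Union several interaction lists into unique, order-independent pairs."""
    seen = set()
    for pairs in sources:
        for a, b in pairs:
            # Normalise case, then map synonyms to one consistent name.
            a = aliases.get(a.upper(), a.upper())
            b = aliases.get(b.upper(), b.upper())
            # Sorting makes (A, B) and (B, A) the same key.
            seen.add(tuple(sorted((a, b))))
    return sorted(seen)


# Hypothetical inputs: two sources naming the same pair differently.
aliases = {"GENE1": "ORF0001", "GENE2": "ORF0002"}
source_a = [("gene1", "GENE2")]
source_b = [("ORF0002", "ORF0001"), ("ORF0003", "ORF0003")]
merged = merge_interactions([source_a, source_b], aliases)
```

Self-pairs such as ORF0003 with ORF0003 are kept in this sketch, since they may represent homo-oligomers.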
b) Structural assignments to yeast proteins:
Each line has the following format:
(sequence id)-(sequence length) (sequence region) (scop sequence name)-(sequence length) (sequence region) (expectation value with which found)
The file is pdbisl_merge_assign_wo_error
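A line in this format might be parsed as in the sketch below. It assumes the five fields are whitespace-separated and that the length is attached to the identifier after the last hyphen; the regions are kept as plain strings because their exact notation is not specified above. The example line is invented for illustration.

```python
from typing import NamedTuple


class Assignment(NamedTuple):
    seq_id: str
    seq_length: int
    seq_region: str
    scop_name: str
    scop_length: int
    scop_region: str
    evalue: float


def parse_assignment(line: str) -> Assignment:
    """Parse one structural-assignment line into typed fields."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 whitespace-separated fields, got {len(parts)}")
    seq, seq_region, scop, scop_region, evalue = parts
    # Split on the LAST hyphen so identifiers that contain hyphens survive.
    seq_id, seq_len = seq.rsplit("-", 1)
    scop_name, scop_len = scop.rsplit("-", 1)
    return Assignment(
        seq_id, int(seq_len), seq_region,
        scop_name, int(scop_len), scop_region,
        float(evalue),
    )


# Hypothetical line; real entries may differ in detail.
hit = parse_assignment("YDR034C-A-120 5-110 d1abca_-106 1-106 1e-20")
```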
c) Mch-format files for interactions between genes and their superfamily assignments
Format: sequence id1| superfamily/ies| sequence id2| superfamily/ies
Intramolecular pairs: within_orf.mch
Intermolecular pairs:
Pairs of sequences completely matched by single domains
Pairs of sequences completely matched where one sequence has more than one domain
Pairs of sequences where one is completely matched and one has no match - single domain proteins or multidomain proteins
4. Family interactions in Yeast
See format description in 2. above.
Intramolecular: within_orf.sfvers
Intermolecular: one_domain_complete_pairwise_100.sfvers