(taken from Comparative ("Homology")
Modeling for Beginners with Free
Software by Eric Martz)
Suppose
you want to know the 3D structure of a target protein that
has not been solved
empirically by X-ray crystallography or NMR. You have only the
sequence. If an empirically determined 3D structure is available for a
sufficiently similar protein (50% or better sequence identity would be good),
you can use software that arranges the backbone of your sequence identically to
this template. This is called "comparative
modeling" or "homology modeling". It is, at best, moderately
accurate for the positions of alpha carbons in the 3D structure, in regions
where the sequence identity is high. It is inaccurate for the details of
side chain positions, and for inserted loops with no matching sequence in the
solved structure.
A comparative modeling routine needs
three items of input:
1. The sequence of the protein with unknown
3D structure, the "target sequence".
2. A 3D template,
chosen by virtue of having the highest sequence identity with the target
sequence. The 3D structure of the template must be determined by reliable
empirical methods such as crystallography or NMR, and is typically a published
atomic coordinate "PDB" file from the Protein Data Bank.
3. An alignment
between the target sequence and the template sequence.
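To make "sequence identity" concrete, here is a minimal sketch that computes percent identity from a pairwise alignment. It assumes gaps are written as "-" and counts only columns where both sequences have a residue, which is just one of several conventions in use. The function name and the toy sequences are invented for this example and do not come from any modeling package.

```python
def percent_identity(target_aln: str, template_aln: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0    # columns with a residue in both sequences
    identical = 0  # aligned columns where the residues match
    for t, m in zip(target_aln.upper(), template_aln.upper()):
        if t == "-" or m == "-":
            continue
        aligned += 1
        if t == m:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned residue pairs")
    return 100.0 * identical / aligned


# Toy alignment (hypothetical sequences): 4 identical out of 5 aligned columns.
print(percent_identity("ACDE-G", "ACDQFG"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so quoted identity values are only comparable when the same definition is used.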
First, the comparative modeling routine
arranges the backbone identically to that of the template. This means that not
only the positions of alpha carbons, but also the phi and psi angles and
secondary structure, are made identical to the template. Next, the more
sophisticated comparative modeling packages adjust side chain positions to
minimize collisions, and may offer further energy minimization or molecular
dynamics in an attempt to improve the model.
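The phi and psi angles mentioned above are dihedral (torsion) angles defined by four consecutive backbone atoms: phi by C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). The sketch below shows one common way to compute a dihedral angle from four 3D points using only the standard library. It illustrates the geometry and is not the code used by any particular modeling package; the helper names are made up for this example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal of the plane through p2, p3, p4
    b2_len = math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / b2_len
    return math.degrees(math.atan2(y, x))


# Toy coordinates (not from a real protein): the two outer bonds are perpendicular.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Copying the template backbone amounts to giving each aligned residue of the target the same phi and psi values as its counterpart in the template.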
(taken from Comparative ("Homology")
Modeling for Beginners with Free
Software by Eric Martz)
Successful
predictions based on comparative models have been reviewed by Baker and Sali (2001).
The following is summarized from their review, where references to specific
cases may be found.
a) The
positions of conserved regions of the protein surface can help to
identify putative active sites and binding pockets.
b) If the
ligand is known to be charged, the binding site may be predicted by searching
the surface for a cluster of complementary charges.
c) The
size of a ligand may be predicted from the volume of the putative binding
pocket. In one case, relative affinities of a series of ligands have been
predicted.
d) Such
predictions are useful to guide mutagenesis experiments.
One more
trivial application is the generation of nice figures to illustrate specific
points. This is what we are going to do today.
Remember,
a structural model can be useful to guide experimentation, but it does not
substitute for it!
More or less in-depth
tutorials can be found at:
http://www.cmbi.kun.nl/gvteach/hommod/index.shtml
http://www.umass.edu/microbio/chime/explorer/homolmod.htm
There is a variety of
software that you can install locally to do homology modelling, but you really need
to know what you are doing if you want to do so. Alternatively, you can use one of the several homology modelling
servers available online, which make many of the decisions and judgment calls
for you.
· ESyPred3D Web Server 1.0: http://www.fundp.ac.be/urbm/bioinfo/esypred/
· What if: http://swift.cmbi.kun.nl/swift/servers/modserver-submit.html
(you will be using this server today)
Ideally you will
install DeepView (previously known as Swiss-PdbViewer), a free modelling
program and structure editor which interfaces directly with SWISS-MODEL. The current
version for Mac OS X cannot interface directly with SWISS-MODEL, and is not
available today. If you work on a different platform, or want to explore
DeepView anyway, good tutorials on its use are available at:
http://www.usm.maine.edu/~rhodes/SPVTut/index.html
The
following servers appear to be less developed than the ones above.
· "SDSC1" - SDSC Protein Structure Homology Modeling Server: http://cl.sdsc.edu/hm.html
(taken
from the online manual "Principles
of Protein Structure, Comparative Protein Modelling and Visualisation" by
Nicolas Guex and Manuel C. Peitsch at http://swissmodel.expasy.org/course/course-index.htm)
The
SWISS-MODEL server is reachable on the World Wide Web (WWW) and requests can be
submitted through easy to fill forms. Since protein modelling is heavily
dependent on the alignment between target and template sequences, SWISS-MODEL
provides four distinct modes of function accessible through separate forms:
1. The First Approach Mode: This mode allows the user to submit a sequence or its
Swiss-Prot identification code. In this mode, SWISS-MODEL will go through the
complete procedure described above. The First Approach Mode also allows the user
to define a choice of pre-selected template structures, thereby overruling the
automated selection procedure. The results of the modelling procedure will be
returned to the user via E-mail.
2. The Optimise Mode
allows the user to re-compute a model by submitting altered sequence alignments
and ProMod command files. The sequence alignment procedure, which is fully
automatic in the First Approach Mode, may yield sub-optimal alignments and
consequently lead to erroneous models. The automated alignment of moderately
similar sequences is indeed often imprecise, and the boundaries of non-conserved
loops are frequently ill-defined and misaligned. These regions of the sequence
alignments must therefore be corrected by hand. The Optimise Mode allows the user
to make such corrections, and to request the remodelling of the sequence by
submitting their own sequence alignment. This is best done by preparing a
modelling request within the Swiss-PdbViewer (see chapter 12). These requests can
then be saved as "HTML" files and submitted through a Web browser.
3. The Combine Mode
allows the user to combine independently modelled protein chains in a
quaternary complex, based on an existing assembly template. In the current
version of the server, this assembly template must be a PDB file containing the
desired protein complex. The server provides detailed guidelines on how to
submit requests to this particular facility. The actions taken by the server
include a superposition of each modelled chain onto its respective counterpart
on the assembly template and an energy minimisation of the whole complex.
4. The GPCR Mode.
To get a GPCR sequence modelled, the first step consists of selecting a
template for comparative modelling from the list of available ones. These have
been created a priori using the methodology described in chapter 7. The user then
types or pastes in the sequences of the seven trans-membrane helices, along with
an E-mail address, a contact name, and a title for the request. The SWISS-GPCR
modelling pages provide detailed help in hypertext format, links to other sites
relevant to seven trans-membrane receptors, and a demonstration version of a
filled-in form. Furthermore, the user may query a special index of all GPCR
sequences in the SWISS-PROT database, and search their sequences using a BLAST
interface (Altschul et al., 1990). This allows GPCR sequences to be searched
rapidly, and will also tell the user which of the available templates is the most
appropriate one on which to model the query sequence. The current version of the
SWISS-MODEL server does not allow the user to create a new template using their
own experimental data. This can, however, be achieved upon direct contact
with Pawel Herzyk at York.
The SWISS-MODEL Web interface provides
help and guidelines on the use of the different server modes.
In this
simple tutorial you are going to use homology modeling to create a cartoon that
highlights/illustrates a structural peculiarity in a protein. This is the
easiest protocol you have ever followed!
For this
tutorial you will use the sequence of Rab27, a small GTPase from the Rab
family, which displays a peculiar sequence insertion in the N-terminal
third of the sequence when compared to other small GTPases of the Rab family.

You can
get more information on the protein from: http://us.expasy.org/cgi-bin/niceprot.pl?P51159 (you don't have to!)
Just copy
the sequence below, in FASTA format, to the clipboard (select the sequence and
hit the copy button on your browser)
>sp|Q9ERI2|RB27A_MOUSE Ras-related protein Rab-27A - Mus musculus (Mouse).
MSDGDYDYLIKFLALGDSGVGKTSVLYQYTDGKFNSKFITTVGIDFREKRVVYRANGPDG
AVGRGQRIHLQLWDTAGQERFRSLTTAFFRDAMGFLLLFDLTNEQSFLNVRNWISQLQMH
AYCENPDIVLCGNKSDLEDQRAVKEEEARELAEKYGIPYFETSAANGTNISHAIEMLLDL
IMKRMERCVDKSWIPEGVVRSNGHTSADQLSEEKEKGLCGC
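Before pasting a sequence into a server form, it can be worth checking that it is well-formed FASTA: a single header line starting with ">", followed by sequence lines containing only amino acid letters. The sketch below is one possible sanity check using only the Python standard library. The function name is invented for this example, and it accepts only the 20 standard one-letter codes, which is stricter than many servers.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")

RAB27A_FASTA = """>sp|Q9ERI2|RB27A_MOUSE Ras-related protein Rab-27A - Mus musculus (Mouse).
MSDGDYDYLIKFLALGDSGVGKTSVLYQYTDGKFNSKFITTVGIDFREKRVVYRANGPDG
AVGRGQRIHLQLWDTAGQERFRSLTTAFFRDAMGFLLLFDLTNEQSFLNVRNWISQLQMH
AYCENPDIVLCGNKSDLEDQRAVKEEEARELAEKYGIPYFETSAANGTNISHAIEMLLDL
IMKRMERCVDKSWIPEGVVRSNGHTSADQLSEEKEKGLCGC
"""


def parse_fasta(text):
    """Return (header, sequence) for a single-record protein FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with a '>' header line")
    header = lines[0][1:]
    # Everything after the header line is treated as sequence.
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("no sequence found after the header line")
    bad = sorted(set(sequence) - STANDARD_AA)
    if bad:
        raise ValueError("unexpected characters in sequence: " + "".join(bad))
    return header, sequence


header, sequence = parse_fasta(RAB27A_FASTA)
print(header.split()[0], len(sequence), "residues")
```

A check like this also catches a header that has wrapped onto a second line, because the wrapped text contains characters such as spaces, digits and hyphens that are not amino acid codes.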
Point your
web browser to: http://swissmodel.expasy.org//SWISS-MODEL.html
(On this page you have
a variety of links, namely to tutorials, software and FAQs)

Follow the
link First Approach mode (this is what you do when you have a
sequence that you want to model; once you have the model, you can do more
elaborate things)

Fill out
the form on the top of the page with your e-mail address, as your results will
be e-mailed back to you. Avoid naming this job Rab27 as everybody will likely
do the same and this may create problems with the server.
Now paste
the Rab27 sequence into the window below "Provide a sequence ...".
You are
going to use the default parameters, so just scroll down the page, noticing
that you have the option to:
a) Define
the BLAST threshold used in the search for templates
b) Select
specific templates from the database (a processed version of PDB).
c) Supply
your own templates (i.e. not published structures)
You need
to select the output format you want. Select Swiss PDB
viewer mode; this is the output format with the most information.
Note that
you have the option of sending a request to secondary structure prediction and
fold recognition servers. This would be very handy if there were no known
suitable template for homology modeling, i.e. if this exercise failed. Today you
will not need to do this.
Now scroll
up again and ... Normally you would press the "send
request" button to submit your modeling request to the server, and then wait
for the results to be e-mailed to you.
You would
then see a window like this.

Since you
don't have access to e-mail here, you can just collect the output files
following the links below.
You will
receive by e-mail one PDB file corresponding to the modeled Rab27 coordinates,
which you can open in any structure viewer. The answer to your "question" about
the Rab27-specific insertion can be found in the PDB file itself, which
includes an alignment of the query sequence to the templates used, indicating
the secondary structural elements.
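Reading an insertion off an alignment can also be done programmatically. The following is a minimal sketch that, given the target and template rows of a pairwise alignment, reports the stretches of target residues that face gaps in the template. It assumes gaps are written as "-"; the function name and the toy sequences are invented for this example and are not the Rab27 alignment.

```python
def find_insertions(target_aln, template_aln):
    """Return (first, last) target residue numbers (1-based) aligned to template gaps."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    insertions = []
    pos = 0       # number of target residues seen so far
    start = None  # first residue of the insertion currently being read
    for t, m in zip(target_aln, template_aln):
        if t != "-":
            pos += 1
        if t != "-" and m == "-":
            if start is None:
                start = pos
        elif start is not None:
            # The insertion ended at the last target residue before this column.
            end = pos - 1 if t != "-" else pos
            insertions.append((start, end))
            start = None
    if start is not None:
        insertions.append((start, pos))
    return insertions


# Toy alignment (hypothetical): target residues 3 to 5 have no template counterpart.
print(find_insertions("MKTAYIAK", "MK---IAK"))
```

Residues reported this way have no template coordinates to copy, which is why inserted loops are the least reliable part of a comparative model.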

You will
receive one PDB file containing the coordinates of the model + the coordinates
of all templates used.
This file
is here:
http://www.mrc-lmb.cam.ac.uk/rlw/text/homology_files/AAAa0F61z.pdb
and also a
Log file, which is here:
http://www.mrc-lmb.cam.ac.uk/rlw/text/homology_files/TraceLog.rtf
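A quick way to see what a coordinate file contains, before opening it in a viewer, is to count the alpha carbons per chain. The sketch below does this using the fixed-column layout of standard PDB ATOM records (atom name in columns 13-16, chain identifier in column 22). It assumes the downloaded file follows that layout; how the model and the templates are labelled inside a SWISS-MODEL project file is not covered here, and the function name and the local file name in the usage comment are hypothetical.

```python
from collections import Counter


def count_ca_per_chain(pdb_text):
    """Count alpha-carbon ATOM records per chain identifier in PDB-format text."""
    counts = Counter()
    for line in pdb_text.splitlines():
        # Only ATOM records that are long enough to hold name and chain fields.
        if line[0:6] != "ATOM  " or len(line) < 22:
            continue
        atom_name = line[12:16].strip()  # columns 13-16
        chain_id = line[21]              # column 22 (may be a blank)
        if atom_name == "CA":
            counts[chain_id] += 1
    return dict(counts)


# Usage with a locally saved copy (hypothetical file name):
# with open("model.pdb") as handle:
#     print(count_ca_per_chain(handle.read()))
```

Since each residue has one alpha carbon, the counts give the number of residues with coordinates in each chain.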
If you
open it in PyMOL, after some basic manipulations you can show/highlight
the position of the insert in your modeled structure, in comparison to other
related structures, which in the cartoon below are shown in different colors.
The insert is highlighted in blue:

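The "basic manipulations" can be scripted. As a rough sketch, the Python function below writes out a short PyMOL command script that loads a structure, shows it as a cartoon, and colors a residue range blue. The residue range used in the example is a placeholder, not the actual position of the Rab27 insert, which you should read from the alignment. The function name and file names are also invented for this example.

```python
def pymol_highlight_script(pdb_path, first_res, last_res, selection_name="insert"):
    """Build PyMOL commands that show a cartoon and color one residue range blue."""
    if first_res > last_res:
        raise ValueError("first_res must not be greater than last_res")
    commands = [
        "load %s" % pdb_path,
        "hide everything",
        "show cartoon",
        "color grey80",
        # Note: 'resi' matches this range in every loaded chain and object,
        # so the selection may need narrowing to the model only.
        "select %s, resi %d-%d" % (selection_name, first_res, last_res),
        "color blue, %s" % selection_name,
    ]
    return "\n".join(commands) + "\n"


# Placeholder residue range (50-60); replace it with the range from your alignment.
script = pymol_highlight_script("model.pdb", 50, 60)
print(script)
```

Saving the printed text to a file with a .pml extension and opening it in PyMOL should run the commands in order.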
Without DeepView
(Swiss-PdbViewer) installed, that is it!
If you install DeepView, you can proceed to the Optimise Mode, which
is simply a way of manually correcting the sequence alignments (which is usually
necessary!). You can also assemble your modelled proteins into oligomeric
proteins, as long as you have a protein complex of known structure that you can
use as a template.