(taken from Comparative ("Homology") Modeling for Beginners with Free Software by Eric Martz)
Suppose you want to know the 3D structure of a target protein that has not been solved empirically by X-ray crystallography or NMR. You have only the sequence. If an empirically determined 3D structure is available for a sufficiently similar protein (50% or better sequence identity would be good), you can use software that arranges the backbone of your sequence identically to this template. This is called "comparative modeling" or "homology modeling". It is, at best, moderately accurate for the positions of alpha carbons in the 3D structure, in regions where the sequence identity is high. It is inaccurate for the details of side chain positions, and for inserted loops with no matching sequence in the solved structure.
A comparative modeling routine needs three items of input:
1. The sequence of the protein with unknown 3D structure, the "target sequence".
2. A 3D template, chosen for having the highest sequence identity with the target sequence. The 3D structure of the template must have been determined by reliable empirical methods such as crystallography or NMR; it is typically a published atomic coordinate ("PDB") file from the Protein Data Bank.
3. An alignment between the target sequence and the template sequence.
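The sequence-identity criterion used to choose a template can be made concrete with a small calculation on an alignment. The following is a minimal Python sketch, not part of any modeling package: the function name `percent_identity`, the decision to ignore gapped columns, and the toy sequences are all assumptions made for illustration. Other tools may divide by the full alignment length or by the length of the shorter sequence, which gives different numbers.

```python
def percent_identity(aligned_target: str, aligned_template: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both sequences have a residue."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain an insertion or deletion
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (made-up sequences, NOT Rab27 or any real template)
toy_target = "MKT-AYIAKQR"
toy_template = "MKTLAYLAK-R"
print(f"{percent_identity(toy_target, toy_template):.1f}% identity")
```

A value computed this way is only as meaningful as the alignment it comes from, which is why the alignment is listed as a separate input.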
First, the comparative modeling routine arranges the backbone identically to that of the template. This means that not only the positions of alpha carbons, but also the phi and psi angles and secondary structure, are made identical to the template. Next, the more sophisticated comparative modeling packages adjust side chain positions to minimize collisions, and may offer further energy minimization or molecular dynamics in an attempt to improve the model.
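The phi and psi angles mentioned above are dihedral (torsion) angles defined by four consecutive backbone atoms: phi by C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). As a rough illustration of what "making the phi and psi angles identical" refers to, here is a minimal Python sketch that computes a signed dihedral angle from four points. The function name `dihedral` and the toy coordinates are assumptions for illustration; this is not the code used by any particular modeling package.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points given as (x, y, z).

    Uses the common convention in which cis is 0 degrees and trans is 180 degrees.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)      # first bond vector
    b1 = sub(p2, p1)      # central bond vector (the rotation axis)
    b2 = sub(p3, p2)      # last bond vector
    n1 = cross(b0, b1)    # normal to the plane of p0, p1, p2
    n2 = cross(b1, b2)    # normal to the plane of p1, p2, p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))


# Toy points (not real atom coordinates): a quarter turn about the z axis
print(f"{dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)):.1f}")
```

For phi of residue i, the four points would be the coordinates of C(i-1), N(i), CA(i) and C(i) taken from the template structure.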
(taken from Comparative ("Homology") Modeling for Beginners with Free Software by Eric Martz)
Successful predictions based on comparative models have been reviewed by Baker and Sali (2001). The following is summarized from their review, where references to specific cases may be found.
a) The positions of conserved regions of the protein surface can help to identify putative active sites and binding pockets.
b) If the ligand is known to be charged, the binding site may be predicted by searching the surface for a cluster of complementary charges.
c) The size of a ligand may be predicted from the volume of the putative binding pocket. In one case, relative affinities of a series of ligands have been predicted.
d) Such predictions are useful to guide mutagenesis experiments.
One more trivial application is the generation of nice figures to illustrate specific points. This is what we are going to do today.
Remember: a structural model can be useful to guide experimentation, but it does not substitute for it!
More or less in-depth tutorials can be found at:
A variety of software can be installed locally to do homology modelling, but you really need to know what you are doing to use it. Alternatively, you can use one of the several homology modelling servers available online, which make many of the decisions and judgments for you.
• ESyPred3D Web Server 1.0: http://www.fundp.ac.be/urbm/bioinfo/esypred/
• WHAT IF: http://swift.cmbi.kun.nl/swift/servers/modserver-submit.html
(you will be using this server today)
Ideally you will install DeepView (previously known as Swiss-PdbViewer), a free modelling program and structure editor that interfaces directly with SWISS-MODEL. The current version for Mac OS X cannot interface directly with SWISS-MODEL, and is not available today. If you work on a different platform, or want to explore DeepView anyway, good tutorials on its use are available at:
The following servers appear to be less developed than the ones above.
• "SDSC1" - SDSC Protein Structure Homology Modeling Server - http://cl.sdsc.edu/hm.html
(taken from the online manual “Principles of Protein Structure, Comparative Protein Modelling and Visualisation” by Nicolas Guex and Manuel C. Peitsch at http://swissmodel.expasy.org/course/course-index.htm)
The SWISS-MODEL server is reachable on the World Wide Web (WWW) and requests can be submitted through easy to fill forms. Since protein modelling is heavily dependent on the alignment between target and template sequences, SWISS-MODEL provides four distinct modes of function accessible through separate forms:
1. The First Approach Mode: This mode allows the user to submit a sequence or its Swiss-Prot identification code. In this mode, SWISS-MODEL will go through the complete procedure described above. The First Approach Mode also allows the user to define a choice of pre-selected template structures, thereby overruling the automated selection procedure. The results of the modelling procedure will be returned to the user via E-mail.
2. The Optimise Mode allows the user to re-compute a model by submitting altered sequence alignments and ProMod command files. The sequence alignment procedure, which is fully automatic in the First Approach Mode, may yield sub-optimal alignments and consequently lead to erroneous models. The automated alignment of moderately similar sequences is indeed often imprecise, and the boundaries of non-conserved loops are frequently ill defined and misaligned. These regions of the sequence alignments must therefore be corrected by hand. The Optimise Mode allows the user to make such corrections, and to request the remodelling of the sequence by submitting their own sequence alignment. This is best done by preparing a modelling request within the Swiss-PdbViewer (see chapter 12). These requests can then be saved as "HTML" files and submitted through a Web browser.
3. The Combine Mode allows the user to combine independently modelled protein chains in a quaternary complex, based on an existing assembly template. In the current version of the server, this assembly template must be a PDB file containing the desired protein complex. The server provides detailed guidelines on how to submit requests to this particular facility. The actions taken by the server include a superposition of each modelled chain onto its respective counterpart on the assembly template and an energy minimisation of the whole complex.
4. The GPCR Mode. To get a GPCR sequence modelled, the first step consists of selecting a template for comparative modelling from the list of available ones. These have been created a priori using the methodology described in chapter 7. Then the user types (or cuts and pastes) the sequences of the seven trans-membrane helices into the form, along with an E-mail address, a contact name, and a title for the request. The SWISS-GPCR modelling pages provide detailed help in Hypertext format, links to other sites relevant to seven trans-membrane receptors, and a demonstration version of a filled-in form. Furthermore, the user may query a special index of all GPCR sequences in the SWISS-PROT database, and search their sequences using a BLAST interface (Altschul et al., 1990). This allows GPCR sequences to be searched rapidly, and will also tell the user which of the available templates is the most appropriate one on which to model the query sequence. The current version of the SWISS-MODEL server does not allow users to create a new template from their own experimental data. This can however be achieved upon direct contact with Pawel Herzyk at York.
The SWISS-MODEL Web interface provides help and guidelines on the use of the different server modes.
In this simple tutorial you are going to use homology modeling to create a cartoon that highlights/illustrates a structural peculiarity in a protein. This is the easiest protocol you have ever followed!
For this tutorial you will use the sequence of Rab27, a small GTPase from the Rab family, which displays a peculiar sequence insertion in the N-terminal third of the sequence when compared to other small GTPases of the Rab family.
You can get more information on the protein from: http://us.expasy.org/cgi-bin/niceprot.pl?P51159 (you don’t have to!)
Just copy the sequence below, in FASTA format, to the clipboard (select the sequence and hit the copy button on your browser)
>sp|Q9ERI2|RB27A_MOUSE Ras-related protein Rab-27A - Mus musculus (Mouse).
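For readers unfamiliar with FASTA format: a record is a header line starting with ">" followed by one or more lines of sequence. The sketch below shows one way to read such a record in Python and to split a header of the form shown above into its database, accession and entry-name fields. The function names `parse_fasta` and `split_uniprot_header` are hypothetical, and the sequence in the example is a placeholder (the 20 standard amino-acid letters), NOT the Rab27 sequence.

```python
def parse_fasta(text: str):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_uniprot_header(header: str):
    """Split 'db|accession|entry_name description' into its four parts."""
    identifier, _, description = header.partition(" ")
    db, accession, entry_name = identifier.split("|")
    return db, accession, entry_name, description


# Header copied from the tutorial; the sequence lines are a PLACEHOLDER only.
example = """>sp|Q9ERI2|RB27A_MOUSE Ras-related protein Rab-27A - Mus musculus (Mouse).
ACDEFGHIKL
MNPQRSTVWY
"""

for header, sequence in parse_fasta(example):
    db, accession, entry_name, description = split_uniprot_header(header)
    print(accession, entry_name, len(sequence))
```

When pasting into a web form, include the header line only if the form asks for FASTA input.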
Point your web browser to: http://swissmodel.expasy.org//SWISS-MODEL.html
(On this page you have a variety of links, namely to tutorials, software and FAQs)
Follow the link First Approach mode (this is what you do when you have a sequence that you want to model – once you have the model, you can do more elaborate things)
Fill out the form on the top of the page with your e-mail address, as your results will be e-mailed back to you. Avoid naming this job Rab27 as everybody will likely do the same and this may create problems with the server.
Now paste the Rab27 sequence into the window below “Provide a sequence …”.
You are going to use the default parameters, so just scroll down the page, noticing that you have the option to:
a) Define the BLAST threshold used in the search for templates
b) Select specific templates from the database (a processed version of PDB).
c) Supply your own templates (i.e. not published structures)
You need to select the output format you want. Select Swiss PDB viewer mode – this is the output format with the most information.
Note that you have the option of sending the request to secondary structure prediction and fold recognition servers. This would be very handy if no suitable template for homology modeling were known, i.e. if this exercise failed. Today you will not need to do this.
Now scroll up again. Normally you would press the “send request” button to submit your modeling request to the server, and then wait for the results to be e-mailed to you.
You would then see a window like this.
Since you don’t have access to e-mail here, you can just collect the output files following the links below.
You will receive by e-mail one PDB file corresponding to the modeled Rab27 coordinates, which you can open in any structure viewer. The answer to your “question” about the Rab27-specific insertion can be found in the PDB file itself, which includes an alignment of the query sequence to the templates used, indicating the secondary structural elements.
You will receive one PDB file containing the coordinates of the model plus the coordinates of all templates used.
This file is here:
and also a Log file, which is here:
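A PDB file is plain text with fixed-width columns, so the alpha-carbon positions discussed earlier can be pulled out with a few lines of Python. The following is a minimal sketch under stated assumptions: the function names `read_ca_atoms` and `toy_atom_line` are hypothetical, the coordinates are made up, and only standard ATOM records are handled (no alternate locations, models, or Swiss-PdbViewer-specific records).

```python
def read_ca_atoms(pdb_text: str):
    """Collect alpha-carbon records from PDB-format text.

    Returns a list of (chain_id, residue_number, residue_name, (x, y, z)).
    """
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue                        # ignore HETATM, REMARK, etc.
        if line[12:16].strip() != "CA":     # atom name, columns 13-16
            continue
        res_name = line[17:20].strip()      # columns 18-20
        chain_id = line[21]                 # column 22
        res_num = int(line[22:26])          # columns 23-26
        xyz = (float(line[30:38]),          # x, columns 31-38
               float(line[38:46]),          # y, columns 39-46
               float(line[46:54]))          # z, columns 47-54
        atoms.append((chain_id, res_num, res_name, xyz))
    return atoms


def toy_atom_line(serial, name, res_name, chain, res_num, x, y, z):
    """Build one fixed-width ATOM record for the demonstration below."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_num:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}")


# Made-up coordinates, only to exercise the parser
sample = "\n".join([
    toy_atom_line(1, " N  ", "MET", "A", 1, 10.000, 12.500, 2.000),
    toy_atom_line(2, " CA ", "MET", "A", 1, 11.104, 13.207, 2.100),
    toy_atom_line(3, " CA ", "GLY", "A", 2, 14.250, 12.000, 3.500),
])

for chain_id, res_num, res_name, xyz in read_ca_atoms(sample):
    print(chain_id, res_num, res_name, xyz)
```

With a real file, the text would come from something like `open("model.pdb").read()`, where the file name is whatever you saved the server output as.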
If you open the PDB file in PyMOL, after some basic manipulations you can show and highlight the position of the insert in your modeled structure in comparison to other related structures, which are shown in different colors in the cartoon below. The insert is highlighted in blue:
Without DeepView (Swiss-PdbViewer) installed, that is it! If you install DeepView, you can proceed to the Optimise Mode, which is simply a way for you to correct the sequence alignments manually (this is usually necessary!). You can also assemble your modelled proteins into oligomeric proteins, as long as you have a protein complex of known structure to use as a template.