Analysis of Cα-H...O hydrogen bonding in protein crystal structures
M. Madan Babu, Center for Biotechnology, Anna University
Introduction:
Protein tertiary structures are stabilized by numerous interactions, including electrostatic interactions, the hydrophobic effect, salt bridges, hydrogen bonding and disulfide bonds, among others. Of these, hydrogen bonding plays a major role in determining the stability of the tertiary structure of a protein. Many types of hydrogen bonds occur in proteins, the most prominent being the N-H...O=C type, in which the hydrogen bonded to a nitrogen atom is shared with a nearby oxygen atom that carries a partial negative charge. This type of hydrogen bond greatly influences the formation of secondary structures such as α helices and β strands. In an α helix, a hydrogen bond forms between the backbone C=O of residue i and the backbone N-H of residue i+4. Repetition of this arrangement over successive residues results in the formation of the helix. The α helix can also be defined by the backbone dihedral angles φ and ψ, which lie in the third quadrant of the Ramachandran map.
A different type of hydrogen bond also occurs in proteins, though less frequently than the N-H...O=C type, and it makes a much smaller energetic contribution to protein stability. It is observed between a backbone Cα-H and the O=C of a nearby residue. The concept of C-H...O hydrogen bonds has recently attracted much interest, with a number of reports indicating the significance of these non-classical hydrogen bonds in stabilizing protein structures.
To understand the conformational features of this type of hydrogen bond, a detailed analysis of ultra-high-resolution protein crystal structures was undertaken.
Method:
Choice of dataset
The dataset of protein structures was assembled from coordinates deposited in the Protein Data Bank (PDB, June 2000 release). A resolution cutoff of 1.2 Å was chosen to exclude crystallographic artifacts that may have been inadvertently introduced during refinement or model building. The dataset contained only proteins with less than 25% sequence similarity to one another. After applying the resolution cutoff and sequence similarity criteria, this non-redundant dataset contained 61 protein structures. For dimeric proteins, only structures in which the two chains had less than 25% similarity to each other and to the other members of the dataset were chosen. For the analysis of helix termination and the influence of the C-H...O hydrogen bond on it, proteins containing only β strands in their tertiary structure were excluded. The final dataset contained 54 proteins and was used for the helix termination and C-H...O bond analysis.
Definition of Helix
Secondary structures were assigned using backbone φ, ψ criteria: residues with φ = -60° ± 60° and ψ = -40° ± 60° were classified as being in the αR conformation. Four consecutive residues in the αR conformation constituted a single turn of a helix. The helix was extended until a residue was reached whose backbone dihedral angles fell outside these limits. This residue, which deviates from the αR conformation and terminates the helix, was termed the T residue, and the succeeding residue T+1.
Conformation   αR  αR  αR  αR  αR  αR  αR  T   T+1  T+2
Residue No.    1   2   3   4   5   6   7   8   9    10
In the above schematic, the first seven residues constitute a helix, which is terminated by residue 8.
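The helix definition above can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used in the study: the function names `is_alpha_r` and `find_helices` are hypothetical, the window limits are read from the text (φ = -60° ± 60°, ψ = -40° ± 60°), and treating the window boundaries as inclusive is an assumption.

```python
def is_alpha_r(phi, psi):
    """Return True if (phi, psi), in degrees, fall inside the alpha-R window.

    Window read from the text: phi = -60 +/- 60, psi = -40 +/- 60.
    Inclusive boundaries are an assumption. Undefined angles (None,
    e.g. at chain ends) never count as alpha-R.
    """
    if phi is None or psi is None:
        return False
    return -120.0 <= phi <= 0.0 and -100.0 <= psi <= 20.0


def find_helices(angles, min_len=4):
    """Find runs of at least min_len consecutive alpha-R residues.

    angles: list of (phi, psi) tuples, one per residue, in chain order.
    Returns (start, end, t) tuples of 0-based indices, where start..end
    is the helix and t is the index of the terminating T residue
    (None if the helix runs to the end of the chain).
    """
    helices = []
    run_start = None
    for i, (phi, psi) in enumerate(angles):
        if is_alpha_r(phi, psi):
            if run_start is None:
                run_start = i          # a new alpha-R run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                helices.append((run_start, i - 1, i))  # residue i is T
            run_start = None
    # A run that reaches the chain end has no T residue.
    if run_start is not None and len(angles) - run_start >= min_len:
        helices.append((run_start, len(angles) - 1, None))
    return helices


# The ten-residue schematic: seven alpha-R residues, then T, T+1, T+2.
schematic = [(-60.0, -40.0)] * 7 + [(55.0, 37.0), (81.0, 11.0), (-120.0, 130.0)]
print(find_helices(schematic))  # [(0, 6, 7)]
```

With 0-based indexing, the result corresponds to residues 1 to 7 forming the helix and residue 8 being T; T+1 is then simply the residue at index `t + 1`, when it exists.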
Backbone dihedral angles were calculated for all the proteins, and 296 helices were identified in the dataset. Of these, four helices were at the termini of the protein and hence did not possess a T+1 residue. The remaining 292 helices were chosen for the subsequent analysis.
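For readers who want to reproduce the backbone dihedral calculation, a minimal sketch follows. It is not the code used in this work; the function name `dihedral` and the tuple-based coordinate representation are assumptions made for illustration. It uses the standard atan2 formulation, which gives 0° for cis, 180° for trans, and a sign that follows the usual right-handed convention.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points (x, y, z tuples).

    For a residue i the backbone angles are obtained as
      phi = dihedral(C[i-1], N[i], CA[i], C[i])
      psi = dihedral(N[i], CA[i], C[i], N[i+1])
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal to the plane p0-p1-p2
    n2 = _cross(b2, b3)          # normal to the plane p1-p2-p3
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

The atan2 form is preferred here over an arccos of the normal vectors because it returns the signed angle directly over the full -180° to 180° range.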
Characterization of the Helix dataset:
The lengths of the helices were calculated using Helidata, and the length distribution of the helices was obtained. A φ, ψ plot for the helical region alone was also made to characterize the helix dataset. φ versus ψ plots for the T and T+1 residues of the 292 helices were also made.


[Figure: Periplasmic molybdate-binding protein (1atg)]
[Figure: Methyl coenzyme M reductase (1mro)]

Analysis:
To check for the presence of a C-H...O hydrogen bond at the terminus of these helices, a distance criterion of < 4.0 Å between the T+1 C=O and the T-4 Cα was applied, and the helices that satisfied this criterion were separated out. Sixteen such examples were identified and used for subsequent studies.
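The distance screen can be illustrated with a short sketch. The function name `passes_distance_screen` is hypothetical, and the sketch assumes that the distance is measured from the carbonyl oxygen of residue T+1 to the Cα of residue T-4; the text specifies only "C=O" and "Cα", so this choice of atom is an assumption.

```python
import math


def passes_distance_screen(o_t_plus_1, ca_t_minus_4, cutoff=4.0):
    """Return True if the O(T+1)...CA(T-4) distance is below cutoff (in Å).

    o_t_plus_1:   (x, y, z) of the carbonyl oxygen of residue T+1 (assumed atom)
    ca_t_minus_4: (x, y, z) of the C-alpha of residue T-4
    """
    return math.dist(o_t_plus_1, ca_t_minus_4) < cutoff  # math.dist: Python 3.8+


# Made-up coordinates for illustration only: 3.5 Å apart passes, 4.5 Å does not.
print(passes_distance_screen((0.0, 0.0, 0.0), (3.5, 0.0, 0.0)))
print(passes_distance_screen((0.0, 0.0, 0.0), (4.5, 0.0, 0.0)))
```

Because no hydrogen positions are needed, a screen of this kind can be applied directly to the deposited heavy-atom coordinates before hydrogens are added.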
Various distance and angle parameters were calculated using programs written as Unix shell scripts. Insight was used to fix the hydrogen atoms on the Cα atoms of the 16 selected helices, and the distance between the Cα-H and the O=C was calculated using T41 and DistCalc.


The φ, ψ values for the T and T+1 residues were obtained using Helidata and plotted using Microsoft Excel. After the distances were calculated with hydrogens fixed, 5 of the 16 helices were chosen for further analysis, based on the following criteria:
1. The H...O distance should be less than 2.4 Å
2. The C-H...O angle should be greater than 100°
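The two criteria above can be expressed as a small geometric check. This is a sketch under stated assumptions rather than the T41 or DistCalc programs used here: the function names `angle_deg` and `is_cho_hbond` are hypothetical, and the C-H...O angle is taken to be the angle at the hydrogen between the Cα and the oxygen.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at the vertex b."""
    v1 = (a[0] - b[0], a[1] - b[1], a[2] - b[2])
    v2 = (c[0] - b[0], c[1] - b[1], c[2] - b[2])
    dot = v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2]
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_theta))


def is_cho_hbond(c_alpha, h, o, max_h_o=2.4, min_angle=100.0):
    """Apply both criteria: H...O distance < 2.4 Å and C-H...O angle > 100 degrees."""
    h_o = math.dist(h, o)  # math.dist requires Python 3.8+
    return h_o < max_h_o and angle_deg(c_alpha, h, o) > min_angle


# Made-up, collinear geometry: C-H = 1.09 Å, H...O = 2.2 Å, angle = 180 degrees.
print(is_cho_hbond((0.0, 0.0, 0.0), (1.09, 0.0, 0.0), (3.29, 0.0, 0.0)))
```

Both thresholds are exposed as keyword arguments so that stricter or looser cutoffs can be tried without changing the function.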
A similar analysis covering the whole protein molecule, rather than the helix region alone, is currently being carried out.
Some of the statistical data obtained during the analysis are given below.
Helix termination statistics:
No. of Helices terminating in
Conformation of T+1 residue:
There are three cases in which two consecutive αL conformations (at T and T+1) occur after helix termination.
(1cex) Cutinase. 4-residue helix (aa: 152-156). T=R156 (55,37,174); T+1=G157 (81,11,-175).
(1mro b) Methyl coenzyme M reductase. 8-residue helix (aa: 197-204). T=L205 (58,37,177); T+1=K206 (62,20,170).
(2pth) Peptidyl tRNA hydrolase. 10-residue helix (aa: 114-123). T=G124 (85,43,-179); T+1=N125 (55,47,-179).



References :