Equilibrium sedimentation data analysis software.
Dmitry B. Veprintsev*, Nicholas W. Foster* and Alan R. Fersht, FRS
* to whom correspondence should be addressed
Data are fitted to the appropriate model using the Marquardt algorithm (1). Multiple datasets are combined head-to-tail and indexed by an additional x-value, and the fitting function assigns the corresponding global and local variables to each dataset (global analysis (2)). The errors reported (1σ) are not the real errors but a statistical estimate of the goodness of fit. These values should be treated as an underestimate of the actual errors.
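The head-to-tail combination of datasets and the split into global and local variables can be sketched in Python as follows. This is a minimal illustration, not UltraSpin's implementation: the function names, the tuple layout and the parameter ordering are assumptions, and the minimiser itself (for example a Marquardt routine) is left out.

```python
# Hypothetical data layout for a global fit; all names are illustrative.

def combine_datasets(datasets):
    """Join datasets head-to-tail, tagging each point with its dataset index.

    datasets: list of (radii, absorbances) pairs.
    Returns a flat list of (dataset_index, radius, absorbance) tuples.
    """
    combined = []
    for index, (radii, absorbances) in enumerate(datasets):
        for r, a in zip(radii, absorbances):
            combined.append((index, r, a))
    return combined


def unpack_parameters(params, n_datasets):
    """Split a flat parameter vector into the global M and per-dataset locals.

    Assumed ordering: [M, baseline_0, Ab_0, baseline_1, Ab_1, ...].
    """
    mass = params[0]
    local = [(params[1 + 2 * i], params[2 + 2 * i]) for i in range(n_datasets)]
    return mass, local


def residuals(params, combined, n_datasets, model):
    """Residuals over all points of all datasets.

    model(r, mass, baseline, a_bottom) is supplied by the caller.
    """
    mass, local = unpack_parameters(params, n_datasets)
    return [a - model(r, mass, *local[idx]) for idx, r, a in combined]
```

A least-squares routine would then minimise the sum of squares of `residuals(...)` over the whole parameter vector, so that M is shared by every dataset while each baseline and bottom absorbance only affects its own dataset.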
Fitting Models.
The general model for fitting data is as follows:
$$A(r) = A_{baseline} + A_{b1}\exp(M_1\beta_1) + A_{b2}\exp(M_2\beta_2) + \dots + A_{bi}\exp(M_i\beta_i)$$
$$\beta_i = \frac{\omega^2 (r^2 - r_b^2)(1 - V_{p,i}\,\rho)}{2RT}$$
$$A_{bi} = \varepsilon(\lambda)\, l\, C_i$$
for absorbance data, or
$$Fr = k\, l\, C_i$$
for fringe displacement (Fr), where
$$k = 3.33 \times \mathrm{MolecularWeight} / 1.2$$
3.33 is the accepted value for proteins at 675 nm and 12 mm (1.2 cm) path (3.33 fringes/(mg/ml)) in the case of interference optics.
M is the molecular mass of the monomer, $A_b$ and $r_b$ are the absorbance and radius at the bottom, $A_{baseline}$ is the absorbance of the buffer, $\omega$ is the angular rotational speed, $l$ is the path length, R is the gas constant, T is the absolute temperature, $V_p$ is the partial specific volume of the protein and $\rho$ is the solvent density. M is a global variable, while $A_{baseline}$ and $A_b$ are local to each dataset. This allows mixing of datasets obtained in different sectors, speeds, wavelengths or concentrations, as long as they share the same M and the same buffer conditions.
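As an illustration of the general model, the following Python sketch evaluates the absorbance at a given radius for any number of species. The function names, the use of CGS units (radius in cm, mass in g/mol, partial specific volume in ml/g, density in g/ml) and the conversion from rpm are assumptions made for this example; they are not taken from UltraSpin.

```python
import math

R_GAS = 8.314e7  # gas constant in erg/(mol K), matching CGS units


def beta(r, r_bottom, rpm, vbar, rho, temperature):
    """Exponent factor: w^2 (r^2 - rb^2) (1 - Vp*rho) / (2RT)."""
    omega = rpm * 2.0 * math.pi / 60.0  # angular speed in rad/s
    return (omega ** 2 * (r ** 2 - r_bottom ** 2) * (1.0 - vbar * rho)
            / (2.0 * R_GAS * temperature))


def absorbance(r, baseline, species, r_bottom, rpm, rho, temperature):
    """Baseline plus one exponential term per species.

    species: list of (mass, a_bottom, vbar) tuples, one per species.
    """
    total = baseline
    for mass, a_bottom, vbar in species:
        b = beta(r, r_bottom, rpm, vbar, rho, temperature)
        total += a_bottom * math.exp(mass * b)
    return total
```

At the bottom of the cell (r equal to r_bottom) every exponent is zero, so the function returns the baseline plus the sum of the bottom absorbances; at smaller radii the signal decays towards the baseline.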
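The single-species model can be used for estimating $V_p$, since M is often known from the amino acid sequence:
$$(1 - V_p\,\rho)\,M = \mathrm{const}$$
for the given fit, where $\rho$ is the solvent density. Correspondingly,
$$V_{p,fitted} = \frac{1}{\rho}\left(1 - \frac{M_{fitted}}{M_{ini}}\,(1 - V_{p,ini}\,\rho)\right)$$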
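The conversion from a fitted mass to a fitted partial specific volume is a one-line calculation. The Python sketch below is an illustration only; the function name and argument names are hypothetical.

```python
def vbar_from_fitted_mass(m_fitted, m_initial, vbar_initial, rho):
    """Partial specific volume that keeps (1 - Vp*rho)*M constant.

    m_fitted:     mass returned by the fit (Vp held at vbar_initial)
    m_initial:    mass known in advance, e.g. from the sequence
    vbar_initial: partial specific volume used during the fit (ml/g)
    rho:          solvent density (g/ml)
    """
    return (1.0 - (m_fitted / m_initial) * (1.0 - vbar_initial * rho)) / rho


# Example values: 8924 Da and 9017 Da are the global-fit and expected masses
# quoted for the p73 SAM domain in this document; vbar_initial = 0.73 ml/g
# and rho = 1.0 g/ml are assumed here purely for illustration.
vbar = vbar_from_fitted_mass(8924.0, 9017.0, 0.73, 1.0)
```

With these assumed inputs the result is approximately 0.733 ml/g. If the fitted mass equals the initial mass, the function returns the initial partial specific volume unchanged.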
$C_i$ is the concentration of the ith species at the bottom. The concentrations of the different species ($C_i$) are linked by the association equilibrium (if any).
This is the analysis of the hetero-association model:
$$nA + B \overset{K}{\rightleftharpoons} A_nB$$
$$K = \frac{[A_nB]}{[A]^n[B]}$$
$$[A] + n[A_nB] = A_0$$
$$[B] + [A_nB] = B_0$$
$$\log_{10}[A_nB] = \log_{10}K + n\log_{10}[A] + \log_{10}[B]$$
The concentrations of the individual species go into the general model. [A] and [B] are the concentrations of the monomers at the bottom, and $[A_nB]$ is the concentration of the complex. K is the association constant. The logarithms of the association constant and of the concentrations are chosen as parameters for numerical reasons: this puts them on the same scale. The initial guesses for [A] and [B] are calculated by numerically solving the full equation system above.
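One way to solve the equation system numerically for the initial guesses is bisection on the free concentration of A. The Python sketch below is an assumption about how this could be done, not a description of the solver UltraSpin uses; the function name and arguments are hypothetical.

```python
def solve_hetero_association(a_total, b_total, lg_k, n):
    """Free [A], free [B] and [AnB] for nA + B <-> AnB.

    Uses [B] = B0 / (1 + K [A]^n), which follows from the mass balance of B,
    and bisects on [A] until the mass balance of A is satisfied.
    """
    k = 10.0 ** lg_k

    def excess_a(a_free):
        # Mass balance of A: [A] + n [AnB] - A0 (increases with [A]).
        b_free = b_total / (1.0 + k * a_free ** n)
        return a_free + n * k * a_free ** n * b_free - a_total

    low, high = 0.0, a_total
    for _ in range(200):  # enough halvings to reach floating-point precision
        mid = 0.5 * (low + high)
        if excess_a(mid) > 0.0:
            high = mid
        else:
            low = mid

    a_free = 0.5 * (low + high)
    b_free = b_total / (1.0 + k * a_free ** n)
    complex_conc = k * a_free ** n * b_free
    return a_free, b_free, complex_conc
```

Bisection is chosen here because the mass balance of A increases monotonically with [A] between 0 and A0, so a root is always bracketed. The base-10 logarithms of the returned values could then serve as starting parameters.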
Multi Data Fitting.
Ultraspin can fit datasets either individually or globally, with all data fitted simultaneously. It also allows global fitting of data collected at different speeds, concentrations or wavelengths. The following table shows the results obtained for the SAM domain of p73 using a single-species model (4). The results of fitting the individual datasets are compared with the global fit of all of them simultaneously.
| Curve # | Mass (Da) | Error (+/-, Da) |
|---|---|---|
| 2 | 9045 | 135 |
| 3 | 8528 | 133 |
| 4 | 9193 | 106 |
| 5 | 9015 | 156 |
| 6 | 8924 | 112 |
| 7 | 8867 | 121 |
| Average | 8884 | |
| Multi Fit | 8924 | 49 |
The expected molecular mass for the SAM domain of p73 is 9017 Da.
If fitting a single dataset, all parameters can be set on the fit window. When fitting multiple datasets, local parameters are set individually in the Absorbance spectrum and the Select Multiple windows.
Global Parameters.
Terminology
These models are for a single-species fit of the data, i.e. only monomers are present.
The first four models (without _Mult in the name) fit only a single dataset at a time. The latter two (Spec1_Mw_varshift_Mult and Spec1_V_varshift_Mult) can fit multiple datasets simultaneously.
The models Onespecie_V_fixedshift, Spec1_V_varshift and Spec1_V_varshift_Mult use Vbar (partial specific volume) as a fitting parameter. In practice there is no need for these models, since $(1 - \rho\,Vbar)\,M = \mathrm{const}$ for the monomeric models: the fitted Vbar can easily be calculated from the fitted M for a given initial M, and it is reported in the textbox Vbar(Mfixed).
The only global parameter is M, so it is possible to mix datasets with different concentrations, speeds or wavelengths.
global fit parameters: M;
local fit parameters: baseline, Absorbance at the bottom (Abot). For single-dataset models, specify these on the fit form; for multiple datasets, on the multiple dataset selection form.
global non-variable parameters to be specified:
p (solvent density, ρ), Vbar(A)
local non-variable parameters to be specified:
Rbottom
2 A <-K1-> A2
The global parameters are M(A) and lgK1. Local parameters are baseline and Abottom. The concentration of the protein at the bottom of the cell is calculated from Abottom.
This model can fit multiple datasets as long as they share a common M and K, so different cells, speeds and concentrations are OK.
The extinction coefficient should be specified for the monomeric species.
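The step from Abottom to the monomer concentration at the bottom can be illustrated with a short Python sketch. It assumes that the dimer absorbs twice as strongly as the monomer (the same extinction coefficient per monomer unit) and that K1 = [A2]/[A]^2; both are assumptions made for this example, as are the function name and the default path length of 1.2 cm.

```python
import math


def monomer_concentration_at_bottom(a_bottom, ext_coeff, lg_k1, path_length=1.2):
    """Free monomer concentration for 2A <-> A2 from the total bottom absorbance.

    Solves c_total = [A] + 2*K1*[A]^2, where c_total = a_bottom / (ext * l)
    is the total concentration in monomer units.
    """
    c_total = a_bottom / (ext_coeff * path_length)
    k1 = 10.0 ** lg_k1
    # Positive root of the quadratic, written in a form that stays accurate
    # when K1 * c_total is very small.
    return 2.0 * c_total / (1.0 + math.sqrt(1.0 + 8.0 * k1 * c_total))


# Example with assumed values: Abottom = 0.6, extinction coefficient
# 10000 per molar per cm, lgK1 = 4.
c1 = monomer_concentration_at_bottom(0.6, 10000.0, 4.0)
```

In the limit of very weak association the result approaches the total concentration, as expected for a purely monomeric sample.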
Dimerisation_K_varshift_Mult
The same model, but M is not varied. The global parameter is lgK1.
Local parameters are baseline and Abottom.
This model can fit multiple datasets as long as they share a common K, so different cells, speeds and concentrations are OK.
The extinction coefficient should be specified for the monomeric species.
Dimerisation_K_varshift_Mult_Specrum
The same model, but the global parameters are lgK1 and C1, the concentration of the monomer at the bottom of the cell. Initially, C1 is calculated from the Abottom specified in the Abot editBox on the main FitForm, using the specified extinction coefficient and the initial value of lgK1.
The local parameter is baseline.
This function will fit multiwavelength data for the same cell, measured
at the same speed, concentration etc.
It will not work for datasets from different cells, speeds etc.
A <-K1-> A2 <-K2->A4
Global parameters are lgK1, lgK2 and lgC1, the logarithm of the concentration of the monomer at the bottom of the cell. Initially, C1 is calculated from the Abottom specified in the Abot editBox on the main FitForm, using the specified extinction coefficient and the initial values of lgK1 and lgK2. The local parameter is baseline.
This function will fit multiwavelength data for the same cell, measured
at the same speed, concentration etc.
It will not work for datasets from different cells, speeds etc.
Protein_DNA_NA_plus_B_AB_K1_varshift_Mult_Spectrum
Hetero-oligomerisation.
nA + B <-K-> AnB
global fit parameters: lg10(K); lg10(Ca_bottom); lg10(Cb_bottom);
local fit parameters: baseline
global non-variable parameters to be specified:
p (solvent density, ρ), Vbar(A); Vbar(B); n; Ma, Mb; AbsSpectrum_A, AbsSpectrum_B, Ext(A), Ext(B).
local non-variable parameters to be specified:
Rbottom
lgC1 and lgC2, the logarithms of the concentrations of the monomers of A and B at the bottom, are calculated from Abot and Abot2 (the total absorbances of A and B at the bottom), Ext. coeff and Ext coeff2, and the lgK1 given initially.
This function will fit multiwavelength data for the same cell, measured
at the same speed, concentration etc.
It will not work for datasets from different cells, speeds etc.
Homooligomerization
The only global parameter is lgK, so it
is possible to mix data with
different concentrations, cells, speeds,
wavelengths etc.
n A <K> An
Non-variable parameters
n - number of monomers in the complex
(2 for dimer, etc)
ext coeff (A)
M(A)
Vbar(A)
solvent density
Fit parameters
global:
lgK
local:
baseline
Abottom
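To make the roles of these parameters concrete, the Python sketch below computes the absorbance profile for the n A <K> An model. It assumes that the n-mer has n times the extinction coefficient of the monomer, that monomer and n-mer share the same Vbar, and that K = [An]/[A]^n; the function names, the CGS units and the bisection step are choices made for this example rather than details of UltraSpin.

```python
import math

R_GAS = 8.314e7  # gas constant in erg/(mol K), matching CGS units


def monomer_at_bottom(a_bottom, ext_coeff, path, k, n):
    """Bisection for x in ext * path * (x + n*k*x^n) = a_bottom."""
    c_total = a_bottom / (ext_coeff * path)
    low, high = 0.0, c_total
    for _ in range(200):
        mid = 0.5 * (low + high)
        if mid + n * k * mid ** n > c_total:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


def oligomer_profile(r, baseline, a_bottom, lg_k, n, mass, vbar, rho,
                     ext_coeff, r_bottom, rpm, temperature, path=1.2):
    """Absorbance at radius r for monomer and n-mer in equilibrium."""
    k = 10.0 ** lg_k
    omega = rpm * 2.0 * math.pi / 60.0
    b = (omega ** 2 * (r ** 2 - r_bottom ** 2) * (1.0 - vbar * rho)
         / (2.0 * R_GAS * temperature))
    # Monomer distribution; the n-mer follows from the equilibrium at every r.
    c1 = monomer_at_bottom(a_bottom, ext_coeff, path, k, n) * math.exp(mass * b)
    return baseline + ext_coeff * path * (c1 + n * k * c1 ** n)
```

Because the n-mer concentration is tied to the monomer by K at every radius, only lgK needs to be shared between datasets, while baseline and Abottom remain specific to each one.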
Model-free: Non-interacting species (2
and 3 components).
This model can be used with interacting systems as well. The global parameters are M1, M2 and M3 (for a 3-component fit). The local parameters are baseline and the initial absorbances at the bottom, A1b, A2b and A3b. Vbar1, Vbar2 and Vbar3 should also be specified.
global fit parameters: M1, M2, M3;
local fit parameters: baseline, A1_b, A2_b, A3_b;
global non-variable parameters to be specified:
p (solvent density, ρ), Vbar(M1); Vbar(M2); Vbar(M3);
local non-variable parameters to be specified:
Rbottom
These models describe the situation in which hetero-oligomerisation occurs in the first step and the resulting complex then oligomerises itself.
nA_plus_B_K1_AnB_K2_AnB_m_fixedshift_Mult_Spectrum
The most general model.
nA + B <K1> AnB <K2> (AnB)m
The difference is that the global parameters are lgK1, lgK2=lgA, lgC1 and lgC2. This means that if multiple datasets are used, they should correspond to the same cell, i.e. multiwavelength data only.
A and AnB <K2> (AnB)m
lgC1=lg[A]
lgC2=lg[AnB]
lg[(AnB)m]=lgK2+m*lgC2
global fit:
lgK2
lgC1
lgC2
(2) Johnson, M. L. and Frasier, S. G. (1985). Nonlinear least-squares analysis. Methods in Enzymology 117, 301-342.
(3) Poget, S.F., Legge, D.B., Proctor, M.R., Butler, P.J.G., Bycroft, M., Williams, R.L. (1999) J. Mol. Biol. 290, 867-879.
(4) Wang, W.K., Bycroft, M., Foster, N.W., Buckle, A.M., Fersht, A.R., Chen, Y.W. (2000) Crystal structure of the C-terminal sterile α motif (SAM) domain of human p73α does not show homotypic interaction (In preparation).
© MRC (UK) Centre for Protein Engineering 2000