
How many cytochromes?

Cytochromes are membrane proteins that contain a heme prosthetic group similar to that in hemoglobin or myoglobin. They therefore have a large number of resonance forms, which show up in their elaborate light absorption spectra. Differences in heme structure result in differences in absorption spectrum as well as in reduction potential, i.e. the tendency to accept an electron. Because of their role in cell energetics, it is important to determine the number of different cytochromes in a living cell, and their respective absorption spectra. The usual approach is to extract them from the membranes, purify them through e.g. electrophoresis and determine their spectra in solution. This procedure might strongly change the cytochromes.

Below, we describe how partial modelling can be used to unravel the spectrum of each cytochrome from a table of extinction coefficients of an unknown mixture of membrane-bound cytochromes at different electric potentials and wavelengths. The modelling makes use of the fact that the reduced form of a cytochrome absorbs light several orders of magnitude better than the oxidized form. The model is then the Nernst equation for the potential with respect to the standard hydrogen electrode, $E$, as a function of the ratio of the concentration of oxidized cytochrome, $[\mathrm{ox}]$, and that of the reduced one, $[\mathrm{red}]$:
$$E = E_0 + \frac{RT}{F}\ln\frac{[\mathrm{ox}]}{[\mathrm{red}]} \tag{14.1}$$

where $E_0$ denotes the midpoint potential of the redox couple at pH 7.0 (in mV), $R$ the gas constant (8.314 J K$^{-1}$ mol$^{-1}$), $T$ the absolute temperature and $F$ the Faraday constant (96.485 J mV$^{-1}$ mol$^{-1}$). So, at 298 K, $RT/F$ = 25.68 mV. The total concentration $[\mathrm{ox}] + [\mathrm{red}] = c$ is constant, so $[\mathrm{ox}] = c - [\mathrm{red}]$. Rearrangement of (14.1) gives
$$[\mathrm{red}] = c\left(1 + \exp\left\{(E - E_0)\frac{F}{RT}\right\}\right)^{-1} \tag{14.2}$$
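
As a numerical check of (14.2), the following minimal Python sketch evaluates $RT/F$ and the fraction of a cytochrome that is in the reduced form as a function of the potential. The function name `reduced_fraction` and the example midpoint potential are illustrative assumptions, not taken from the text; a one-electron transfer is assumed.

```python
import math

R = 8.314      # gas constant, J K^-1 mol^-1
F = 96.485     # Faraday constant, J mV^-1 mol^-1
T = 298.0      # absolute temperature, K

RT_F = R * T / F   # in mV, about 25.68 at 298 K


def reduced_fraction(E, E0, rt_f=RT_F):
    """Fraction of a cytochrome that is reduced at potential E (mV),
    given its midpoint potential E0 (mV); rearranged Nernst equation."""
    return 1.0 / (1.0 + math.exp((E - E0) / rt_f))


if __name__ == "__main__":
    E0 = 100.0  # hypothetical midpoint potential, mV
    for E in (0.0, 100.0, 200.0):
        print(E, round(reduced_fraction(E, E0), 4))
```

At the midpoint potential exactly half of the cytochrome is reduced; well below it the cytochrome is almost fully reduced, well above it almost fully oxidized.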

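The absorption at a certain wavelength $\lambda_i$, at a potential $E_j$, is assumed to be just the sum of the separate absorptions of the different cytochromes, plus an independent error of measurement. The expected extinction coefficient is thus
$$\mu_{ij} = \sum_{k=1}^{n} a_{ik}\, p_{jk} \tag{14.3}$$
$$p_{jk} = \left(1 + \exp\left\{(E_j - E_{0k})\frac{F}{RT}\right\}\right)^{-1} \tag{14.4}$$

where $a_{ik}$ is the extinction of cytochrome $k$ at wavelength $\lambda_i$ when it is fully reduced, $E_{0k}$ its midpoint potential and $p_{jk}$, from (14.2), the fraction of it that is reduced at potential $E_j$. The extinction coefficients to be measured are taken to be
$$x_{ij} = \mu_{ij} + e_{ij} \tag{14.5}$$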

where the measurement errors $e_{ij}$ are assumed to be independently normally distributed with mean 0 and a common variance $\sigma^2$. We assume that there exist $n$ different cytochromes, where $n$ is a number chosen through a procedure still to be described. If we collect the coefficients in matrices, through $\boldsymbol{\mu} = \{\mu_{ij}\}$, $\mathbf{A} = \{a_{ik}\}$ and $\mathbf{P} = \{p_{jk}\}$, the expected extinction coefficients can be compactly written as
$$\boldsymbol{\mu} = \mathbf{A}\,\mathbf{P}^T \tag{14.6}$$
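
The matrix form (14.6) can be sketched in a few lines of Python. Everything numeric here (two cytochromes, their midpoint potentials and extinctions) is made up for illustration; the names `fraction_matrix` and `expected_extinction` are assumptions, and the arrangement of the table (wavelengths in rows, potentials in columns) is one possible choice.

```python
import math

RT_F = 25.68  # RT/F in mV at 298 K


def fraction_matrix(potentials, midpoints, rt_f=RT_F):
    """P[j][k]: fraction of cytochrome k that is reduced at potential j."""
    return [[1.0 / (1.0 + math.exp((E - E0) / rt_f)) for E0 in midpoints]
            for E in potentials]


def expected_extinction(A, P):
    """mu[i][j] = sum_k A[i][k] * P[j][k]: sum of the separate absorptions."""
    return [[sum(a * p for a, p in zip(row_a, row_p)) for row_p in P]
            for row_a in A]


if __name__ == "__main__":
    potentials = [-100.0, 0.0, 100.0, 200.0, 300.0]   # mV, hypothetical
    midpoints = [50.0, 250.0]                          # mV, hypothetical
    # A[i][k]: extinction of fully reduced cytochrome k at wavelength i
    A = [[1.0, 0.2],
         [0.5, 0.8],
         [0.1, 1.5]]
    P = fraction_matrix(potentials, midpoints)
    for row in expected_extinction(A, P):
        print([round(v, 3) for v in row])
```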

Through the introduction of a free parameter for the extinction of each cytochrome at each wavelength, we do not assume any functional form for the absorption spectra. We buy this flexibility with a substantial number of parameters. For a table of $n_\lambda n_E$ measurements (i.e. extinction coefficients at $n_\lambda$ wavelengths and $n_E$ potentials), we have $n(n_\lambda + 1) + 1$ parameters (i.e. $n$ parameters $E_{0k}$, $n\,n_\lambda$ parameters $a_{ik}$ and $\sigma^2$). We can only hope to estimate all these parameters if the number of measurements amply exceeds the number of parameters and if the range of potentials covers the range of sufficiently different midpoint potentials to some extent. We estimate the parameter values from the measurements on the basis of the maximum likelihood criterion. So, we have to maximize the ln likelihood
$$\ell(\mathbf{A}, \mathbf{E}_0, \sigma^2) = -\frac{n_\lambda n_E}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n_\lambda}\sum_{j=1}^{n_E}(x_{ij} - \mu_{ij})^2 \tag{14.7}$$

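as a function of the listed arguments. The values of $\mathbf{A}$, $\mathbf{E}_0$ and $\sigma^2$ for which the maximum of $\ell$ is reached, called $\hat{\mathbf{A}}$, $\hat{\mathbf{E}}_0$ and $\hat\sigma^2$, are the sought parameter values. We obtain them by solving
$$\frac{\partial \ell}{\partial a_{ik}} = \frac{1}{\hat\sigma^{2}}\sum_{j}(x_{ij} - \hat\mu_{ij})\,\hat p_{jk} = 0 \tag{14.8}$$
$$\frac{\partial \ell}{\partial E_{0k}} = \frac{F}{RT\,\hat\sigma^{2}}\sum_{i}\sum_{j}(x_{ij} - \hat\mu_{ij})\,\hat a_{ik}\,\hat p_{jk}(1 - \hat p_{jk}) = 0 \tag{14.9}$$
$$\frac{\partial \ell}{\partial \sigma^2} = -\frac{n_\lambda n_E}{2\hat\sigma^{2}} + \frac{1}{2\hat\sigma^{4}}\sum_{i}\sum_{j}(x_{ij} - \hat\mu_{ij})^2 = 0 \tag{14.10}$$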

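The caps on $\hat\mu_{ij}$ and $\hat p_{jk}$ indicate that $\hat a_{ik}$ and $\hat E_{0k}$ must be substituted in the defining equations (14.3) and (14.4). Equations (14.8) and (14.9) are also obtained using the least squares criterion for estimating the parameters. Because of (14.5), this model can be classified as a non-linear regression model. The solution of (14.8) is
$$\hat{\mathbf{A}} = \mathbf{X}\hat{\mathbf{P}}\left(\hat{\mathbf{P}}^T\hat{\mathbf{P}}\right)^{-1} \tag{14.11}$$

where $\mathbf{X} = \{x_{ij}\}$ is the matrix of measurements. The solution of (14.10) is
$$\hat\sigma^2 = \frac{1}{n_\lambda n_E}\sum_{i}\sum_{j}(x_{ij} - \hat\mu_{ij})^2 \tag{14.12}$$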
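
For given midpoint potentials, the explicit solutions (14.11) and (14.12) amount to an ordinary linear least squares fit per wavelength, followed by the mean squared residual. The sketch below is plain Python with a small Gaussian elimination routine, so that it runs without extra libraries; the function names and the synthetic data are assumptions for illustration, and for real data the least squares solver of a numerical library would be a more robust choice.

```python
def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (aug[r][n] - s) / aug[r][r]
    return y


def estimate_extinctions(X, P):
    """Least squares estimate of the extinctions (rows: wavelengths,
    columns: cytochromes) from measurements X[i][j] and fractions P[j][k]."""
    n = len(P[0])
    # normal matrix P^T P, shared by all wavelengths
    PtP = [[sum(row[k] * row[l] for row in P) for l in range(n)]
           for k in range(n)]
    A_hat = []
    for x_row in X:
        Ptx = [sum(P[j][k] * x_row[j] for j in range(len(P)))
               for k in range(n)]
        A_hat.append(solve(PtP, Ptx))
    return A_hat


def estimate_variance(X, A_hat, P):
    """Maximum likelihood variance estimate: the mean squared residual."""
    total, count = 0.0, 0
    for x_row, a_row in zip(X, A_hat):
        for j, p_row in enumerate(P):
            mu = sum(a * p for a, p in zip(a_row, p_row))
            total += (x_row[j] - mu) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    P = [[0.9, 0.99], [0.5, 0.9], [0.1, 0.5], [0.01, 0.1]]  # hypothetical
    A_true = [[1.0, 0.2], [0.3, 1.5]]                       # hypothetical
    X = [[sum(a * p for a, p in zip(a_row, p_row)) for p_row in P]
         for a_row in A_true]          # noise-free synthetic "measurements"
    A_hat = estimate_extinctions(X, P)
    print(A_hat, estimate_variance(X, A_hat, P))
```

Solving the normal equations once per wavelength is equivalent to the matrix expression, because the same normal matrix applies to every row of the table.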

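The solution of (14.9) is less easy to obtain. The leading factor can be omitted, of course, but that is all we can do to simplify (14.9). We have to solve it numerically. We define
$$f_k(\mathbf{E}_0) = \sum_{i}\sum_{j}(x_{ij} - \hat\mu_{ij})\,\hat a_{ik}\,\hat p_{jk}(1 - \hat p_{jk}), \qquad k = 1, \ldots, n \tag{14.13}$$

where we substitute (14.11) for the values $\hat a_{ik}$ (which also occur in $\hat\mu_{ij}$). We then find a solution of $\mathbf{f}(\hat{\mathbf{E}}_0) = \mathbf{0}$, with $\mathbf{f} = (f_1, \ldots, f_n)^T$, through the Newton Raphson procedure, for $m = 0, 1, \ldots$
$$\mathbf{E}_0^{(m+1)} = \mathbf{E}_0^{(m)} - \left(\mathbf{f}'(\mathbf{E}_0^{(m)})\right)^{-1}\mathbf{f}(\mathbf{E}_0^{(m)}) \tag{14.14}$$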

where the sequence of vectors $\mathbf{E}_0^{(m)}$ converges to the sought vector $\hat{\mathbf{E}}_0$ after an appropriate choice for $\mathbf{E}_0^{(0)}$. The expression for the derivative of $\mathbf{f}$ with respect to $\mathbf{E}_0$, denoted by $\mathbf{f}'$ in (14.14), is extremely massive. This is one obvious place where a numerical evaluation makes life bearable. So we take
$$\frac{\partial f_k}{\partial E_{0l}} \simeq \frac{f_k(\mathbf{E}_0 + d\,\mathbf{1}_l) - f_k(\mathbf{E}_0)}{d}$$

for some small chosen value for $d$, where $\mathbf{1}_l$ denotes the vector with element $l$ equal to 1 and all other elements 0. Note that for each iteration in (14.14), we have to calculate $\hat{\mathbf{A}}$ in (14.11) $n + 1$ times to obtain $\mathbf{f}$ and $\mathbf{f}'$. The size of required computer memory and time depends on $n$, $n_\lambda$ and $n_E$. Because we were able to get explicit expressions for most parameters, i.e. $n\,n_\lambda + 1$, only $n$ parameters have to be obtained numerically. In practice this means that, provided that $n$ is not too large, the calculations do not give rise to serious problems.

We now discuss the way to determine $n$. When we increase $n$, we rapidly introduce more parameters, which results in an increasingly better fit, irrespective of the real number of cytochromes. This is reflected in the value of the ln likelihood function at the maximum likelihood estimates, which is given by
$$\ell_n = -\frac{n_\lambda n_E}{2}\left(1 + \ln(2\pi\hat\sigma_n^2)\right) \tag{14.15}$$

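where the index $n$ is attached to $\hat\sigma^2$ to indicate that it depends on $n$. In order to decide on the value for $n$, we study the increase in fit through the likelihood ratio statistic
$$\Lambda_n = 2(\ell_{n+1} - \ell_n) = n_\lambda n_E \ln\frac{\hat\sigma_n^2}{\hat\sigma_{n+1}^2} \tag{14.16}$$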

Here, again, the index $n$ is attached to $\Lambda$ to indicate that it depends on $n$. Likelihood ratio theory tells us that the proper value for $n$ is found when, for the first time for increasing $n$, $\Lambda_n$ is not unlikely to represent a random trial from a $\chi^2$ density with parameter $n_\lambda + 1$, the number of parameters added by one extra cytochrome. This is decided when $\Lambda_n$ is less than the upper $\alpha$-quantile of the chi-square density with parameter $n_\lambda + 1$, at a probability of an error of the first kind of $\alpha$. The strict application of the likelihood ratio theory is a bit problematic in this case, because the number of parameters increases with the number of wavelengths. It does not increase with the number of potentials, however.

After having determined $n$ this way, we can test the model through the residuals $x_{ij} - \hat\mu_{ij}$, which should represent random (independent) trials from a normal density. If the model fails the test, we could try to improve it by e.g. assuming that the error of measurement is proportional to the mean. We then arrive at a slightly more complicated likelihood function, but no new estimation problems arise.
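
The selection rule for the number of cytochromes can be sketched as follows. The sketch assumes that the model has already been fitted for successive numbers of cytochromes and that the resulting variance estimates are available. The numbers in the example are invented, the function names are hypothetical, and the critical value is supplied by the user; here it is the tabulated 0.95 quantile of the chi-square distribution for 10 degrees of freedom, 18.307, taking one extinction per wavelength plus one midpoint potential as the number of parameters added per cytochrome.

```python
import math


def ln_likelihood_at_maximum(var_hat, n_obs):
    """Maximised ln likelihood of a normal model with n_obs measurements."""
    return -0.5 * n_obs * (1.0 + math.log(2.0 * math.pi * var_hat))


def choose_number(var_hats, n_obs, critical_value):
    """var_hats[m] is the variance estimate for m + 1 cytochromes.
    Return the first number n for which going to n + 1 cytochromes does not
    raise twice the ln likelihood by at least critical_value, or None."""
    for m in range(len(var_hats) - 1):
        lr = 2.0 * (ln_likelihood_at_maximum(var_hats[m + 1], n_obs)
                    - ln_likelihood_at_maximum(var_hats[m], n_obs))
        if lr < critical_value:
            return m + 1
    return None


if __name__ == "__main__":
    # invented variance estimates for 1, 2, ..., 5 cytochromes;
    # 9 wavelengths and 20 potentials give 180 measurements and
    # 9 + 1 = 10 extra parameters per added cytochrome
    var_hats = [4.0e-3, 1.5e-3, 6.0e-4, 2.0e-4, 1.9e-4]
    print(choose_number(var_hats, n_obs=180, critical_value=18.307))
```

Returning `None` when no step falls below the critical value signals that more cytochromes would have to be tried before a decision can be made.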

Figure: The estimated absorption spectra of the cytochromes of E. coli, assuming that it is a mixture of 1, 2, 3, 4 or 5 different cytochromes. The midpoint potentials are given.

Figure: The plot of the supremum of the ln likelihood function, as a function of the number of different cytochromes. We should decide that there are 4 different cytochromes at the significance level $\alpha$ used.

Figures 14.1 and 14.2 illustrate the application of the presented theory to Escherichia coli data from [Wiel86]. This bacterium appears to possess 4 cytochromes.
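
For completeness, the numerical step of the estimation, i.e. the Newton Raphson iteration (14.14) with a forward-difference approximation of the derivative, can be sketched generically as follows. This is a minimal illustration in plain Python, not the original implementation: `f` stands for any function that returns the vector defined in (14.13), the toy system in the example is made up, and the step `d`, the tolerance and the iteration cap are assumed defaults.

```python
def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (aug[r][n] - s) / aug[r][r]
    return y


def forward_jacobian(f, x, fx, d):
    """J[k][l] ~ (f_k(x + d e_l) - f_k(x)) / d; needs len(x) extra calls of f."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for l in range(n):
        shifted = list(x)
        shifted[l] += d
        f_shifted = f(shifted)
        for k in range(n):
            J[k][l] = (f_shifted[k] - fx[k]) / d
    return J


def newton_raphson(f, x0, d=1e-6, tol=1e-10, max_iter=50):
    """Find x with f(x) = 0, starting from the initial guess x0."""
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            return x
        step = solve(forward_jacobian(f, x, fx, d), fx)
        x = [xi - si for xi, si in zip(x, step)]
    raise RuntimeError("Newton Raphson did not converge")


if __name__ == "__main__":
    # toy system standing in for (14.13): a circle intersected with a line
    def toy(x):
        return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]

    print(newton_raphson(toy, [1.0, 2.0]))
```

Each iteration calls `f` once at the current point and once per shifted coordinate, which matches the count of evaluations per iteration mentioned in the text. As with any Newton-type method, convergence depends on a reasonable starting vector.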
Theoretische Biologie 2002-05-01