The Silicon Cell: computing the living cell

 

 

 

1. A website with running parts of silicon cells

managed by Jacky L. Snoep (Triple J Stellenbosch and Integrative Bioinformatics, Amsterdam)

2. An international research programme nucleated by:

- BioCentrum Amsterdam (SILS of the University of Amsterdam [Roel van Driel; www.science.uva.nl/research/sils/research/str/] and IMBS of the Free University (Vrije Universiteit) [Hans V. Westerhoff; http://www.bio.vu.nl/hwconf ; coordinator])

- CWI (Centre for Mathematics and Informatics [Jan G. Verwer; www.cwi.nl/projects/by-name/lifesciences])

and the

- Institute for Informatics of the University of Amsterdam (Peter Sloot; www.science.uva.nl/research/scs/).

 The program has various components. Notably these include both experiments determining the behavior of molecules in the living cell and calculations of cell behavior on the basis of molecular data.

 

The calculation part:

a1. Aim

The long-term goal of the Silicon Cell (SiC) Consortium is the computation of Life at the cellular level on the basis of the complete genomic, transcriptomic, proteomic, metabolomic and cell-physiomic information that will become available in the forthcoming years. Realizing this ambition will take more than a decade. This application concentrates on three major challenges, i.e. networks, space and time, and deals with the systematic handling of the relevant data and results.

 

a2. Key objectives for the present period

(i) Computational models of catabolism, signal transduction, gene-expression regulation, coupling between supramolecular structures and fluxes, and biochemical cycling.

(ii) Model integration to calculate system properties for two real cells (E. coli and S. cerevisiae).

(iii) Demonstration of the cellular bioinformatics approach: calculating without fitting.

(iv) Methodology for modularisation to accurate mesoscopic descriptions.

(v) Visualisation, systematic data access and a www resource for two real living cells.

 

b. Approach

We shall focus on three different, but interconnected dimensions of cell functioning, i.e. (i) the 'chemical and information dimension': networks of biochemical reactions and their regulation, (ii) space: gradients and dynamic structures in signal transduction and gene expression (chromatin), and (iii) biological time: coherent glycolytic and cell-cycle oscillations. Our work will build on already available experimental and theoretical expertise. The specific cases connect to the glucose entry into S. cerevisiae and E. coli, subsequent carbon and energy metabolism, up to their coupling to examples of signal transduction, gene expression regulation and cell cycle. This work will be coupled to biological experiments carried out (a) in the research school BioCentrum Amsterdam (not part of this application, and for which funding is already available), and (b) world-wide in collaborating groups and through the open literature.

 

c. Elements of innovation

Unlike most traditional modelling methods, this programme will always start from real experimental data that stem from molecular biology, biochemistry, physics and chemistry. Rather than aiming at an understanding of principles of function (as would be done by theoretical biology or physics) we shall 'merely' compute the implications of the molecular data for system behavior.

The present program is among the few that integrate all relevant information from various scientific fields (e.g. molecular biology, biochemistry, and physics) into a single model for cell function.

Until now, bioinformatic approaches to the dynamics of cell function have remained 'limited' to the categorization of all enzymes, to the delimitation of metabolism by flux analysis, to metabolic pathway identification, and to the computational biochemistry of isolated metabolic pathways at steady state. For the first time, metabolic pathways, their regulation, signal transduction and structure-flux relations will be addressed in a single context, using computational biochemistry, i.e. calculating dynamic concentrations and process rates from molecular data.
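As a minimal sketch of what 'calculating dynamic concentrations and process rates from molecular data' can look like, the Python fragment below integrates a hypothetical two-step pathway S -> X -> P with irreversible Michaelis-Menten kinetics. The pathway, the function names and every parameter value are illustrative assumptions, not data from the programme; a real model would use measured enzyme kinetics and a dedicated ODE solver.

```python
def michaelis_menten(vmax, km, s):
    """Irreversible Michaelis-Menten rate: v = Vmax * s / (Km + s)."""
    return vmax * s / (km + s)


def simulate_pathway(s_ext=5.0, x0=0.0, t_end=50.0, dt=0.01,
                     vmax1=1.0, km1=1.0, vmax2=2.0, km2=0.5):
    """Integrate dX/dt = v1(S) - v2(X) with fixed-step RK4.

    S is clamped at s_ext (e.g. an external substrate); X is the intermediate.
    Every default value is a hypothetical placeholder, not measured data.
    Returns (x, v1, v2) at time t_end.
    """
    v1 = michaelis_menten(vmax1, km1, s_ext)  # constant, because S is clamped

    def dxdt(x):
        return v1 - michaelis_menten(vmax2, km2, x)

    x = x0
    for _ in range(int(round(t_end / dt))):
        k1 = dxdt(x)
        k2 = dxdt(x + 0.5 * dt * k1)
        k3 = dxdt(x + 0.5 * dt * k2)
        k4 = dxdt(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x, v1, michaelis_menten(vmax2, km2, x)


if __name__ == "__main__":
    x, v1, v2 = simulate_pathway()
    print(f"X = {x:.4f}, v1 = {v1:.4f}, v2 = {v2:.4f}")
```

In this sketch nothing is fitted: the concentration of X and both rates follow from the kinetic parameters alone, and the intermediate relaxes to the steady state at which v1 equals v2.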

 

d. Relevance for Biomolecular Informatics

The research program calculates from molecular biology (and physics) to cell function. In doing so it will generate knowledge and insight from large amounts of information. It integrates information from all relevant sources, from DNA through gene expression to metabolomics, including kinetic and physico-chemical data. It calculates expected dynamic structures, functions and dynamics at the supramolecular level. It will result in two interactive, computationally 'live' replicas of significant parts of two living cells (one prokaryotic, one eukaryotic), forming an interactive system accessible to outside researchers for mining of the models. The Silicon Cell will become an international repository of all relevant molecular data. It will focus on regulatory and other networks of gene expression, signal transduction and complex biochemical pathways. The program will develop computational technologies for modeling cell processes and for integrating all the available molecular information into the model.

- cellular bioinformatics

- computational biochemistry

- the living cell

- metabolic control analysis, regulation

- model validation and calibration

- dynamic structures

- cellular control hierarchies

- mesoscopic/particle based

- partial differential equations in living cells

- functional modules

The experimental part:

 

The experimental part consists of many projects that are running in the BioCentrum Amsterdam. In addition a number of grant applications have been submitted:

1. To IOP-Genomics

2. FANCY Metabolomics (to NWO-Genomics):

a1. Overall aim

The availability of complete genomes has identified many genes of which the function is unknown, uncertain, or unproven. In many cases this is because the phenotype of these genes is absent, weak, or indirect; we call these the (silent and) whispering genes. Much of the ultimate function resides at the flux and metabolite concentration ('Metabolome') level. It is the aim of this proposal to inspect the functioning of large numbers of these genes systematically at the level of metabolism.

a2. Key objectives

- Elaboration and application of our FANCY (1&2) approach so as to include whispering genes

[FANCY: Functional Analysis through Coresponses of Yeast]

- Elaboration and application of our Metabolic Regulation Analysis approach so as to identify sites and modes of regulation at the metabolome level

- Elaboration and application of differential mass spectrometry of complete metabolomes in a finger printing mode

- Elaboration and application of mass spectrometry/bioinformatics/enzyme assays approach to the tracing of sites of metabolome regulation

- Estimate of the number of independent regulatory modes of the yeast metabolome

- Identification of the mode of regulation by a number of insufficiently known yeast genes

- Identification of the mode of regulation by a number of unknown (silent and whispering) yeast genes

- Quantitative determination for a sizeable number of yeast reactions of the extents to which they are regulated by gene expression, by signal transduction and metabolically

b. Approach

This project proposal aims to combine cutting edge mass spectrometry with a recent development in metabolic genomics (FANCY), and a recent development in the field of quantitative metabolic regulation.

b1. FANCY-2

FANCY-1 noted that genes with a silent phenotype at the flux level (no change in flux) should have altered metabolite concentrations, and therefore proposed to classify mutants on the basis of the metabolome. It was only validated for mutants in phosphofructokinase-2, by means of enzymatic metabolite assays and NMR. Here we shall elaborate the method for mass spectrometry in metabolomics, apply it to the analysis of genes with truly unknown function, and use it to address questions concerning the regulatory complexity of the metabolome. In contrast to NMR, mass spectrometry will allow us to identify and quantify specific metabolites. In addition we shall add a related method, called FANCY-2, which addresses the cases where there is a change in flux and also compares changes in metabolite concentrations and flux. This will identify which of the enzymes in a pathway is affected by the mutation and how. Mutants will be grouped according to co-response pattern and then further analysed by zooming in, identifying metabolites, and performing more directed metabolite assays.
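The co-response idea can be sketched in a few lines of Python. This fragment assumes that the co-response of two quantities is taken as the ratio of their logarithmic changes between mutant and wild type; the function names and the example numbers are hypothetical, and the same function would serve for concentration-concentration, flux-concentration or flux-flux pairs.

```python
import math


def log_change(mutant, wild_type):
    """Relative change expressed as ln(mutant / wild type)."""
    return math.log(mutant / wild_type)


def coresponse(x_mut, x_wt, y_mut, y_wt):
    """Co-response of x with respect to y: dln(x) / dln(y).

    Assumption for this sketch: finite log-changes between mutant and
    wild type stand in for the infinitesimal changes of the theory.
    """
    dlny = log_change(y_mut, y_wt)
    if dlny == 0.0:
        raise ValueError("reference quantity y did not change; co-response undefined")
    return log_change(x_mut, x_wt) / dlny


if __name__ == "__main__":
    # Hypothetical metabolite levels (arbitrary units): x doubles, y quadruples.
    print(coresponse(x_mut=2.0, x_wt=1.0, y_mut=4.0, y_wt=1.0))
```

Mutants whose co-response values agree across many metabolite pairs would then be candidates for the same group.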

b2. Differential Regulation

The metabolic regulation approach recently developed by Ter Kuile and Westerhoff [2001] compares changes in flux through an enzyme to the change in enzyme concentration. Therefrom it deduces the extent to which the change in flux is regulated by changes of protein concentration (i.e. through gene expression regulation) or otherwise, e.g. metabolically. Here the approach is extended to distinguish between metabolic and covalent modification regulation.

Also, where the earlier approach focused on regulation by pathway substrate, here the concepts are broadened to regulation by any other gene product in the cell, and ultimately by all elements of the genome. The approach developed should allow one to determine for any process to what extent it is regulated by any gene product and to which relative extents that regulation runs through transcription regulation, covalent modification, or metabolic regulation.

b3. Mass spectrometry

A novel method for metabolome analysis will be elaborated and applied, using reversed phase HPLC coupled with electrospray ionization (quadrupole) time-of-flight mass spectrometry (LC-MS), in combination with derivatization of metabolites and use of stable isotopic markers, allowing quantitative determination. We will first obtain fingerprints of the entire metabolome to be used directly for our FANCY analysis without previous identification of peaks in the mass spectrum. Then interesting metabolites will be identified and intracellular concentrations will be determined for our Metabolic Regulation Analysis approach.

b4. The combined strategy

For each mutant the change in metabolic fluxes relative to wild type will be determined by a flux analysis. Metabolome information will be collected by mass spectrometry. Pattern analysis of the flux and concentration responses will be performed using Principal Component Analysis (PCA) and Discriminant Function Analysis (DFA). Here the extension with respect to FANCY-1 is that not only the concentration coresponses are used to characterize the mutants but also the flux-concentration coresponses and the flux-flux coresponses. This will generalize the method to mutants that do affect fluxes, and will allow us to examine the degree of modularity of the metabolic functioning. The dimensionality of each pattern will be established. Patterns of unknown mutants will be compared to patterns of known mutants.
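The pattern-analysis step can be illustrated with a small, library-free sketch of PCA. It assumes a data matrix with one row per mutant and one column per measured response (for example the log-change of a mass-spectral peak or of a flux), and it extracts only the first principal component by power iteration. The function name and the data layout are hypothetical; in practice a numerical library would be used, and DFA is not shown.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores, explained_fraction) for the first PC.

    rows: list of equal-length lists, one per mutant (hypothetical layout).
    The sign of the component is arbitrary, as in any PCA.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix of the response variables.
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    # Power iteration: repeated multiplication converges to the dominant
    # eigenvector, provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(m))
                     for i in range(m))
    total = sum(cov[i][i] for i in range(m))
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
    explained = eigenvalue / total if total > 0.0 else 0.0
    return v, scores, explained


if __name__ == "__main__":
    # Hypothetical log-changes of three peaks in four mutants.
    data = [[0.9, 2.1, 0.1], [-1.0, -1.9, 0.0], [2.1, 3.8, -0.1], [-2.0, -4.0, 0.1]]
    loadings, scores, explained = first_principal_component(data)
    print(loadings, scores, explained)
```

Mutants with similar scores respond along the same pattern, and the fraction of variance explained by successive components gives a rough indication of the dimensionality of that pattern.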

In pathways of interest accessible enzymes will be selected around which the following approach will be used. The change in flux through that enzyme will be decomposed in terms of the change in metabolites and the change in enzyme concentrations, using the following principle:

dlnv = dlne + dlna + dlnf

Here dlnv represents the small (but it may also be large, see below) change of the in vivo flux through the enzyme of interest, dlne represents the change of protein concentration, dlna the change of specific activity (due to protein modification) and dlnf represents the change of catalytic rate as dependent on the metabolic influences on the enzyme. The equation states that the change of the logarithm of the flux through the step should be equal to the change of the logarithm of the enzyme concentration, plus the change of the logarithm of the specific enzyme activity, plus the change of the logarithm of the metabolic regulation of the enzyme rate. Change of specific enzyme activity is defined as the change resulting from a persistent modification of the enzymes such as phosphorylation. By measuring protein concentration the first term will be determined. By measuring enzyme activity in an extract the second term will be measured. The change in the metabolic term will be determined by subtraction of the other terms. The implications of the metabolic regulation term will be further elaborated by actually measuring the changes in the concentrations of the metabolites that affect the enzyme. In total this should establish the regulatory effect of the mutation that is being analysed on the particular aspect of cell function.

If the flux of the pathway is affected, then the above equation can be divided by the change in flux and the metabolic regulation theorem of Ter Kuile and Westerhoff appears:

1 = dlne/dlnv + dlna/dlnv + dlnf/dlnv

When dlne/dlnv and dlna/dlnv are measured, dlnf/dlnv (the metabolic regulation) can be calculated from this equation.

If the change in flux is zero, then the approach highlights that the change in metabolite concentrations should be large around enzymes of which the concentration or the specific activity is affected by the mutation (the basis of the FANCY approach and coresponse analysis).
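The decomposition above can be sketched as a short calculation. The fragment assumes that flux, protein concentration and specific activity have each been measured in wild type and mutant, and it obtains the metabolic term by subtraction, as described in the text. The function and variable names are hypothetical, and the example numbers are invented for illustration.

```python
import math


def regulation_terms(v_wt, v_mut, e_wt, e_mut, a_wt, a_mut):
    """Split the change in flux into its three contributions.

    Returns (dlne/dlnv, dlna/dlnv, dlnf/dlnv); the three values sum to 1.
    v: in vivo flux, e: protein concentration, a: specific activity.
    """
    dlnv = math.log(v_mut / v_wt)
    if dlnv == 0.0:
        # Zero flux change: the ratios are undefined; this is the
        # silent-gene case that the co-response analysis addresses.
        raise ValueError("flux did not change; regulation ratios undefined")
    dlne = math.log(e_mut / e_wt)
    dlna = math.log(a_mut / a_wt)
    gene_expression = dlne / dlnv
    modification = dlna / dlnv
    metabolic = 1.0 - gene_expression - modification  # by subtraction
    return gene_expression, modification, metabolic


if __name__ == "__main__":
    # Hypothetical case: flux quadruples, enzyme level doubles,
    # specific activity is unchanged.
    print(regulation_terms(v_wt=1.0, v_mut=4.0, e_wt=1.0, e_mut=2.0,
                           a_wt=1.0, a_mut=1.0))
```

Using logarithms of ratios rather than small differences keeps the sum rule exact for large changes as well, provided the rate is the product of the three factors.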

c. Elements of innovation

The intracellular characterization of cell functioning is hampered by the fact that there are several hundreds of metabolites (the metabolome), that may all respond directly or indirectly to functional changes. The approach elaborated here has the following elements of innovation:

  1. It achieves a characterization of how a gene affects the metabolome through a mass spectrum
  2. It achieves emphasis on the shift in metabolome caused by an alteration in the activity of whispering genes, through differential derivatization based on butylation with H and D of the wild-type and the mutant sample
  3. It can reach appreciable understanding without identifying the peaks in the spectrum (pattern analysis; finger printing)
  4. In a second phase and only for genes that have proven to be interesting, the approach can zoom in and identify the actual metabolites at play
  5. Identification of both silent and whispering genes by coresponse analysis of both fluxes and concentrations (FANCY-2)
  6. Measurement of how much regulation of a metabolic process runs through gene expression, how much through metabolic regulation, and how much through direct signal transduction regulation
  7. Application of state-of-the-art Metabolic Control Analysis methods to functional genomics
  8. Insight into how many ways yeast has of responding to internal and external perturbations (i.e. of regulating itself)
  9. Avenue towards the complete identification of the metabolome of yeast

d1. Relevance for NWO GENOMICS program

The NWO GENOMICS program emphasizes the functional aspect of genomics, i.e. the understanding of how the genome leads to the functionality of the organism. A significant aspect of functionality resides at the level of material fluxes. An ultimate example is the material flux 'growth rate', which constitutes the coherent synthesis of a set of biomolecules that constitutes a new living organism. In fact each of the biosynthetic, but also each of the catabolic fluxes constitutes a functionality of the genome. Characteristically, a number of the expressed genes of the organism, and ultimately all of them, may contribute to each of these fluxes to some extent. The fluxes run through individual (enzyme-catalyzed) reactions, which are determined by gene expression (through the concentrations of the catalyzing enzyme) and by metabolites around the enzyme (through kinetics). By addressing this issue, this research program is directed at the core of the NWO GENOMICS program, i.e. at how proteome and metabolome together lead to the functioning living cell. To keep things manageable we here study the unicellular microorganism S. cerevisiae, which has been characterized well and for which many of the molecular genetic tools and constructs are available to us (as ex-participants in EUROFAN).

d2. Contribution to national genomics infrastructure

  1. The BioCentrum Amsterdam is nationally unique in the development and application of quantitative analyses of metabolic and hierarchical control and regulation. It has also participated in the EUROFAN program of functional analysis of the yeast genome and is as such the national exponent of an international group of laboratories.
  2. The mass spectrometry set-up at the BioCentrum Amsterdam, although not unique in the Netherlands, is one of two with a direct activity in Metabolomics. With the Delft group (Heijnen) that has similar possibilities, the Amsterdam Group (Westerhoff) collaborates in many projects (STW [granted], IOP-Genomics [proposal]). If this proposal is granted the research will certainly proceed in close collaboration with the Delft group. The approach of the Delft group is complementary to ours, since Heijnen et al. aim at identifying and quantifying specific metabolites, while we characterize metabolome patterns and will only in a later phase of the project identify and quantify the peaks. We also collaborate with the group of Dr. van Gennip (AMC) who uses butylation to detect metabolites in blood samples of patients by LC/MS.
  3. A central research line of the BioCentrum Amsterdam is its 'Silicon Cell' program, which has the long-term goal of the calculation of the living cell. At present ICES-KIS, VUA, UvA, and CWI support this program, and NWO support for the mathematical side has been applied for (NWO-Bioinformatics).

differential metabolomics, differential mass spectrometry, regulatory patterns, control and regulation analysis, integrative bioinformatics, functional analysis of the yeast genome, silent and whispering genes, hierarchies in control, functional genomics