Hunter21jul07

Dear Dan, James, Jim, Maciek, et al,

Here are my views on the email discussion below on behalf of the CellML team - I'll use 'reply' then later add them to the wiki. I've added Poul Nielsen & Matt Halstead to the CC list in case they want to correct me or add anything (Poul is the primary architect of CellML and Matt is leading the Auckland effort on linking CellML models to the biology via the metadata).

CellML (similarly SBML) provides a standard for encoding the model structure, mathematics (using MathML), units and metadata for a model. It does not provide a 'standard' for encoding the biology - that is far harder and relies substantially on annotating the models with metadata that uses, for example, BioPAX and bio-ontologies like GO. Currently the CellML-aware authoring/simulation tools like PCEnv (the open source software being developed in Auckland primarily by Andrew Miller) and JSim (the U Washington code) do not provide environments that allow the modeller to apply biophysical constraints such as charge conservation or mass conservation as the model is built. Clearly this would be a big advantage and we certainly aim to do this with PCEnv, as I suspect Jim will with JSim. CellML 1.1 allows complex models to be constructed from imported components - which will be increasingly important as we need to link together various types of cell model (electrophysiology, metabolism, signal transduction, gene regulation, etc). Another advantage of having standards like CellML & SBML is that it is possible to write code that processes the models automatically to speed up execution through lookup tables, partial evaluation, etc. It is also possible to create software to generate pathway diagrams etc (see below). These and other developments that build on the CellML standard and associated tools are currently underway in Auckland, Oxford, Osaka, Singapore and Sydney (see www.cellml.org/tools and also David (Andre) Nickerson's http://cellml.sourceforge.net/ site).

On the issue of curation: Of the 270 models currently in the CellML repository (www.cellml.org/models), 67 are curated to level 1 and 15 to level 2 (see www.cellml.org/repository-info/info for our definition of curation). We do appreciate that this is much less than desirable but it is simply a question of resources - we now have a full-time curator (James Lawson) who is doing a great job, but it is very time-consuming work to figure out all the missing information in published models and to contact authors to help fill in the details (we are finding that most authors are very responsive to these requests). Others helping greatly with this effort are Penny Noble (in Oxford) and Catherine Lloyd. If anyone else feels like helping with this curation effort we would be delighted! With the recent major injection of funds through the EU ICT funding agency in Brussels, there is now the prospect of a significant ramping up of the CellML contribution to model sharing for the Europhysiome project in 2008.

On the issue of biologically valid models: I think we need to develop a number of interfaces, built on CellML & PCEnv, that are tailored to the various modeling communities because the requirements, such as the types of biophysical constraint, are different for each. It has been very helpful talking with Dan about his metabolic modeling requirements and the way he would like to have the equations generated automatically using databases of parameters for the set of species participating in biochemical reactions (including various cations such as K+, Mg2+ and H+ bound to metabolic substrates), the associated pKs, and the Gibbs free energy of formation for each species. Matt Halstead in Auckland is working on the metadata annotation schemes and workflows that will facilitate this, and I am pushing to get these interfaces developed as soon as possible to build on the great start that has been made by Andrew Miller with PCEnv. Sarala Dissanayake (supervised by Poul Nielsen and Matt Halstead) is developing the visualisation standards and processes that, via the metadata model annotation, will allow the software to generate the pathway diagrams. Andrew is currently developing an API for using SVG (the 2D XML graphics standard) with CellML & PCEnv to provide better visualisation of these pathway models.

A few responses to comments below:

Maciek: "Since both CellML and SBML can be used to code e.g. models which have imbalanced units, maybe it would make sense to include options in those two ML for users to specify units (perhaps this feature already exists) and then models with correct unit balance would be curated."

PJH: Yes, in CellML you can specify equation parameters in whatever units you want. JSim has built-in checks for unit consistency and PCEnv will have them soon.

Jim: "My point is NEITHER CellML or SBML are acceptable standards for dissemination. They are simply repositories of untested pieces of code, for the most part and do not set a standard."

PJH: CellML & SBML are markup languages, not repositories of code. The model repositories are separate. Code is generated via the APIs (the C code available on the CellML model repository website, for example, is generated via the API). I think these MLs do set standards for model sharing - but they are just a beginning and we need to keep the momentum up to develop ways of dealing with biophysical correctness as discussed above.

Dan: "I suspect that Jsim is detecting (at least in part) scientific mistakes such as unit imbalance. This gets back to an old discussion that we have had with the CellML people. One thing that I took away from that discussion is that the ML developers are not going to worry about scientific standards at the level of the ML's. Maybe this makes sense if the applications tools (i.e., Jsim) take up the burden of introducing and maintaining scientific standards. Make sense? Or is this too simple-minded of a way of thinking about the issue?"

PJH: I'm not sure what you mean by 'scientific standards' here - I would have thought we have gone a long way towards establishing scientific standards by having MLs with their associated APIs for the models. I'm guessing that you are referring here to the issue of biophysical constraints - in which case, as you know, we are working towards this.

Best wishes, Peter
