THEME 4 - Theory-Driven Breakout - Digital Twin

Session Lead: Bill Cannon

IMAG Moderators: Virginia Pasour (ARL), Xujing Wang (NIDDK)

 

Breakout Session Notes:

  • Introductions (name and interest):
    • Session 1:
    • current status: today's talks laid out some rules based on symmetries and structures in living systems; however, most often we do not have enough understanding to formulate the rules
    • efficiency of existing tools?
    • is there a need for biology-inspired neural network architectures?
    • a useful application of NN (neural nets): what more data need to be acquired? NNs typically perform well on data that follow the training distribution, but poorly if the distribution changes; how to train adequately?
    • physics-based rules are complex and high level, while clinical needs call for computationally intensive algorithms
    • biology is too complex compared to systems in physics
    • interaction between simulation and NN
    • why mechanistic rules:  (1) data limited (2) explainability (3) overfitting
    • try to simulate the biological systems for now
    • NN w/o reinforcement learning to go from data to a mathematical description;  yesterday's talks included the formulation of ODEs from ML
    • Bill:  ML as a library call is perhaps the right way for now
    • mechanical engineering (ME):  how do people use control?  other forms of NN:  NNs are sensitive to small perturbations, a place where theory can help
    • NNs are not good at temporal phenomena
    • inductive bias, such as in convolutional NN;  
    • prior knowledge can sometimes make performance worse. how?  example: phos protein prediction; prior: abundance is predictive of activity; result: no. reason: the correlation holds at a high (statistical) level, but not at the individual process level. could physics help?
    • LANL projects: adding many basic theories to the NN by penalizing any choice that violates the theories
    • maybe possibility testing, falsifying versus predicting, given the characteristics of biological systems.  perhaps ML can be better utilized to identify what's wrong
    • Bill: relevance  i
    • Gary An:  if funding agencies issue an FOA, it would be wrong to call for positive predictions;   problems for
    • Emma: open source, (expert-)curated data sets; more diverse data sets
    • John:  curation:   
    • need computational scientists more involved in data generation and annotation
    • review panel bias:  lab scientists proposing computational approaches review better than computational scientists proposing to design an experiment
    • physics-guided generation of synthetic data
    • theory can help define boundaries for experimental designs
    • identify/design robust control mechanisms across the board
    • simulated data + real data make ML perform better; transfer learning, multi-task learning
    • a few recent papers on synthetic and real data: synthetic data can have more variation and train the algorithm better;   transfer learning:  transfer from synthetic to real
    • in simulations of NNs, there is sometimes no way to know which features were used in each instance to achieve the best prediction
    • Bayesian integration of a diverse collection of physics models
    • what pressing needs can be addressed by a funding call:
      • (1) combining approaches (data-driven and theory-driven)
      • (2) sheer recognition of ML value on real-world data, and a set of known limitations of ML
      • (3) causality, predictability ...
      • (4) in specific clinical domains: belief in generating data, applying an algorithm, and getting meaningful results out; point out the limitations of ML in the biological domain and provide the perspectives of the ML community
      • (5) the power of simulation and mathematical modeling has been demonstrated; how to integrate ML to further improve that power
      • (6) enumerate the means to incorporate prior knowledge: constrain structure; define the functional forms being explored; generate synthetic data; feature selection; structure of interconnecting parameters
      • (7) DL limitation in bio: cannot capture uncertainty
      • (8) diverse data sets for training
    • in the statistical field: uncertainty in ML
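The idea noted above of penalizing model choices that violate known theory can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two-output linear model, the conservation rule y1 + y2 = total, the noise level, and the names `fit` and `mean_violation`. It is not the LANL approach, only one simple way a theory term can be added to a data-fitting loss.

```python
import random

def fit(xs, ys1, ys2, total, lam, lr=0.05, steps=5000):
    """Fit y1 ~ a1 + b1*x and y2 ~ a2 + b2*x by gradient descent.

    Loss = mean squared data error / 2
         + lam * mean squared violation of (y1 + y2 == total) / 2.
    The conservation rule is a hypothetical stand-in for 'known theory'.
    """
    a1 = b1 = a2 = b2 = 0.0
    n = len(xs)
    for _ in range(steps):
        ga1 = gb1 = ga2 = gb2 = 0.0
        for x, y1, y2 in zip(xs, ys1, ys2):
            p1 = a1 + b1 * x
            p2 = a2 + b2 * x
            v = p1 + p2 - total            # theory violation for this sample
            e1 = (p1 - y1) + lam * v       # data error plus theory penalty
            e2 = (p2 - y2) + lam * v
            ga1 += e1
            gb1 += e1 * x
            ga2 += e2
            gb2 += e2 * x
        a1 -= lr * ga1 / n
        b1 -= lr * gb1 / n
        a2 -= lr * ga2 / n
        b2 -= lr * gb2 / n
    return a1, b1, a2, b2

def mean_violation(params, xs, total):
    """Average absolute violation of the conservation rule over xs."""
    a1, b1, a2, b2 = params
    return sum(abs(a1 + b1 * x + a2 + b2 * x - total) for x in xs) / len(xs)

# Small, noisy data set generated from a process that obeys the rule.
random.seed(0)
xs = [i / 7 for i in range(8)]
ys1 = [0.3 + 0.4 * x + random.gauss(0, 0.2) for x in xs]
ys2 = [0.7 - 0.4 * x + random.gauss(0, 0.2) for x in xs]

plain = fit(xs, ys1, ys2, total=1.0, lam=0.0)       # data only
penalized = fit(xs, ys1, ys2, total=1.0, lam=5.0)   # data + theory penalty

violation_plain = mean_violation(plain, xs, 1.0)
violation_penalized = mean_violation(penalized, xs, 1.0)
print(violation_plain, violation_penalized)
```

With few noisy samples, the data-only fit drifts away from the conservation rule, while the penalized fit is pulled back toward it. The weight `lam` sets how strongly theory is trusted relative to data, and a wrong or mis-specified rule would bias the fit, which is consistent with the note above that prior knowledge can sometimes hurt.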
  •  
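The Session 1 notes on combining simulated and real data, and on transferring from synthetic to real, can be sketched in a minimal form. The example below is an assumption-laden illustration, not the method of the papers mentioned: the "simulator" is a hypothetical, slightly mis-specified line, the "real" data are a handful of noisy points, and transfer is approximated by fitting the real data with a ridge penalty that pulls the parameters toward the synthetic-data fit. The names `fit_line` and `distance` are illustrative.

```python
import random

def fit_line(xs, ys, prior=(0.0, 0.0), alpha=0.0):
    """Least-squares fit of y ~ a + b*x with a ridge pull toward `prior`.

    Minimizes sum((a + b*x - y)^2) + alpha * ((a - a0)^2 + (b - b0)^2).
    alpha = 0 gives ordinary least squares; large alpha returns the prior.
    """
    a0, b0 = prior
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations of the penalized problem (2x2 system, Cramer's rule).
    m11, m12, m22 = n + alpha, sx, sxx + alpha
    r1, r2 = sy + alpha * a0, sxy + alpha * b0
    det = m11 * m22 - m12 * m12
    a = (r1 * m22 - m12 * r2) / det
    b = (m11 * r2 - m12 * r1) / det
    return a, b

def distance(p, q):
    """Euclidean distance between two (intercept, slope) pairs."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

random.seed(1)

# Plentiful synthetic data from a hypothetical, slightly wrong simulator.
syn_x = [random.random() for _ in range(500)]
syn_y = [0.9 + 1.8 * x + random.gauss(0, 0.05) for x in syn_x]

# Scarce, noisy "real" data from the assumed true process y = 1.0 + 2.0*x.
real_x = [random.random() for _ in range(5)]
real_y = [1.0 + 2.0 * x + random.gauss(0, 0.5) for x in real_x]

w_syn = fit_line(syn_x, syn_y)                            # pretrain on synthetic
w_real = fit_line(real_x, real_y)                         # real data only
w_ft = fit_line(real_x, real_y, prior=w_syn, alpha=5.0)   # fine-tune on real

print("synthetic fit:", w_syn)
print("real-only fit:", w_real)
print("fine-tuned fit:", w_ft)
```

The fine-tuned parameters always lie at least as close to the synthetic-data fit as the real-only parameters do. Whether that improves prediction depends on how faithful the simulator is and on the choice of `alpha`, both of which are assumptions here.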
    • Session 2:
  • Build on current state-of-the-art Theory-Driven Models
    •  
  • Build on current state of the art models for Digital Twins
    •  
  • ML-MSM integration opportunities
    •  
  • Challenges ML-MSM modelers should address