THEME 3 - Data-Driven Approaches


Session Description: Data is the heart of modeling. Whether we are building mechanistic models using multiscale modeling, statistical models, or deep learning (DL) models, the more data we have, the more accurate and useful these models will be. In fact, the DL revolution has been driven as much by the availability of big data as by progress in algorithms and network architectures.

The session consists of two 30-minute keynote lectures from experts in data science and machine learning in biology and medicine, followed by an open discussion led by the moderators.

Keynote Lecture: Origins of ECoG Signals

Abstract: ECoG is a critical methodological bridge between basic neuroscience findings and our understanding of the human brain in health and disease. However, adoption of ECoG for basic neuroscience, and realizing its full potential in humans, is impeded by a lack of understanding of the precise biophysical processes that generate the mesoscale cortical surface electrical potentials (CSEPs) recorded by ECoG. Here, we use direct electrophysiological monitoring in rats combined with biophysically detailed simulations to reveal the origins of distinct frequency components of ECoG-recorded CSEPs. Our central hypothesis is that distinct CSEP components are generated by neuronal sources with different spatial localization within a cortical column.

Keynote Speaker Bio: Dr. Kristofer Bouchard is PI of the Neural Systems and Data Science Lab at Lawrence Berkeley National Laboratory (LBNL) and UC Berkeley. He received his PhD in Neuroscience in 2010 from UC San Francisco, and held postdoctoral fellow positions at UCSF Medical Center (2011-2014) and at LBNL (2014-2015). His interdisciplinary research focuses on understanding how distributed neural circuits give rise to coordinated behaviors and perception, and he approaches this problem by conducting in vivo neuroscience experiments and developing data science tools. His neuroscience research focuses on functional organization and dynamic coordination in the brain, combining in vivo multiscale electrophysiology and optogenetics in rodents with biophysically detailed simulations. On the data science side, he develops analysis tools for neuroscience, including statistical machine learning algorithms, dynamic graphical models, and data standards and formats.

Keynote Lecture: Accelerating Therapeutics for Opportunities in Medicine

Abstract: The drug discovery process is costly, slow, and failure-prone. It takes an average of 5.5 years to get to the clinical testing stage, and in this time millions of molecules are tested, thousands are made, and most fail. One particularly difficult stage of drug discovery is de novo design of therapeutic agents. This process relies on large high-throughput screens and several follow-up cycles of iterative design to enhance potency, eliminate safety liabilities, and enable favorable pharmacokinetic behavior. To address this bottleneck, we present an end-to-end computational platform that can perform rapid de novo compound generation in silico using machine learning techniques. Our platform allows the user to curate datasets; build machine learning models that predict key safety, efficacy, and pharmacokinetic parameters; and incorporate these models into a multi-parameter optimization loop for generative molecular design. Using this platform, we have created tens of thousands of deep learning and random forest models, which predict a variety of key safety and pharmacokinetic parameters with a high degree of accuracy. The best-performing models have then been integrated into our in silico generative molecular design loop. We have performed a proof-of-concept compound-generation exercise using the platform and have experimentally validated the de novo compounds with promising results. We are confident that our work building this computational platform will help to transform drug discovery from a time-consuming, sequential, and high-risk process into one that is rapid and integrated, with better patient outcomes.

Keynote Speaker Bio: Dr. Amanda J. Minnich is a Machine Learning Research Scientist and Molecular Data-Driven Modeling Team Lead at Lawrence Livermore National Laboratory (LLNL). At LLNL, she is part of the multi-institution ATOM Consortium, where she applies machine learning techniques to biological data for drug discovery purposes. Dr. Minnich received a BA in Integrative Biology from UC Berkeley (2009) and an MS (2014) and PhD with Distinction (2017) in Computer Science from the University of New Mexico. She has published her work at, and served on Program Committees for, top conferences including WWW, ASONAM, KDD, ICDM, SC, GTC, and ICWE, and has been issued a patent for her dissertation work.

Moderator Bios:

Dr. William Lytton is Distinguished Professor at SUNY Downstate Brooklyn and a practicing neurologist at Kings County Hospital. His work focuses on using computational neuroscience to forge links between disparate findings on brain function, with applications to brain diseases including epilepsy, stroke, schizophrenia, dystonia, Parkinson's disease, and Alzheimer's disease.

Dr. Linda Petzold is Distinguished Professor in the Departments of Mechanical Engineering and Computer Science at UC Santa Barbara. Her work focuses on computational methods, mathematical modeling, and machine learning, with application to a wide range of problems in systems biology, neuroscience, and engineering.

Topics for Discussion:

1. How can multiscale modeling of biomimetic neuronal networks (BNNs) assist us in identifying how artificial neural networks (ANNs) created by deep learning work? Can we identify common ensemble neuronal codes?

2. How can twinning models with biological processes (bio/model) be extended to tripling (bio/DL/MSM), with DL and multiscale modeling (MSM) models operating in parallel to explicate biology and pathology?

3. Can data-driven computational neuroscience be used to improve the state of the art in DL/AI? If so, what specific mechanisms/approaches can we leverage: spike-timing-dependent plasticity (STDP)? neuromodulation? spiking?

4. Optimally, at what point in the scientific process should experts in data analysis, machine learning, and multiscale modeling become involved? How can we make this happen?

5. What are the challenges that multiscale systems present for data acquisition, data analysis, machine learning, multiscale simulation, and inference of the structure and parameters of a model?

6. To what extent can statistics now merge with machine learning? ML involves an iterative approach to parameter determination, whereas classical statistics gives a closed-form parameter determination. However, an accurate statistical measure depends on an accurate determination of the underlying distribution; this determination is often simply an assumption. ML techniques can now test these assumptions/parameterizations iteratively and identify best fits with different Bayesian priors.
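The contrast drawn in topic 6 can be made concrete with a minimal sketch (a toy linear-regression example, not drawn from the session materials): classical statistics yields the parameters in closed form via ordinary least squares, while an ML-style iterative gradient descent on the same squared-error loss converges to the same answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + 1 plus Gaussian noise
x = rng.uniform(-1, 1, size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=200)
X = np.column_stack([x, np.ones_like(x)])  # design matrix [x, 1]

# Classical statistics: closed-form least-squares solution
w_closed, *_ = np.linalg.lstsq(X, y, rcond=None)

# ML-style: iterative gradient descent on the squared-error loss
w_iter = np.zeros(2)
lr = 0.1
for _ in range(2000):
    grad = 2.0 * X.T @ (X @ w_iter - y) / len(y)
    w_iter -= lr * grad

# Both routes recover (nearly) identical slope and intercept
print(w_closed, w_iter)
```

The interesting divergence appears when the closed-form route rests on distributional assumptions (e.g., Gaussian noise) that the iterative route need not make; this is where ML techniques can test those assumptions against alternative fits or Bayesian priors.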
