Developing Tools | Interagency Modeling and Analysis Group

This is a working resource page for the Tools activities of the Bridge2AI program.

The Bridge2AI program will use this wiki to bring together resources relevant to the Tools Modules within the Data Generation Projects and the Tools Core within the BRIDGE Center.

ANNOUNCEMENTS:

RESOURCES:

DOE Center for Advanced Mathematics for Energy Research Applications (CAMERA)

ANL - FAIR for AI Workshop*

ORNL - AIRES 3: Machine Learning for Robust Digital Twins*

*Workshop reports coming soon -- see IMAG wiki: Meeting Reports page (under Resources tab)

Tools Developed for the DARPA SD2 Program

Aquarium enables scientists to precisely define executable experimental workflows that generate bench instructions and automatically capture data and provenance, Distribution: https://www.aquarium.bio/ under MIT license

Escalation: an open-source web application for data visualization, exploration, and sharing. https://www.twosixlabs.com/escalation/; jed.singer@twosixtech.com; nick.leiby@twosixtech.com for more information or requests

The Versioned Data Repository allows for centralized storage, updating, sanity-checking, and provenance-tracking of experimental data. It helps to ensure that distributed teams are all working with the correct version of a dataset, even as data are refined, revised, corrected, and added to. Custom sanity checks help to ensure that only correct data are added to the repository. It is currently deployed to the Texas Advanced Computing Center HPC system. jed.singer@twosixtech.com and nick.leiby@twosixtech.com for more information or requests

Experimental Intent Parser: Experiment plans are typically written in idiosyncratic prose and often omit important details. Intent Parser helps researcher annotate and structure their plans to ensure they are complete and unambiguous. jake.beal@raytheon.com

NIH All of Us

Demo notebooks are available for all users of the All of Us Researcher Workbench – the cloud-based platform for accessing and analyzing All of Us Data in a secure environment. Our setup is a little different than other large biomedical data conglomerates, and probably most similar to the Census Bureau in general ethos.

The All of Us data ecosystem is composed of three tiers – public (containing aggregate data which does not pose significant re-identification risk), registered (containing data with fuzzed dates and elements like EHR, surveys, and physical measures with some greater risk for re-identification), and controlled (containing genomics data, location, and other data with a significant re-identification risk, but simultaneously of greatest use to some researchers). The different tiers have different burdens for access – public can be accessed by anybody, but doesn’t offer significant compute power, registered is (currently) accessible by users with eRA commons accounts who have taken privacy/ethics training and have a data use agreement signed by their institution, and controlled tier (in pre-Alpha testing right now) will have registered tier requirements in addition to some more security and training requirements.

Demo workbooks are available within the registered and controlled tiers to provide users with frameworks and code snippets to build out analysis. They are also designed to show off some of the capabilities within the workbench which may not be self-evident upon first exploration.

The public resources users can look at to understand our platform’s capabilities are within the public tier, and are primarily to let users know what numbers and types of participants we have for analysis, for those thinking about going through the effort of setting up a Data Use Agreement.

Tools Developed by Investigators of the Intensive Longitudinal Health Behaviors Network (ILHBN)

Dynamic modeling in R (dynr). https://dynrr.github.io/. Scroll down on first page for tutorials. Also on CRAN
GIMME. Also on CRAN. http://gimme.web.unc.edu/63-2/; https://cran.r-project.org/web/packages/gimme/index.html; introduction: http://gimme.web.unc.edu/
Bayesian Ornstein-Uhlenbeck Model (BOUM) Package. Contains Matlab compiler run time is Matlab is not required to run BOUM. https://sites.psu.edu/zitaoravecz/bayesian-ornstein-uhlenbeck-model/
Various other tutorials and resources on the QuantDev website https://quantdev.ssri.psu.edu/resources
R package tsfeatures for time series features calculation: https://cran.r-project.org/web/packages/tsfeatures/vignettes/tsfeatures.html
Nelson Roque’s tsfeaturex package: https://github.com/nelsonroque/tsfeaturex
A Python library for building spatial data and calculating buffer- and Convex hull-based activity space from raw GPS data (gps2space https://pypi.org/project/gps2space/)
 DP Dash-A tool that enables you to visualize multiple streams and configure based on your needs. https://sites.google.com/g.harvard.edu/dpdash/
https://github.com/harvard-nrg/dpdash
Signaligner-Pro is an interactive tool for algorithm-assisted exploration and annotation of raw accelerometer data. (http://signaligner.org/)
Cerebral Cortex: Population-scale Model Learning and Data Analytics https://md2k.org/software-under-the-hood.html https://github.com/MD2Korg/CerebralCortex

NIH BRAIN Initiative

Brainstem (brain structured experimental metadata) is a tool for describing, organizing, and sharing experimental data. It works as an electronic notebook, providing a centralization and standardization of experimental metadata. A prototype of this tool has been deployed in the Buzsáki Lab, and a beta version for broad distribution is currently under development.

Tools Supported by the NCI Informatics Technology for Cancer Research (ITCR) Program

The ITCR Program funds tools that support the analysis of –omics, imaging, and clinical data, as well as network biology and data standards. All of the tools are free for use by academic and non-profit researchers. Access to tools, code repositories and introductory videos is available at the ITCR tools page.

Tools Supported by the NCI Cancer Systems Biology Consortium (CSBC) and Physical Sciences in Oncology Network

The NCI Division of Cancer Biology supports multiple research programs composed of interdisciplinary communities of scientists who aim to integrate approaches, data, and tools to address important questions in basic and translational cancer research. Discover and download datasets, publications, and other resources generated by these programs. Specifically, the supported tools can be found here.

Other Tools supported by NIH

Project InnerEye Open-Source Software for Medical Imaging AI

Back to Bridge2AI Main Page

Table sorting checkbox

Off