BRAINWORKS | Interagency Modeling and Analysis Group

BRAINWORKS is a web platform, developed at the NIH, that structures the scientific literature from PubMed and NIH RePORTER as a dynamic and interactive knowledge graph.

Announcements:

05/2023: BRAINWORKS is available on the Federal GitHub: https://github.com/NIBIB/BRAINWORKS
02/2023: BRAINWORKS is no longer publicly available at brainworks.scigami.org. For questions about the platform, including a potential release date, please contact BRAINTheoriesFOA@ninds.nih.gov, and use "Subject line: [BRAINWORKS]:".
12/2022: BRAINWORKS is the subject of an Invited Talk at NeurIPS, 2022.
11/2022: BRAINWORKS is the subject of an Invited Talk at the ACM International Conference on AI in Finance: ICAIF'22.
06/2022: The BRAINWORKS alpha platform was demonstrated at the BRAIN Initiative Meeting, 2022; see Interactive Poster.
05/2022: BRAINWORKS is the subject of an Invited Talk at AIMed Global Summit, 2022.

Project Introduction:

The Need: The scientific knowledge landscape is vast, complex, and rapidly expanding. In 2020, 2 million new peer-reviewed papers were added to the scientific literature, which is now estimated to contain over 60 million works. At this volume, it would take a single individual almost 20 years (without breaks) to perform a 5-minute review of each paper written in 2020. Even narrow subdomains of scientific investigation now produce more output than a single scholar can master: over 100,000 papers about the coronavirus pandemic were published in 2020 alone.
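The reading-time estimate above can be checked with back-of-the-envelope arithmetic. The sketch below assumes exactly 2 million papers, 5 minutes per paper, and reading around the clock in 365-day years; the constant and function names are illustrative only.

```python
# Back-of-the-envelope check of the reading-time estimate.
# Assumptions: exactly 2,000,000 papers, 5 minutes per paper,
# reading nonstop (24 hours a day, 365 days a year).
PAPERS_2020 = 2_000_000
MINUTES_PER_PAPER = 5
MINUTES_PER_YEAR = 60 * 24 * 365


def years_to_review(papers: int, minutes_per_paper: float) -> float:
    """Return the years of nonstop reading needed to review all papers."""
    return papers * minutes_per_paper / MINUTES_PER_YEAR


years = years_to_review(PAPERS_2020, MINUTES_PER_PAPER)
print(f"{years:.1f} years")  # prints "19.0 years"
```

Under these assumptions the total is about 19 years, consistent with the "almost 20 years" figure.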

The Solution: As knowledge generation continues to outpace the ability of individual scientists to consume and integrate it, there is a critical need for technology tools that can organize, integrate, and represent the nuanced knowledge contained within the growing body of the scientific literature. BRAINWORKS is a web platform that addresses these needs by structuring the scientific literature as a dynamic and interactive knowledge graph.

The Innovation: BRAINWORKS is innovative because of its ability to represent scientific knowledge as well as the context governing its creation (funding, grants, authors, etc.). Furthermore, it provides a novel way to visualize the temporal evolution of scientific knowledge.

Key Capabilities:

BRAINWORKS uses Natural Language Processing (NLP) to display knowledge embedded within public-domain scientific papers (and associated metadata, e.g. authors, journal, publication date). In its current state, BRAINWORKS provides four types of literature analytics:

Current Capabilities:

Summarize key findings from a single paper: Explore semantic triples embedded within a single paper.
Learn what is known about a topic: Explore topical knowledge implied by semantic triples distributed across millions of scientific papers.
Identify important contributions: Explore citation networks to understand which authors and papers have had a big impact on a field.
Explore research trends: Visualize trends in the occurrence and co-occurrence of topics in the literature.
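To make the idea of semantic triples concrete, the sketch below shows one way (subject, relation, object) statements from several papers could be merged into a weighted knowledge graph. It is an illustration only: the `Triple` type, the function names, and the toy data are all hypothetical and are not taken from the BRAINWORKS code base.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One (subject, relation, object) statement extracted from a paper."""
    paper_id: str
    subject: str
    relation: str
    obj: str


# Invented toy data for illustration; not output from BRAINWORKS.
TRIPLES = [
    Triple("paper-1", "dopamine", "modulates", "reward learning"),
    Triple("paper-2", "dopamine", "modulates", "reward learning"),
    Triple("paper-2", "hippocampus", "supports", "spatial memory"),
    Triple("paper-3", "dopamine", "affects", "motor control"),
]


def triples_for_paper(triples, paper_id):
    """Single-paper view: every triple extracted from one paper."""
    return [t for t in triples if t.paper_id == paper_id]


def build_graph(triples):
    """Topic view: merge triples across papers into weighted edges.

    Each edge maps (subject, relation, object) to the number of
    distinct papers that assert it.
    """
    support = defaultdict(set)
    for t in triples:
        support[(t.subject, t.relation, t.obj)].add(t.paper_id)
    return {edge: len(papers) for edge, papers in support.items()}


def edges_about(graph, concept):
    """Return edges that mention a concept, strongest support first."""
    hits = [(edge, n) for edge, n in graph.items() if concept in (edge[0], edge[2])]
    return sorted(hits, key=lambda item: item[1], reverse=True)


graph = build_graph(TRIPLES)
for (subj, rel, obj), n_papers in edges_about(graph, "dopamine"):
    print(f"{subj} --{rel}--> {obj} (papers: {n_papers})")
```

Weighting each edge by the number of distinct supporting papers is a design choice made for this sketch; a real system could weight edges by citations or other criteria instead.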

Technical Implementation

The technology stack for BRAINWORKS is open source under the MIT License, and consists of three layers: information, algorithms, and visualization. Each layer was designed to function independently so that the technology stack can be extended to other use cases. The 2021 release is available via GitHub at https://github.com/deskool/brainworks-public.git

Alpha Development Activities

The alpha version of the platform was completed in four phases over the course of 2021; information about each phase, along with videos, documentation, working group meetings, and software, is referenced below. Members of the working group assisted with the development of the BRAINWORKS platform. More specifically, they participated in meetings at the completion of each project milestone to provide feedback, as shown in TABLE 1 below.

TABLE 1: Overview of working group meeting dates and topics.


Project Phase	Meeting Information	Meeting Date	Resources	Meeting Link
1. Specification	Working Group Discussion: A. Do you have great interest in accessing the raw data that powers BRAINWORKS, or are you mostly interested in using the final tool? B. What are some visuals/queries of the collected data that would help you assess its value? Feel free to sketch something, or link to visuals online. C. In what ways should the tool allow users to constrain the nodes and the edges of the generated knowledge graph (e.g. exclude nodes based on authors, set edge weights based on citations)? Are there specific node or edge constraints that you're interested in being able to apply over time (e.g. only show portions of the graph with increasing citations over time)? D. We are planning a debate around the viability of data science approaches for neuroscience theory integration during the Society for Neuroscience meeting in August. Would this topic interest you? Why or why not?	5/21/2021	Videos: Introduction; Motivation and approach; Platform requirements	Register
2. Data	Brain Meeting Virtual Booth: A. Data collection approach B. Data characteristics C. Data access and distribution approach	6/15/2021 - 6/17/2021	Demo Schedule	6/15 Demo Starts 12:30PM EST; 6/16 Demo Starts 5:00PM EST; 6/17 Demo Starts 5:00PM EST
3. Algorithms	9th Annual Virtual Neuroscience Conference: A. Algorithms for theory representation B. Algorithms for theory evolution prediction C. Performance evaluation; National Advisory Council of Biomedical Imaging and Bioengineering (NACBIB)	8/25/2021; 9/14/2021	Agenda (click on Topics tab); Summary of BRAIN talks	Conference Registration; NACBIB Presentation (begin at 3:22)
4. Interface	Live Proof-of-concept: A. Demonstration of Platform alpha version B. Review of code and documentation	11/12/2021	N/A	N/A

Beta Development Activities

The beta version of the platform is continuing to be developed in 2022. Additional information about current and prospective beta development activities is provided in TABLE 2 below.

TABLE 2: Overview of Beta Development Activities.


Feature	Status	Details	Completed / Projected Date
User Management System	Complete	A variety of basic features for our website that allow users to securely sign up for an account, perform secure authentication, manage profile information, etc.	March 2022
Search Capability Enhancements	Complete	Features that allow users to query the BRAINWORKS database by authors, paper identifiers, topics, relations, and other advanced criteria of interest.	April 2022
Visualization Tool Enhancements	Complete	Enhancements to the Knowledge Graph by (a) clustering extracted concepts with similar semantic meaning, (b) filtering out uninformative or non-specific concepts, (c) allowing users to see semantic triples in the context of the source document, and (d) allowing all semantic triples from a single paper to be viewed.	May 2022
Additional Website Tools	Complete	We will add 2 additional views of the data beyond the knowledge graph: a paper citation network and a tool to search for scientific topic trends. We additionally improved the website by adding failsafes for users like password recovery.	June 2022
Rewrite of Plot Visualization Tool	Complete	We are collecting feedback on the design and layout of the visualization tool and will implement new features and adjustments for a better user experience. This will involve an extensive rewrite of our external GraphAPI tool, which will allow for much greater flexibility in designing the interactive portion of the website.	July 2022
UMLS Concept Embedding	Complete	We will create a new machine-learning model which embeds UMLS concepts within a 5-dimensional vector space. This will allow us to better construct visuals which group similar concepts.	August 2022
Database and Algorithm Optimization	Complete	We will be significantly improving the speed and cost efficiency of our back-end data collection and extraction algorithms, reducing costs to run our parallel computing cluster by 80%. We will also be moving our main data storage method to AWS RedShift, which will further our optimization efforts and allow greater expansion in the future.	September 2022
Novelty Metric Investigation	Complete	We will be investigating the efficacy of a unique method for measuring publication novelty and impact on the scientific community, and will eventually incorporate it into the website as a mechanism for surfacing important findings.	October 2022
Website Overhaul	Complete	We will be completely retiring the old website design in favor of a more modern approach, rewritten from the ground up.	November 2022
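The UMLS Concept Embedding entry in TABLE 2 describes placing concepts in a 5-dimensional vector space so that similar concepts can be grouped. The sketch below illustrates that general idea with cosine similarity. Cosine similarity is an assumption of this example, not a confirmed detail of the BRAINWORKS model, and the concept names and vector values are invented.

```python
import math

# Hypothetical 5-dimensional embeddings; the values are invented for
# illustration and do not come from the BRAINWORKS model.
EMBEDDINGS = {
    "neuron": [0.9, 0.1, 0.0, 0.2, 0.1],
    "axon": [0.8, 0.2, 0.1, 0.3, 0.0],
    "insulin": [0.0, 0.9, 0.7, 0.1, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(concept, embeddings):
    """Return the other concept whose vector is closest in angle."""
    target = embeddings[concept]
    others = (name for name in embeddings if name != concept)
    return max(others, key=lambda name: cosine_similarity(target, embeddings[name]))


print(most_similar("neuron", EMBEDDINGS))  # prints "axon"
```

With these toy vectors, "neuron" and "axon" point in nearly the same direction, so a visualization that groups concepts by similarity would place them together and keep "insulin" apart.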

BRAIN Theories, Models and Methods (TMM) projects

All projects are encouraged to utilize the NIH BRAINWORKS platform (that organizes, integrates, and represents nuanced knowledge contained within the growing body of the scientific literature) to assist in the development of Theories, Models and Methods for understanding brain circuits from the cellular and subsecond resolution to behavior. Please contact BRAINTheoriesFOA@ninds.nih.gov, and use "Subject line: [BRAINWORKS]:" if you have any questions.