BRAINWORKS is a web platform, developed at the NIH, that structures the scientific literature from PubMed and NIH RePORTER as a dynamic and interactive knowledge graph.

Project Introduction:

The Need: The scientific knowledge landscape is vast, complex, and rapidly expanding. In 2020, 2 million new peer-reviewed papers were added to the scientific literature, which is now estimated to contain over 60 million works. At this volume, it would take a single individual almost 20 years (without breaks) to perform a 5-minute review of each paper written in 2020. Even narrow subdomains of scientific investigation now produce more output than a single scholar can master: over 100,000 papers about the coronavirus pandemic were published in 2020 alone.
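
The "almost 20 years" estimate can be checked with a short calculation. The sketch below uses only the two figures quoted above (2 million papers, 5 minutes per paper). The constant and function names are illustrative, and it assumes reading around the clock in 365-day years.

```python
PAPERS_2020 = 2_000_000   # papers added in 2020 (figure quoted above)
MINUTES_PER_REVIEW = 5    # review time per paper (figure quoted above)

# Assumption: reading 24 hours a day, 365 days a year ("without breaks").
MINUTES_PER_YEAR = 60 * 24 * 365


def years_to_review(papers: int, minutes_each: float) -> float:
    """Years of uninterrupted reading needed to review every paper once."""
    return papers * minutes_each / MINUTES_PER_YEAR


years = years_to_review(PAPERS_2020, MINUTES_PER_REVIEW)
print(f"{years:.1f} years")  # prints "19.0 years"
```

Ten million minutes of reading comes to roughly 19 years, which matches the "almost 20 years" figure.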

The Solution: As knowledge generation continues to outpace the ability of individual scientists to consume and integrate it, there is a critical need for technology tools that can organize, integrate, and represent the nuanced knowledge contained within the growing body of the scientific literature. BRAINWORKS is a web platform that addresses these needs by structuring the scientific literature as a dynamic and interactive knowledge graph.

The Innovation: BRAINWORKS is innovative because of its ability to represent scientific knowledge as well as the context governing its creation (funding, grants, authors, etc.). Furthermore, it provides a novel way to visualize the temporal evolution of scientific knowledge.


Key Capabilities:

BRAINWORKS uses Natural Language Processing (NLP) to display the knowledge embedded within public-domain scientific papers and their associated metadata (e.g., authors, journal, and publication date). In its current state, BRAINWORKS provides four types of literature analytics:

Current Capabilities:

  1. Summarize key findings from a single paper: Explore the semantic triples (subject-relation-object statements) embedded within a single paper.
  2. Learn what is known about a topic: Explore topical knowledge implied by semantic triples distributed across millions of scientific papers.
  3. Identify important contributions: Explore citation networks to understand which authors and papers have had a big impact on a field.
  4. Explore research trends: Visualize trends in the occurrence and co-occurrence of topics in the literature.
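
To make the first two capabilities concrete, here is a minimal sketch of how semantic triples from several papers could be merged into one graph and queried by topic. It is not the BRAINWORKS implementation. The paper IDs, the triples, and the function names `build_graph` and `what_is_known` are hypothetical and were chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples keyed by made-up paper IDs.
# In BRAINWORKS the triples come from NLP; these are illustrative only.
TRIPLES_BY_PAPER = {
    "paper-1": [
        ("dopamine", "modulates", "reward learning"),
        ("striatum", "receives", "dopamine"),
    ],
    "paper-2": [
        ("dopamine", "modulates", "reward learning"),
        ("dopamine", "modulates", "motivation"),
    ],
}


def build_graph(triples_by_paper):
    """Merge per-paper triples into one graph.

    Returns a dict mapping each (subject, relation, object) edge to the
    set of paper IDs that state it.
    """
    edges = defaultdict(set)
    for paper_id, triples in triples_by_paper.items():
        for triple in triples:
            edges[triple].add(paper_id)
    return edges


def what_is_known(edges, topic):
    """List edges whose subject or object is `topic`, most supported first."""
    hits = [
        (triple, len(papers))
        for triple, papers in edges.items()
        if topic in (triple[0], triple[2])
    ]
    # Sort by descending support, then alphabetically for a stable order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


graph = build_graph(TRIPLES_BY_PAPER)
for triple, support in what_is_known(graph, "dopamine"):
    print(support, triple)
```

With the sample data, the edge stated by both papers is listed first and the two single-paper edges follow. Keeping the supporting paper IDs on each edge is one simple way to trace every statement in the graph back to its source.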


Technical Implementation

The technology stack for BRAINWORKS is open source under the MIT License and consists of three layers: information, algorithms, and visualization. Each layer was designed to function independently so that the stack can be extended to other use cases. The 2021 release is available via GitHub at


Alpha Development Activities

The alpha version of the platform was completed in four phases over the course of 2021. Information about each phase, along with videos, documentation, working group meetings, and software, is referenced below. Members of the working group assisted with the development of the BRAINWORKS platform: they participated in meetings at the completion of each project milestone to provide feedback, as shown in TABLE 1 below.

TABLE 1: Overview of working group meeting dates and topics.

Project Phase | Meeting Information | Meeting Date | Resources | Meeting Link

1. Specification

Working Group Discussion:

A. Do you have great interest in accessing the raw data that powers BRAINWORKS, or are you mostly interested in using the final tool? 

B. What are some visuals/queries of the collected data that would help you assess its value? Feel free to sketch something, or link to visuals online.

C. In what ways should the tool allow users to constrain the nodes and the edges of the generated knowledge graph? (e.g. exclude nodes based on authors, set edge weights based on citations.) Are there specific node or edge constraints that you're interested in being able to apply over time (e.g. only show portions of the graph with increasing citations over time)?

D. We are planning a debate around the viability of data science approaches for neuroscience theory integration during the Society of Neuroscience meeting in August. Would this topic interest you? Why or why not?




2. Data

Brain Meeting Virtual Booth:

A. Data collection approach

B. Data characteristics

C. Data access and distribution approach

6/15/2021 - 6/17/2021

Demo Schedule 

6/15 Demo
Starts 12:30PM EST 

6/16 Demo
Starts 5:00PM EST

6/17 Demo
Starts 5:00PM EST

3. Algorithms

9th Annual Virtual Neuroscience Conference:

A. Algorithms for theory representation

B. Algorithms for theory evolution prediction

C. Performance evaluation


National Advisory Council of Biomedical Imaging and Bioengineering (NACBIB)

Agenda (click on Topics tab)

Conference Registration

NACBIB Presentation (begin at 3:22)

4. Interface

Live Proof-of-concept:

A. Demonstration of Platform alpha version

B. Review of code and documentation

Beta Development Activities

The beta version of the platform continues to be developed in 2022. Additional information about current and prospective beta development activities is provided in TABLE 2 below.

TABLE 2: Overview of Beta Development Activities.

Feature | Status | Details | Completed / Projected Date

User Management System


A variety of basic features for our website that allow users to securely sign up for an account, perform secure authentication, manage profile information, etc.

March 2022

Search Capability Enhancements


Features that allow users to query the BRAINWORKS database by authors, paper identifiers, topics, relations, and other advanced criteria of interest.

April 2022

Visualization Tool Enhancements


Enhancements to the Knowledge Graph: (a) clustering extracted concepts with similar semantic meaning, (b) filtering out uninformative or non-specific concepts, (c) allowing users to see semantic triples in the context of the source document, and (d) allowing all semantic triples from a single paper to be viewed.

May 2022

Additional Website Tools


We will add two views of the data beyond the knowledge graph: a paper citation network and a tool to search for scientific topic trends. We also improved the website by adding user safeguards such as password recovery.

June 2022

Rewrite of Plot Visualization Tool


We are collecting feedback on the design and layout of the visualization tool and will implement new features and adjustments for a better user experience. This will involve an extensive rewrite of our external GraphAPI tool, which will allow for much greater flexibility in designing the interactive portion of the website.

July 2022

UMLS Concept Embedding


We will create a new machine-learning model which embeds UMLS concepts within a 5-dimensional vector space. This will allow us to better construct visuals which group similar concepts.

August 2022

Database and Algorithm Optimization


We will significantly improve the speed and cost efficiency of our back-end data collection and extraction algorithms, reducing the cost of running our parallel computing cluster by 80%. We will also move our main data storage to Amazon Redshift, which will further our optimization efforts and allow greater expansion in the future.

September 2022

Novelty Metric Investigation


We will be investigating the efficacy of a unique method for measuring publication novelty and impact on the scientific community, and will eventually incorporate it into the website as a mechanism for surfacing important findings.

October 2022

Website Overhaul


We will be completely retiring the old website design in favor of a more modern approach, rewritten from the ground up.

November 2022
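
The UMLS Concept Embedding item above describes placing concepts in a 5-dimensional vector space so that similar concepts can be grouped. The sketch below illustrates that idea only. The concept labels and vectors are made up, and the use of cosine similarity is an assumption, since the source does not say which model or similarity measure BRAINWORKS uses.

```python
import math

# Hypothetical 5-dimensional embeddings for three concepts. The labels and
# vectors are invented for illustration; a real model would learn them.
EMBEDDINGS = {
    "neuron":     [0.9, 0.1, 0.0, 0.2, 0.1],
    "nerve cell": [0.8, 0.2, 0.1, 0.2, 0.0],
    "insulin":    [0.0, 0.9, 0.7, 0.1, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_concept(name, embeddings):
    """Return the other concept whose vector is most similar to `name`."""
    target = embeddings[name]
    scored = [
        (cosine_similarity(target, vector), other)
        for other, vector in embeddings.items()
        if other != name
    ]
    return max(scored)[1]


print(nearest_concept("neuron", EMBEDDINGS))  # prints "nerve cell"
```

In this toy data, "neuron" and "nerve cell" point in nearly the same direction, so a visualization that groups concepts by similarity would place them together and keep "insulin" apart.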

BRAIN Theories, Models and Methods (TMM) projects

All projects are encouraged to use the NIH BRAINWORKS platform, which organizes, integrates, and represents the nuanced knowledge contained within the growing body of scientific literature, to assist in the development of Theories, Models and Methods for understanding brain circuits from the cellular and subsecond resolution to behavior. Please contact, and use "Subject line: [BRAINWORKS]:" if you have any questions.