We Build a Knowledge Graph on COVID-19

The COVID❋GRAPH project is a voluntary initiative of graph enthusiasts and companies with the goal of building a knowledge graph of relevant information about the COVID-19 virus.

What We Do

We build a knowledge graph on COVID-19 that integrates various public datasets. This includes relevant publications, case statistics, genes and functions, molecular data and much more. 

Open Data in Neo4j

The graph is implemented in Neo4j. A public version is available at: 

There’s also HTTP access available in case of TLS/SSL issues:

User: public
Password: corona 

Note: There is an issue with Chrome/Chromium and SSL. Use Firefox/Safari. We are working on a solution.
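
The public credentials above can also be used from code instead of the browser. The following is a minimal sketch, assuming the official neo4j Python driver is installed (pip install neo4j) and that the public instance accepts Bolt connections. The URI is a placeholder because the server address is not reproduced here, and the names list_labels and main are illustrative, not part of the project.

```python
from typing import List

# Placeholder (assumption): replace with the address of the public instance.
NEO4J_URI = "bolt://<public-graph-host>:7687"

# Public read-only credentials as stated on this page.
NEO4J_AUTH = ("public", "corona")

# Built-in Neo4j procedure that lists all node labels in the database.
LABELS_QUERY = "CALL db.labels() YIELD label RETURN label ORDER BY label"


def list_labels(session) -> List[str]:
    """Return all node labels visible to the given session, sorted by name."""
    return [record["label"] for record in session.run(LABELS_QUERY)]


def main() -> None:
    # Imported here so the helper above can be used without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(NEO4J_URI, auth=NEO4J_AUTH)
    try:
        with driver.session() as session:
            for label in list_labels(session):
                print(label)
    finally:
        driver.close()
```

Calling main() after replacing the placeholder URI prints one node label per line, which is a quick way to see what kinds of entities the graph contains before writing more specific queries.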



The schema describes the structure of the graph data. The schema diagram was created with yEd Live and can be viewed in an interactive diagram editor provided by yWorks.

Who We Are

We are a diverse team of scientists, developers and data people from academia and industry.
Our members come from Kaiser & Preusse, yWorks, Prodyna, Neo4j, the University of Freiburg and Structr, with more to come!



We integrate data from various sources and link them in our knowledge graph:

COVID-19 Open Research Dataset (CORD-19)

In response to the COVID-19 pandemic, the Allen Institute for AI has partnered with leading research groups to prepare and distribute the COVID-19 Open Research Dataset (CORD-19), a free resource of over 44,000 scholarly articles, including over 29,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community.
https://pages.semanticscholar.org/coronavirus-research

The Lens COVID-19 Datasets

The Lens has assembled free and open datasets of patent documents, scholarly research works metadata and biological sequences from patents, and deposited them in a machine-readable and explorable form.
https://about.lens.org/covid-19/

Ensembl Genome Browser

Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotates genes, computes multiple alignments, predicts regulatory function and collects disease data. Ensembl tools include BLAST, BLAT, BioMart and the Variant Effect Predictor (VEP) for all supported species.
http://www.ensembl.org

NCBI Gene Database

Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.
https://www.ncbi.nlm.nih.gov/gene

The Gene Ontology Resource

The Gene Ontology (GO) knowledgebase is the world’s largest source of information on the functions of genes. This knowledge is both human-readable and machine-readable, and is a foundation for computational analysis of large-scale molecular biology and genetics experiments in biomedical research.
http://geneontology.org

Experimental data from clinical studies and molecular genetics


2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by Johns Hopkins CSSE

This is the data repository for the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). It is also supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL).
https://github.com/CSSEGISandData/COVID-19

United Nations World Population Prospects 2019

The 2019 Revision of World Population Prospects is the twenty-sixth round of official United Nations population estimates and projections that have been prepared by the Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat.
https://population.un.org/wpp/

Use Cases

Scientists around the world have been tasked with working on COVID-19. Many publications about the virus have appeared in recent months, but relevant work on the family of coronaviruses already existed before COVID-19 emerged.

We would like to help researchers and scientists quickly and efficiently find their way through the more than 40,000 existing publications. We provide them with tools that use artificial intelligence, advanced visualization techniques and intuitive user interfaces to explore papers and patents on the family of coronaviruses, as well as existing treatments and medications.

How to Help

We need your help to learn more about COVID-19! You can help by:

  • analyzing the integrated data set
  • loading more datasets into Neo4j
  • improving our website
  • communicating and sharing our project

Get in Touch

We use Matrix to communicate. It’s a free, federated alternative to Slack.

Riot is the standard client for Matrix. 

How to get started: 

  • Download Riot or start the web app: https://about.riot.im/downloads
  • When asked to log in or sign up: create a free account on the matrix.org homeserver
  • Contact @mpreusse:matrix.org on Riot!

(If this does not work, you can contact us by email: martin.preusse@gmail.com)
