A citation graph of COVID-19 publications based on the CORD-19 open research dataset

poloclub/cord-19-citation-graph

CORD-19 Citation Graph Generator

A simple Python script that extracts citation graphs of COVID-19 publications from the CORD-19 open research dataset.

Last updated for the CORD-19 release of 2020-06-02.

The generated graph is included as a sample graph in Argo Lite.

Because the CORD-19 schema has changed between releases, sometimes in breaking ways, code written for one release may not work with another. Check the CORD-19 change log before running the script against a different release.
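To illustrate what extracting a citation graph from the parsed documents can look like, here is a minimal sketch. It is not the code in generate.py. It assumes each parsed JSON file has a `paper_id`, a `metadata.title`, and a `bib_entries` mapping whose entries carry a `title`, and it links a citing paper to a cited one by matching normalized titles. The helper names (`normalize_title`, `build_citation_edges`, `load_papers`) and the two sample records are made up for illustration, and the field names may differ in other releases.

```python
import json
from pathlib import Path


def normalize_title(title):
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def build_citation_edges(papers):
    """Return sorted (citing_id, cited_id) pairs for references that match a known paper title."""
    # Map each normalized title to the first paper_id seen with that title.
    title_to_id = {}
    for paper in papers:
        title = normalize_title(paper.get("metadata", {}).get("title") or "")
        if title:
            title_to_id.setdefault(title, paper["paper_id"])

    edges = set()
    for paper in papers:
        for ref in paper.get("bib_entries", {}).values():
            cited = title_to_id.get(normalize_title(ref.get("title") or ""))
            # Keep only references that resolve to another paper in the set.
            if cited and cited != paper["paper_id"]:
                edges.add((paper["paper_id"], cited))
    return sorted(edges)


def load_papers(json_dir):
    """Load every *.json parse in a directory (assumed layout: document_parses/pdf_json)."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(json_dir).glob("*.json"))
    ]


# Hypothetical in-memory records standing in for two parsed documents.
sample = [
    {
        "paper_id": "a1",
        "metadata": {"title": "Origins of a Novel Coronavirus"},
        "bib_entries": {"BIBREF0": {"title": "Bat Coronaviruses in  China"}},
    },
    {
        "paper_id": "b2",
        "metadata": {"title": "Bat coronaviruses in China"},
        "bib_entries": {},
    },
]
edges = build_citation_edges(sample)
print(edges)
```

Exact title matching is the simplest possible linking rule and will miss references whose titles are abbreviated or misspelled. On real data, `load_papers` could be pointed at one of the JSON folders from the Setup section and its result passed to `build_citation_edges`.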

Setup

Download the CORD-19 dataset. Unzip it and organize the files into the following directory structure (paths are relative to generate.py in the repository root).

generate.py
cord-2020-06-02/
    metadata.csv
    document_parses/
        pdf_json/*
        pmc_json/*

You can name the dataset folder (cord-2020-06-02 above, for illustration) whatever you want; just remember to change the paths in the code to match.
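Before running the script, it can help to confirm that the folder matches the layout above. The following is a small sketch of such a check, not part of generate.py. `DATASET_DIR`, `EXPECTED`, and `missing_paths` are hypothetical names; set `DATASET_DIR` to whatever you called your dataset folder.

```python
from pathlib import Path

# Hypothetical setting: change this if you renamed the dataset folder.
DATASET_DIR = Path("cord-2020-06-02")

# Paths the layout above expects, relative to the dataset folder.
EXPECTED = [
    "metadata.csv",
    "document_parses/pdf_json",
    "document_parses/pmc_json",
]


def missing_paths(dataset_dir, expected=EXPECTED):
    """Return the expected relative paths that do not exist under dataset_dir."""
    root = Path(dataset_dir)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(DATASET_DIR)
    if missing:
        print(f"Missing under {DATASET_DIR}: {', '.join(missing)}")
    else:
        print(f"{DATASET_DIR} matches the expected layout.")
```

The check only tests that the paths exist. It does not verify that the JSON folders contain files or that metadata.csv has the columns a given release is supposed to have.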
