Reference code for "Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation".
Implemented models:
- CNN-RNN-RNN (Liu et al., 2019)
- Knowing When to Look (Lu et al., 2017)
- Meshed-Memory Transformer (Cornia et al., 2020)
- Show, Attend and Tell (Xu et al., 2015)
- TieNet (Wang et al., 2018)
Supported radiology report datasets:
- MIMIC-CXR-JPG (Johnson et al., 2019)
- Open-i (Demner-Fushman et al., 2012)
The Radiology NLI dataset (RadNLI) is available from the corresponding PhysioNet project.
Prerequisites:
- A Linux OS (tested on Ubuntu 16.04)
- Memory over 24GB
- A GPU with memory over 12GB (tested on NVIDIA Titan X and NVIDIA Titan XP)
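As a quick sanity check against the memory requirement above, the following minimal sketch reports total system RAM on Linux using only the standard library. The function name `total_ram_gb` and the constant `REQUIRED_RAM_GB` are illustrative and not part of this repository; GPU memory is not checked here.

```python
import os

REQUIRED_RAM_GB = 24  # taken from the prerequisite list above


def total_ram_gb() -> float:
    """Return total physical memory in GiB via POSIX sysconf (Linux)."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    n_pages = os.sysconf("SC_PHYS_PAGES")    # number of physical pages
    return page_size * n_pages / 1024 ** 3


if __name__ == "__main__":
    ram = total_ram_gb()
    status = "OK" if ram > REQUIRED_RAM_GB else "below the recommended amount"
    print(f"Total RAM: {ram:.1f} GiB ({status})")
```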
Create a conda environment
$ conda env create -f environment.yml
NOTE: environment.yml is set up for CUDA 10.1 and cuDNN 7.6.3. This may need to be changed depending on the runtime environment.
- Download MIMIC-CXR-JPG
- Make a resized copy of MIMIC-CXR-JPG using resize_mimic-cxr-jpg.py (MIMIC_CXR_ROOT is a dataset directory containing mimic-cxr)
$ python resize_mimic-cxr-jpg.py MIMIC_CXR_ROOT
- Create the sections file of MIMIC-CXR (mimic_cxr_sectioned.csv.gz) with create_sections_file.py
- Move mimic_cxr_sectioned.csv.gz to MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/
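Before moving on, it may help to confirm that the resized copy and the sections file ended up where the later steps expect them. The sketch below only checks the two paths named in the steps above; the helper name `check_resized_layout` is hypothetical and not part of this repository.

```python
from pathlib import Path


def check_resized_layout(mimic_cxr_root: str) -> list:
    """Return the expected paths that are missing under MIMIC_CXR_ROOT."""
    base = Path(mimic_cxr_root) / "mimic-cxr-resized" / "2.0.0"
    expected = [
        base,                                  # created by the resize step
        base / "mimic_cxr_sectioned.csv.gz",   # moved here in the last step
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    import sys

    missing = check_resized_layout(sys.argv[1] if len(sys.argv) > 1 else ".")
    if missing:
        print("Missing:")
        for path in missing:
            print("  " + path)
    else:
        print("Layout looks complete.")
```

An empty list means both paths exist; anything else lists what still needs to be created or moved.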
Pre-calculate document frequencies that will be used in CIDEr by:
$ python cider-df.py MIMIC_CXR_ROOT mimic-cxr_train-df.bin.gz
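For orientation, CIDEr weights n-grams by how many reference documents contain them, which is why these counts are computed once up front. The following is a minimal sketch of that counting step, not the implementation in cider-df.py: it assumes naive whitespace tokenization and n-grams of length 1 to 4, and the function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, max_n=4):
    """Yield every n-gram of length 1..max_n as a tuple of tokens."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def document_frequencies(reports, max_n=4):
    """Count, for each n-gram, the number of reports that contain it."""
    df = Counter()
    for report in reports:
        tokens = report.lower().split()  # assumption: whitespace tokenization
        # set() ensures an n-gram is counted at most once per report
        df.update(set(ngrams(tokens, max_n)))
    return df


reports = [
    "no acute cardiopulmonary process",
    "no pleural effusion",
    "small left pleural effusion",
]
df = document_frequencies(reports)
print(df[("no",)])                  # 2: appears in the first two reports
print(df[("pleural", "effusion")])  # 2: appears in the last two reports
```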
Pre-recognize named entities in MIMIC-CXR by:
$ python ner_reports.py --stanza-download MIMIC_CXR_ROOT mimic-cxr_ner.txt.gz
Download pre-trained CheXpert weights, pre-trained radiology NLI weights, and GloVe embeddings
$ cd resources
$ ./download.sh
First, train the Meshed-Memory Transformer model with an NLL loss.
# NLL
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr-scheduler trans MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll
Second, further train the model with a joint loss using self-critical RL to achieve better performance.
# RL with NLL + BERTScore + EntityMatchExact
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchExact --rl-weights 0.01,0.495,0.495 --entity-match mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emexact
# RL with NLL + BERTScore + EntityMatchNLI
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchNLI --rl-weights 0.01,0.495,0.495 --entity-match mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emnli
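To make the joint objective concrete, the sketch below combines an NLL term with one self-critical RL term per reward metric. It is an illustration under stated assumptions, not the code in train.py: plain floats stand in for tensors, the function names are hypothetical, and the mapping of the three `--rl-weights` values to (NLL, first metric, second metric) is assumed from the order in the command comments above.

```python
def self_critical_loss(sample_logprob, sample_reward, greedy_reward):
    """REINFORCE loss that uses the greedy decode's reward as the baseline."""
    advantage = sample_reward - greedy_reward
    return -advantage * sample_logprob


def joint_loss(nll, sample_logprob, rewards, weights):
    """Weighted sum of NLL and one self-critical term per reward metric.

    rewards: list of (sample_reward, greedy_reward) pairs, one per metric.
    weights: [w_nll, w_metric_1, w_metric_2, ...] (assumed ordering).
    """
    if len(weights) != len(rewards) + 1:
        raise ValueError("expected one weight for NLL plus one per metric")
    total = weights[0] * nll
    for weight, (r_sample, r_greedy) in zip(weights[1:], rewards):
        total += weight * self_critical_loss(sample_logprob, r_sample, r_greedy)
    return total


# Hypothetical values for a single sampled report.
loss = joint_loss(
    nll=2.0,
    sample_logprob=-3.0,
    rewards=[(0.8, 0.7), (0.5, 0.6)],  # e.g. BERTScore, entity match
    weights=[0.01, 0.495, 0.495],      # values passed to --rl-weights above
)
print(f"joint loss: {loss:.4f}")
```

A sampled report that scores higher than the greedy decode on a metric has a positive advantage, so minimizing the loss raises its log-probability; a lower-scoring sample is pushed down.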
Training results can be checked with TensorBoard.
$ tensorboard --logdir out_m2trans_nll-bs-emnli/log
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.0.0 at http://localhost:6006/ (Press CTRL+C to quit)
NOTE: This evaluation assumes that CheXbert is set up in ./CheXbert.
First, extract reference reports to a csv file.
$ python extract_reports.py MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/mimic_cxr_sectioned.csv.gz MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/mimic-cxr-2.0.0-split.csv.gz mimic-imp
$ mv mimic-imp CheXbert/src/
Second, convert generated reports to a csv file (TEST_SAMPLES is a path to test samples, e.g., out_m2trans_nll-bs-emnli/test_31-152173_samples.txt.gz).
$ python convert_generated.py TEST_SAMPLES gen.csv
$ mv gen.csv CheXbert/src/
Third, run CheXbert against the reference reports.
$ cd CheXbert/src/
$ python label.py -d mimic-imp/reports.csv -o mimic-imp -c chexbert.pth
Fourth, run eval_prf.py to obtain CheXbert scores.
$ cp ../../eval_prf.py .
$ python eval_prf.py mimic-imp gen.csv gen_chex.csv
2947 references
2347 generated
...
5-micro x.xxx x.xxx x.xxx
5-acc x.xxx
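The three values on the micro row are presumably precision, recall, and F1, judging by the script name; that reading is an assumption. Under it, micro-averaging pools true positives, false positives, and false negatives over all reports before computing the scores. The sketch below shows the calculation on label sets and is not the code in eval_prf.py; the function name and example labels are illustrative.

```python
def micro_prf(references, generated):
    """Micro-averaged precision, recall, F1 over parallel lists of label sets."""
    tp = fp = fn = 0
    for ref, gen in zip(references, generated):
        tp += len(ref & gen)   # labels present in both
        fp += len(gen - ref)   # labels only in the generated report
        fn += len(ref - gen)   # labels only in the reference report
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical positive labels for three report pairs.
refs = [{"Edema", "Cardiomegaly"}, {"Pleural Effusion"}, set()]
gens = [{"Edema"}, {"Pleural Effusion", "Atelectasis"}, set()]
p, r, f = micro_prf(refs, gens)
print(f"{p:.3f} {r:.3f} {f:.3f}")
```

In this example the pooled counts are 2 true positives, 1 false positive, and 1 false negative, so all three scores equal 2/3.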
Inference from a checkpoint can be run with infer.py (CHECKPOINT is a path to the checkpoint).
$ python infer.py --cuda --corpus mimic-cxr --cache-data cache --batch-size 24 --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr-scheduler trans MIMIC_CXR_ROOT CHECKPOINT resources/glove_mimic-cxr_train.512.txt.gz out_infer
Pre-trained checkpoints for M2 Transformer can be obtained with a download script.
$ cd checkpoints
$ ./download.sh
The RadNLI pseudo training data can be made with make_radnli-pseudo-train.py.
$ python make_radnli-pseudo-train.py MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/mimic_cxr_sectioned.csv.gz
See LICENSE and clinicgen/external/LICENSE_bleu-cider-rouge-spice for details.