Based on the PMC-15M dataset, we used the GPT-4V (gpt-4-vision-preview) API to generate multi-round QA instruction data and applied a two-stage clinician preference alignment process, selecting 60K and 80K language-image instruction-following samples. We further combined the filtered 80K samples with the 10K and 60K samples provided by LLaVA-Med to form a larger 150K-sample dataset (80K + 10K + 60K). We also provide an intermediate 60K-sample dataset that incorporates only the second stage of preference distillation; merging it with the above yields a 210K-sample dataset (80K + 10K + 60K + 60K). [HuggingFace Dataset]
| Data file name | File size | Sample size |
|---|---|---|
| BioMed-VITAL-instructions-60K.json | 127 MB | 60K |
| BioMed-VITAL-instructions-80K.json | 156 MB | 80K |
| BioMed-VITAL-instructions-150K.json | 309 MB | 80K + 10K + 60K |
| BioMed-VITAL-instructions-210K.json | 463 MB | 80K + 10K + 60K + 60K |
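For a quick sanity check after downloading, the instruction files can be inspected with a few lines of Python. This is only a minimal sketch: it assumes the JSON follows a LLaVA-style instruction format (a list of records with fields such as `id`, `image`, and `conversations`), which is not guaranteed here, so adjust the field names to match the actual file.

```python
import json

# Load one of the instruction-following files (path is an example).
with open("BioMed-VITAL-instructions-60K.json", "r", encoding="utf-8") as f:
    samples = json.load(f)

print(f"Loaded {len(samples)} samples")

# Assumed LLaVA-style record layout; adjust keys if the actual schema differs.
first = samples[0]
print("Keys in a sample:", sorted(first.keys()))
print("Image reference:", first.get("image"))
for turn in first.get("conversations", [])[:4]:
    print(f"[{turn.get('from')}] {turn.get('value', '')[:120]}")
```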
You can download the original images from the following link:
| Data file name | File size |
|---|---|
| PMC_image_urls.jsonl | 129 MB |
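Below is a hedged sketch of how one might fetch the source images listed in PMC_image_urls.jsonl. The field names `id` and `image_url` are placeholders for illustration only; the actual JSONL schema may differ, so inspect one line of the file first.

```python
import json
import os
import requests  # third-party: pip install requests

OUT_DIR = "pmc_images"
os.makedirs(OUT_DIR, exist_ok=True)

# Each line is assumed to be a JSON object with an image identifier and URL;
# the keys "id" and "image_url" are assumptions, not confirmed by this page.
with open("PMC_image_urls.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        image_id = record.get("id")
        url = record.get("image_url")
        if not image_id or not url:
            continue
        out_path = os.path.join(OUT_DIR, f"{image_id}.jpg")
        if os.path.exists(out_path):
            continue  # skip files already downloaded
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        with open(out_path, "wb") as img_file:
            img_file.write(resp.content)
```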
@inproceedings{cui2024biomedical,
  title={Biomedical Visual Instruction Tuning with Clinician Preference Alignment},
  author={Hejie Cui and Lingjun Mao and Xin Liang and Jieyu Zhang and Hui Ren and Quanzheng Li and Xiang Li and Carl Yang},
  booktitle={NeurIPS},
  year={2024},
}
This website is adapted from Nerfies, licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. We thank the LLaVA and LLaVA-Med teams for giving us access to their models, as well as open-source projects including BioMed-CLIP.
Usage and License Notices: The data, code, and checkpoints are intended and licensed for research use only. They are also restricted to uses that follow the license agreements of CLIP, LLaVA, and GPT-4. The dataset is licensed under CC BY-NC 4.0 (allowing only non-commercial use), and models trained on the dataset should not be used outside of research purposes.
The source code of this repository is released under the Apache License 2.0. The model license and dataset license are listed on their corresponding webpages.