Medical Thinking with Multiple Images
MedThinkVQA is a multi-image radiology reasoning dataset focused on evidence extraction, cross-view synthesis, and diagnostic quality.
Status: Train and test splits are available on Hugging Face.
This dataset is for research/education only (CC BY-NC-SA 4.0) and is not for clinical use.
News
- [2026-04-05] Train and test splits were released on Hugging Face.
- [2026-04-05] Public dataset link updated to the canonical Hugging Face page.
- [2026-03-04] Leaderboard and submission flow are live.
Leaderboard
The default view shows accuracy (ACC) scores for all benchmarks.
About
MedThinkVQA evaluates the quality of diagnostic reasoning over multiple images, with a focus on final diagnostic accuracy.
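As a rough illustration of the accuracy metric mentioned above, the following minimal sketch computes the fraction of questions answered correctly. It assumes each question has a single reference answer expressed as an option label such as "A" or "B"; that answer format, the function name `accuracy`, and the sample values are hypothetical and are not taken from the MedThinkVQA data or its official scoring code.

```python
def accuracy(predictions, answers):
    """Return the fraction of predictions that match the reference answers.

    Assumption: both lists hold option labels (e.g. "A", "B"); matching is
    case-insensitive and ignores surrounding whitespace.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


# Hypothetical example: four questions, three answered correctly.
preds = ["A", "c", "B", "D"]
golds = ["A", "C", "D", "D"]
print(f"ACC = {accuracy(preds, golds):.2%}")  # prints "ACC = 75.00%"
```

The official leaderboard may normalize answers or handle missing predictions differently, so treat this only as a local sanity check.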
Data are adapted from Eurorad (European Society of Radiology) and released under CC BY-NC-SA 4.0 for research/education only. Not for clinical use.