Teddy-XiongGZ/MIRAGE

Teddy-XiongGZ / MIRAGE

Official repository of the MIRAGE benchmark

☆197

Alternatives and similar repositories for MIRAGE

Users who are interested in MIRAGE are comparing it to the repositories listed below.

Teddy-XiongGZ / MedRAG
Code for the MedRAG toolkit
☆519 · May 8, 2025 · Updated 10 months ago
ncbi / MedCPT
Code for MedCPT, a model for zero-shot biomedical information retrieval.
☆237 · Mar 24, 2024 · Updated last year
ncbi-nlp / MedCalc-Bench
[NeurIPS 2024 Datasets and Benchmark Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations
☆82 · Dec 18, 2025 · Updated 2 months ago
jind11 / MedQA
Code and data for MedQA
☆361 · Dec 1, 2022 · Updated 3 years ago
pubmedqa / pubmedqa
PubMedQA: A Dataset for Biomedical Research Question Answering
☆412 · Apr 18, 2023 · Updated 2 years ago
shan23chen / MedBrowseComp
☆41 · May 22, 2025 · Updated 9 months ago
TsinghuaC3I / UltraMedical
[NeurIPS 2024 D&B Track, Spotlight] UltraMedical: Building Specialized Generalists in Biomedicine
☆94 · Sep 26, 2024 · Updated last year
Andy-jqa / biomedical-qa-datasets
Biomedical Question Answering Datasets.
☆124 · Apr 30, 2025 · Updated 10 months ago
stellalisy / mediQ
☆37 · Jan 26, 2025 · Updated last year
dmis-lab / self-biorag
[ISMB 2024] Self-BioRAG: Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models
☆64 · Apr 4, 2024 · Updated last year
MAGIC-AI4Med / MedS-Ins
[npj digital medicine] The official codes for "Towards Evaluating and Building Versatile Large Language Models for Medicine"
☆77 · May 5, 2025 · Updated 10 months ago
passing2961 / DialogCC
Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…
☆13 · Jun 24, 2024 · Updated last year
UCSC-VLAA / o1_medical
☆48 · Feb 26, 2025 · Updated last year
baeseongsu / ehrxqa
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images (NeurIPS 2023 D&B)
☆91 · Feb 6, 2026 · Updated last month
FreedomIntelligence / CMB
CMB, A Comprehensive Medical Benchmark in Chinese
☆232 · Mar 27, 2025 · Updated 11 months ago
BARUDA-AI / Awesome-Medical-LLM
Large language model of Medical AI, General Medical AI (GMAI)
☆17 · Jan 30, 2024 · Updated 2 years ago
MAGIC-AI4Med / MMedLM
[Nature Communications] The official codes for "Towards Building Multilingual Language Model for Medicine"
☆276 · May 9, 2025 · Updated 10 months ago
Stanford-AIMI / chexpert-plus
☆103 · Jun 6, 2024 · Updated last year
RAG-Gym / RAG-Gym
Official repository for RAG-Gym
☆121 · Mar 4, 2025 · Updated last year
medmcqa / medmcqa
A large-scale (194k) Multiple-Choice Question Answering (MCQA) dataset designed to address real-world medical entrance exam questions.
☆261 · Nov 28, 2022 · Updated 3 years ago
RyanWangZf / LEADS
A specialized LLM for study search, study screening, and data extraction from medical literature.
☆26 · Mar 10, 2025 · Updated 11 months ago
Qsingle / open-medical-r1
This repository aims to reproduce R1-Zero in the medical domain.
☆32 · Jun 11, 2025 · Updated 8 months ago
mitmedialab / MDAgents
Official implementation for NeurIPS'24 paper: MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making
☆242 · Nov 10, 2024 · Updated last year
corentin-ryr / MultiMedEval
A Python tool to evaluate the performance of VLMs in the medical domain.
☆83 · Aug 5, 2025 · Updated 7 months ago
infi-coder / infibench-evaluator
The evaluation framework for the InfiCoder-Eval benchmark.
☆21 · Jul 22, 2024 · Updated last year
ypr17 / LMKG
The resources for LMKG (a large-scale, high-quality, multi-source, and multi-lingual medical knowledge graph).
☆22 · Sep 7, 2023 · Updated 2 years ago
wbw520 / DiReCT
DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models (NeurIPS 2024 D&B Track)
☆23 · Mar 6, 2025 · Updated last year
TsinghuaC3I / MedXpertQA
[ICML 2025] MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
☆142 · Jul 17, 2025 · Updated 7 months ago
Wangyixinxin / MMedAgent
Learning to Use Medical Tools with Multi-modal Agent
☆230 · Feb 7, 2026 · Updated last month
awslabs / robustqa-acl23
☆20 · Mar 22, 2024 · Updated last year
RL4M / MED-PEFT
☆23 · Jan 16, 2024 · Updated 2 years ago
rayruizhiliao / mutual_info_img_txt
Joint learning of images and text via maximization of mutual information
☆19 · Dec 14, 2021 · Updated 4 years ago
flageval-baai / HalluDial
☆21 · Aug 19, 2024 · Updated last year
eric-ai-lab / ProbMed
[ACL 2025 Findings] "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"
☆25 · Feb 21, 2025 · Updated last year
ritaranx / RAM-EHR
[ACL 2024] This is the code for our paper "RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records".
☆41 · Sep 19, 2024 · Updated last year
awslabs / rag-qa-arena
☆52 · Aug 14, 2024 · Updated last year
ritaranx / BMRetriever
[EMNLP 2024] This is the code for our paper "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers".
☆23 · Sep 19, 2024 · Updated last year
rajpurkarlab / ReXrank
☆25 · Updated this week
dek924 / PatientSim
PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions (NeurIPS 2025 D&B track, Spotlight)
☆24 · Feb 11, 2026 · Updated 3 weeks ago