gersteinlab/MedicalAgentsBench

[Patterns] MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning

☆83

Alternatives and similar repositories for MedicalAgentsBench

Users who are interested in MedicalAgentsBench are comparing it to the repositories listed below.

yhzhu99 / MedAgentBoard
[NeurIPS 2025] MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks
☆59 · Mar 13, 2026 · Updated 4 months ago
ritaranx / ClinGen
[ACL 2024 Findings] This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation wi…
☆43 · Jun 23, 2024 · Updated 2 years ago
XiaoXiao-Woo / KAMAC
A Knowledge-driven Adaptive Collaboration of LLMs for Enhancing Medical Decision-making
☆17 · Oct 23, 2025 · Updated 8 months ago
stanfordmlgroup / MedAgentBench
MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents
☆302 · Nov 21, 2025 · Updated 8 months ago
TsinghuaC3I / MedXpertQA
[ICML 2025] MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
☆170 · Jul 17, 2025 · Updated last year
yczhou001 / MAM
MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration
☆53 · Apr 3, 2026 · Updated 3 months ago
JZPeterPan / DAS-Medical-Red-Teaming-Agents
☆18 · Aug 17, 2025 · Updated 11 months ago
Google-Health / rxqa
☆19 · Oct 30, 2025 · Updated 8 months ago
UCSC-VLAA / m1
[ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models
☆51 · Dec 21, 2025 · Updated 7 months ago
mitmedialab / MDAgents
Official implementation for NeurIPS'24 paper: MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making
☆288 · Nov 10, 2024 · Updated last year
Wangyixinxin / MMedAgent
Learning to Use Medical Tools with Multi-modal Agent
☆267 · Mar 18, 2026 · Updated 4 months ago
gersteinlab / MedAgents
[ACL 2024 Findings] MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning https://arxiv.org/abs/2311.10537
☆361 · May 27, 2024 · Updated 2 years ago
Schuture / Quality-Sentinel
This is the repository of Quality Sentinel, a label quality evaluation model for medical image segmentation.
☆22 · Dec 3, 2025 · Updated 7 months ago
MAGIC-AI4Med / MedRBench
[Nature Communications] The official code for "Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases".
☆69 · Nov 7, 2025 · Updated 8 months ago
jinlab-imvr / MedAgent-Pro
[2026 ICLR] The official code for MedAgent_Pro
☆177 · May 12, 2026 · Updated 2 months ago
NUS-Project / Landmark-of-medical-agent
☆181 · Jun 8, 2026 · Updated last month
RyanWangZf / PromptEHR
EMNLP'22 | PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning
☆31 · Jun 8, 2023 · Updated 3 years ago
ZJU4HealthCare / OmniCT
[ICLR 2026] Official Repo for Paper "OmniCT: Towards a Unified Slice-Volume LVLM for Comprehensive CT Analysis"
☆18 · Mar 4, 2026 · Updated 4 months ago
UARK-AICV / FG-CXR
The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge…
☆12 · Jul 28, 2025 · Updated 11 months ago
SamuelSchmidgall / AgentClinic
Agent benchmark for medical diagnosis
☆335 · Dec 31, 2024 · Updated last year
alibaba-damo-academy / ReasonMed
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
☆121 · Oct 28, 2025 · Updated 8 months ago
barthelemymp / TULIP-TCR
☆14 · May 15, 2024 · Updated 2 years ago
DDVD233 / QoQ_Med
☆52 · Jul 31, 2025 · Updated 11 months ago
MrGiovanni / ScaleMAI
☆24 · Jan 11, 2025 · Updated last year
ncbi-nlp / MedCalc-Bench
[NeurIPS 2024 Datasets and Benchmark Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations
☆93 · Dec 18, 2025 · Updated 7 months ago
jinlab-imvr / 3DMedAgent
[2026 ICML] 3DMedAgent: Unified Perception-to-Understanding for 3D Medical Analysis
☆25 · May 25, 2026 · Updated last month
Luffy03 / GF-Screen
[ICLR 2026] Glance and Focus Reinforcement for Pan-cancer Screening
☆36 · May 14, 2026 · Updated 2 months ago
microsoft / HealthAgentBench
☆25 · Updated this week
wshi83 / MedAgentGym
[ICLR'26] MedAgentGYM: Training LLM Agents for Code-Based Medical Reasoning at Scale
☆124 · Apr 12, 2026 · Updated 3 months ago
KaiChenNJ / MDTeamGPT
☆35 · Dec 12, 2025 · Updated 7 months ago
NUS-Project / MedMASLab
☆30 · Mar 22, 2026 · Updated 4 months ago
UCSC-VLAA / MedReason
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
☆280 · Jun 19, 2025 · Updated last year
ljwztc / MedChain
The repository for "MedChain: Bridging the Gap Between LLM Agents and Real-World Clinical Decision Making"
☆55 · Apr 8, 2026 · Updated 3 months ago
pqpq17 / Awesome-LLM-Reasoning-on-Medicine
The Official Repo for Paper: Aligning Clinical Needs and AI Capabilities: A Survey on LLMs for Medical Reasoning
☆24 · Apr 7, 2026 · Updated 3 months ago
aiming-lab / MMedPO
[ICML'25] MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization
☆74 · Jun 5, 2025 · Updated last year
bowang-lab / MedRAX2
MedRAX-2
☆25 · Apr 3, 2026 · Updated 3 months ago
JarvisUSTC / DoctorAgent-RL
DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue
☆94 · Jan 23, 2026 · Updated 5 months ago
mhxu1998 / FlexCare
KDD 2024 | FlexCare: Leveraging Cross-Task Synergy for Flexible Multimodal Healthcare Prediction
☆18 · Sep 4, 2024 · Updated last year
wshi83 / EhrAgent
[EMNLP'24] EHRAgent: Code Empowers Large Language Models for Complex Tabular Reasoning on Electronic Health Records
☆137 · Dec 26, 2024 · Updated last year