DATEXIS / AMEGA-benchmark
AMEGA-LLM: Autonomous Medical Evaluation for Guideline Adherence of Large Language Models
☆20 Updated last month
Alternatives and similar repositories for AMEGA-benchmark
Users who are interested in AMEGA-benchmark are comparing it to the libraries listed below:
- Repo for the paper "Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions" ☆42 Updated 2 months ago
- Official implementation for NeurIPS'24 paper: MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making ☆195 Updated 10 months ago
- MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning ☆57 Updated 2 months ago
- [NeurIPS 2024 Datasets and Benchmark Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations ☆73 Updated 3 weeks ago
- ☆127 Updated last year
- [NeurIPS 2022] Code for "Retrieve, Reason, and Refine: Generating Accurate and Faithful Discharge/Patient Instructions"