HanjieChen/ChallengeClinicalQA

Repo for the paper Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions

☆50

Alternatives and similar repositories for ChallengeClinicalQA

Users who are interested in ChallengeClinicalQA are comparing it to the repositories listed below.

UCSC-VLAA / m1
[ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models
☆51 · Dec 21, 2025 · Updated 7 months ago
TsinghuaC3I / MedXpertQA
[ICML 2025] MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
☆171 · Jul 17, 2025 · Updated last year
XZhang97666 / AlpaCare
☆94 · Feb 8, 2025 · Updated last year
KatherLab / EAGLE
Efficient Approach for Guided Local Examination in Digital Pathology
☆44 · Apr 26, 2026 · Updated 3 months ago
SamuelSchmidgall / AgentClinic
Agent benchmark for medical diagnosis
☆339 · Dec 31, 2024 · Updated last year
mitmedialab / medical_hallucination
Medical Hallucination in Foundation Models and Their Impact on Healthcare (2025)
☆83 · Nov 5, 2025 · Updated 8 months ago
dmis-lab / SeqTagQA
Sequence Tagging for Biomedical Extractive Question Answering (Bioinformatics'2020)
☆10 · Jul 3, 2023 · Updated 3 years ago
McGill-NLP / feedbackqa
FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback
☆12 · Jul 13, 2022 · Updated 4 years ago
EmilyAlsentzer / rare-disease-simulation
Simulate patients with rare genetic conditions
☆24 · Jul 28, 2023 · Updated 3 years ago
ronakdm / ml-interviews
Guide to interviewing for industry machine learning roles (data/applied/research scientist, ML engineer, etc).
☆12 · Dec 28, 2022 · Updated 3 years ago
ampersandmcd / DeepExtremeMixtureModel
Official code release for Deep Extreme Mixture Model by Wilson, McDonald, Galib, Tan, and Luo.
☆10 · Feb 11, 2022 · Updated 4 years ago
MaksymPetyak / medplexity
Evaluating LLMs for medical applications
☆15 · Nov 30, 2023 · Updated 2 years ago
kevinwu23 / Stanford-MedCaseReasoning
☆51 · Jun 2, 2025 · Updated last year
ncbi-nlp / MedCalc-Bench
[NeurIPS 2024 Datasets and Benchmark Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations
☆93 · Dec 18, 2025 · Updated 7 months ago
AQ-MedAI / LiveClin
LiveClin is a live benchmark designed for the faithful replication of clinical practice
☆16 · Feb 27, 2026 · Updated 5 months ago
SPIRAL-MED / Ophiuchus
☆41 · Jan 14, 2025 · Updated last year
UCSC-VLAA / MedReason
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
☆280 · Jun 19, 2025 · Updated last year
aiueola / wsdm2022-cascade-dr
(WSDM2022 Best Paper Award Runner-Up) "Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model"
☆13 · Jul 16, 2023 · Updated 3 years ago
ademakdogan / plant_detector
PlantDetector provides easy development (training and prediction) for object detection. DETR (End-to-End Object Detection with Transforme…
☆11 · Aug 1, 2022 · Updated 3 years ago
krmdmn / ConvUNeXt
☆10 · Mar 24, 2022 · Updated 4 years ago
xguo7 / Automatic-Controllable-Product-Copywriting-for-E-Commerce
☆16 · Nov 3, 2022 · Updated 3 years ago
DATEXIS / AMEGA-benchmark
AMEGA-LLM: Autonomous Medical Evaluation for Guideline Adherence of Large Language Models
☆31 · Jun 10, 2026 · Updated last month
Toadoum / YouVersionBible-data-crawler-for-NMT
☆11 · May 30, 2025 · Updated last year
SPIRAL-MED / DiagnosisArena
☆33 · Jun 26, 2026 · Updated last month
eval4nlp / SharedTask2023
☆11 · Jul 6, 2024 · Updated 2 years ago
UCSC-VLAA / o1_medical
☆48 · Feb 26, 2025 · Updated last year
KCL-BMEIS / ScribbleDA
☆22 · Jun 18, 2021 · Updated 5 years ago
dmis-lab / ANGEL
Learning from Negative samples for Biomedical Generative Entity Linking
☆18 · May 25, 2025 · Updated last year
kaiko-ai / Midnight
Midnight - Pathology foundation models trained on orders of magnitude fewer WSIs
☆38 · Nov 22, 2025 · Updated 8 months ago
yangkevin2 / emnlp2020-stream-beam-mt
☆13 · Oct 17, 2020 · Updated 5 years ago
alejandro-lozano-dev / open_clip_with_biomedica
[CVPR 2025] Custom Open CLIP repo to train biomedical CLIP models
☆38 · Mar 23, 2025 · Updated last year
ChenyuHeidiZhang / VL-commonsense
☆14 · May 23, 2022 · Updated 4 years ago
ruyimarone / data-portraits
Documenting large text datasets 🖼️ 📚
☆14 · Dec 17, 2024 · Updated last year
cf020031308 / LinkDist
Distillation Self-Knowledge From Contrastive Links to Classify Graph Nodes Without Passing Messages.
☆15 · Jun 17, 2021 · Updated 5 years ago
dmis-lab / ETHIC
[NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
☆16 · Sep 2, 2025 · Updated 10 months ago
willsheffler / worms
Protein Origami via Genetic Fusions
☆18 · Sep 19, 2022 · Updated 3 years ago
favour-nerrise / xGW-GAT
An Explainable Geometric-Weighted Graph Attention Network (xGW-GAT) for Identifying Functional Networks Associated with Gait Impairment
☆17 · Dec 5, 2024 · Updated last year
DeweiHu / AdaptDiff
Weak conditional diffusion for domain adaptation
☆12 · Nov 4, 2024 · Updated last year
ExaNLP / sket
This repository contains the source code for the Semantic Knowledge Extractor Tool (SKET). SKET is an unsupervised hybrid knowledge extra…
☆13 · Apr 18, 2023 · Updated 3 years ago