BlueZeros / ReflecTool
Benchmark, Toolbox, and Reflection-based Method for Clinical Agent
☆12 · Updated last year
Alternatives and similar repositories for ReflecTool
Users who are interested in ReflecTool are comparing it to the repositories listed below.
- DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue ☆38 · Updated last month
- [ICML 2025] MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding ☆128 · Updated 4 months ago
- [NeurIPS 2024 Datasets and Benchmark Track Oral] MedCalc-Bench: Evaluating Large Language Models for Medical Calculations ☆76 · Updated last week
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆92 · Updated last year
- MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning ☆65 · Updated last month
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆83 · Updated 11 months ago
- A Paper collection for LLM based Patient Simulators ☆72 · Updated 2 months ago
- MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks ☆32 · Updated last month
- Official implementation for NeurIPS'24 paper: MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making ☆205 · Updated last year
- [npj digital medicine] The official codes for "Towards Evaluating and Building Versatile Large Language Models for Medicine" ☆73 · Updated 6 months ago
- [Nature Communications] The official code for "Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases". ☆38 · Updated 2 weeks ago
- [EMNLP'24] MedAdapter: Efficient Test-Time Adaptation of Large Language Models Towards Medical Reasoning ☆34 · Updated 10 months ago
- ☆68 · Updated 9 months ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆184 · Updated last year
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language … ☆139 · Updated 6 months ago
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆44 · Updated 7 months ago
- Implementation of the paper "ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability" ☆50 · Updated 5 months ago
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents ☆46 · Updated 4 months ago
- [arxiv'25] MedAgentGYM: Training LLM Agents for Code-Based Medical Reasoning at Scale ☆66 · Updated 3 months ago
- ☆12 · Updated 8 months ago
- ☆32 · Updated 9 months ago
- ☆48 · Updated 8 months ago
- ☆59 · Updated last year
- ☆165 · Updated last month
- ☆17 · Updated last year
- ☆284 · Updated 4 months ago
- [ICLR'25] MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models ☆273 · Updated 9 months ago
- ☆69 · Updated 5 months ago
- Collection of latest papers and materials in the area of RLVR! ☆37 · Updated 3 weeks ago
- Official repository of the MIRAGE benchmark ☆181 · Updated last year