wshi83 / MedAgentGym
This is the official repository for the paper "MedAgentGYM: Training LLM Agents for Code-Based Medical Reasoning at Scale"
☆30 Updated this week
Alternatives and similar repositories for MedAgentGym
Users who are interested in MedAgentGym are comparing it to the repositories listed below
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆35 Updated 3 months ago
- ☆37 Updated 6 months ago
- ☆48 Updated 4 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆46 Updated 2 months ago
- [NeurIPS 2024 D&B Track, Spotlight] UltraMedical: Building Specialized Generalists in Biomedicine ☆89 Updated 9 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆74 Updated 3 months ago
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ☆60 Updated 2 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆113 Updated 3 weeks ago
- ☆27 Updated 5 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark. ☆16 Updated 3 months ago
- ☆17 Updated 3 months ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024] ☆59 Updated 6 months ago
- A trainable user simulator ☆34 Updated 2 weeks ago
- ☆14 Updated 6 months ago
- Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge ☆54 Updated last week
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆115 Updated 8 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆137 Updated last week
- The official Github repository for paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Fin… ☆33 Updated 7 months ago
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents ☆190 Updated 3 weeks ago
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆40 Updated last month
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis ☆81 Updated last month
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆38 Updated 4 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆116 Updated last year
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization" ☆79 Updated last month
- Official repository for RAG-Gym ☆106 Updated 4 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆57 Updated 8 months ago
- The official implementation of SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning ☆16 Updated 2 months ago
- ☆44 Updated 3 weeks ago
- Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards ☆35 Updated 3 weeks ago
- ☆19 Updated 4 months ago