yuzhu-cai / rSDE-Bench
☆22Updated 3 weeks ago
Alternatives and similar repositories for rSDE-Bench
Users who are interested in rSDE-Bench are comparing it to the libraries listed below.
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆143Updated 7 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆57Updated 8 months ago
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents☆96Updated 2 months ago
- ☆61Updated this week
- e☆36Updated 2 months ago
- Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization☆148Updated last year
- ☆28Updated 3 weeks ago
- ☆136Updated 6 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆97Updated 4 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".☆54Updated 6 months ago
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System"☆59Updated 7 months ago
- The repository for paper "DebugBench: Evaluating Debugging Capability of Large Language Models".☆77Updated 11 months ago
- Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning☆155Updated this week
- [ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet☆84Updated last week
- ☆66Updated 3 months ago
- 🔥🔥🔥 ICLR 2025 Oral. Automating Agentic Workflow Generation.☆130Updated 3 weeks ago
- Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?☆28Updated 2 weeks ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆98Updated last month
- A research repo for experiments about Reinforcement Finetuning☆48Updated 2 months ago
- Open Source Implementation of Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evo…☆31Updated this week
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆60Updated last month
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…☆45Updated last year
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis☆64Updated 2 weeks ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets"☆57Updated last year
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning☆227Updated 5 months ago
- [NeurIPS 2024] Agent Planning with World Knowledge Model☆141Updated 6 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆75Updated 2 weeks ago
- MPO: Boosting LLM Agents with Meta Plan Optimization☆58Updated 3 months ago
- InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks (ICML 2024)☆134Updated 3 weeks ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆26Updated last year