bingreeky / AgenTracer
AgenTracer: A Lightweight Failure Attributor for Agentic Systems
☆53Updated 2 weeks ago
Alternatives and similar repositories for AgenTracer
Users who are interested in AgenTracer are comparing it to the repositories listed below.
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent☆65Updated 4 months ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆76Updated 3 weeks ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆33Updated last year
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models☆52Updated 10 months ago
- ☆37Updated last month
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆30Updated 4 months ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆56Updated 3 months ago
- ☆127Updated last month
- ☆50Updated 7 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆130Updated 7 months ago
- ☆78Updated last month
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆114Updated 5 months ago
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning☆122Updated 3 weeks ago
- WideSearch: Benchmarking Agentic Broad Info-Seeking☆95Updated this week
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew…☆46Updated 8 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆82Updated 4 months ago
- ☆104Updated 10 months ago
- ☆89Updated 4 months ago
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks☆94Updated 3 weeks ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆40Updated 7 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆63Updated 4 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆88Updated 6 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆104Updated 6 months ago
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"☆111Updated last month
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents☆44Updated 3 months ago
- The code for LaRA Benchmark☆43Updated 4 months ago
- ☆38Updated last month
- ☆33Updated 4 months ago
- ☆18Updated 3 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆138Updated 3 months ago