bingreeky / AgenTracer
AgenTracer: A Lightweight Failure Attributor for Agentic Systems
☆60Updated last month
Alternatives and similar repositories for AgenTracer
Users who are interested in AgenTracer are comparing it to the libraries listed below.
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent☆67Updated 7 months ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆80Updated last month
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆31Updated 6 months ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents☆230Updated 2 weeks ago
- ☆52Updated 9 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆136Updated 9 months ago
- ☆26Updated last year
- ☆86Updated 3 months ago
- Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework☆141Updated 2 weeks ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆59Updated 6 months ago
- ☆45Updated 3 months ago
- ☆182Updated last month
- ☆33Updated 6 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆33Updated last year
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆66Updated 6 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning☆293Updated last month
- ☆38Updated 3 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆123Updated 8 months ago
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆93Updated last month
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆91Updated last month
- ☆95Updated last month
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search☆63Updated 5 months ago
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"☆117Updated 3 weeks ago
- ☆171Updated last week
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks☆117Updated 3 weeks ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆152Updated 5 months ago
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents☆46Updated 5 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents☆291Updated 2 months ago
- WideSearch: Benchmarking Agentic Broad Info-Seeking☆103Updated 2 months ago
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions.☆51Updated 4 months ago