AgenTracer: A Lightweight Failure Attributor for Agentic Systems
☆79Nov 12, 2025Updated 3 months ago
Alternatives and similar repositories for AgenTracer
Users who are interested in AgenTracer are comparing it to the repositories listed below.
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- ☆20Updated this week
- ☆15May 14, 2025Updated 9 months ago
- This is the official code repository for the paper: Towards General Continuous Memory for Vision-Language Models.☆21Jul 3, 2025Updated 8 months ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C…☆17May 27, 2024Updated last year
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆32Feb 26, 2026Updated last week
- Official code repository of Shuffle-R1☆25Feb 23, 2026Updated 2 weeks ago
- ☆18Nov 3, 2025Updated 4 months ago
- ☆26May 13, 2025Updated 9 months ago
- ☆17Feb 4, 2025Updated last year
- Official Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach"☆20Jun 13, 2025Updated 8 months ago
- This repository contains code and datasets for our paper on the effects of document multiplicity while the context size is fixed in Retri…☆18Mar 13, 2025Updated 11 months ago
- Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning☆28Jul 14, 2025Updated 7 months ago
- Defect Library for LLM-enabled Software☆23Dec 31, 2025Updated 2 months ago
- ☆23Jul 2, 2025Updated 8 months ago
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".☆23Oct 22, 2025Updated 4 months ago
- ☆25Nov 19, 2025Updated 3 months ago
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning☆46Jul 17, 2025Updated 7 months ago
- ☆46Jun 24, 2025Updated 8 months ago
- (ACL 2025 Main) Code for MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents https://www.arxiv.org/pdf/2503.019…☆38Jun 21, 2025Updated 8 months ago
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models☆33May 21, 2025Updated 9 months ago
- A Text2SQL benchmark for evaluation of Large Language Models☆41Updated this week
- [ICLR 2025] ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning https://arxiv.org/abs/2501.06590☆79Jul 31, 2025Updated 7 months ago
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025)☆30Jun 14, 2024Updated last year
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data☆33Apr 7, 2025Updated 11 months ago
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios☆16Oct 18, 2024Updated last year
- Sotopia-RL: Reward Design for Social Intelligence☆46Jan 29, 2026Updated last month
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding"☆59Jan 23, 2026Updated last month
- Auditing agents for fine-tuning safety☆20Oct 21, 2025Updated 4 months ago
- ☆18Jun 10, 2025Updated 8 months ago
- FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models☆10Dec 21, 2025Updated 2 months ago
- (ICLR 2026) Optimas: Optimizing Compound AI Systems☆75Feb 6, 2026Updated last month
- ☆40Aug 6, 2025Updated 7 months ago
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment☆35Jul 1, 2024Updated last year
- [CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents☆59May 26, 2025Updated 9 months ago
- ☆23Feb 10, 2026Updated 3 weeks ago
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…☆39Sep 22, 2024Updated last year
- Clone of JSAI static analysis framework☆13Jul 29, 2017Updated 8 years ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆42Feb 27, 2026Updated last week