MemTensor / HaluMem
HaluMem is the first operation-level hallucination evaluation benchmark tailored to agent memory systems.
☆58 Updated last week
Alternatives and similar repositories for HaluMem
Users who are interested in HaluMem are comparing it to the libraries listed below.
- PGRAG ☆51 Updated last year
- Data Synthesis for Deep Research Based on Semi-Structured Data ☆179 Updated last week
- DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL ☆204 Updated last month
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search ☆62 Updated 4 months ago
- A Comprehensive Library for Memory of LLM-based Agents. ☆89 Updated 6 months ago
- HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model ☆30 Updated 9 months ago
- Data and Code for EMNLP 2025 Findings Paper "MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search" ☆79 Updated 2 weeks ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents ☆282 Updated last month
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew… ☆47 Updated 9 months ago
- SSRL: Self-Search Reinforcement Learning ☆152 Updated 3 months ago
- ☆346 Updated last year
- Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs ☆34 Updated last year
- [EMNLP 2024] OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs. ☆148 Updated last year
- ☆80 Updated last year
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ☆66 Updated 6 months ago
- ☆88 Updated 11 months ago
- The official Github repository for paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Fin… ☆38 Updated 11 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆112 Updated 5 months ago
- The evaluation benchmark on MCP servers ☆223 Updated 2 months ago
- ☆72 Updated last month
- The official repo for the code and data of paper SMART ☆37 Updated 9 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆133 Updated last year
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" ☆77 Updated 11 months ago
- [NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning ☆80 Updated 2 months ago
- Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025) ☆278 Updated 3 weeks ago
- The raw UserRL repo under construction ☆76 Updated last month
- Highly Efficient Query Rewriter for Passage Retrieval in the realm of Retrieval-Augmented Generation (RAG) ☆29 Updated 6 months ago
- [WWW 2025] A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System. ☆127 Updated 3 months ago
- This is the code repo for our paper "Enhancing Knowledge Integration and Utilization of Large Language Models via Constructivist Cognitio… ☆110 Updated last month
- Test-time compute in information retrieval ☆46 Updated 4 months ago