MemTensor/HaluMem

HaluMem is the first operation-level hallucination evaluation benchmark tailored to agent memory systems.

☆148

Alternatives and similar repositories for HaluMem

Users who are interested in HaluMem are comparing it to the repositories listed below.

IAAR-Shanghai / SEAP
☆23 · Jun 10, 2025 · Updated last year
IAAR-Shanghai / PGRAG
PGRAG
☆53 · Jul 16, 2024 · Updated 2 years ago
IAAR-Shanghai / Awesome-AI-Memory
Awesome AI Memory | LLM Memory | A curated knowledge base on AI memory for LLMs and agents, covering long-term memory, reasoning, retriev…
☆1,105 · Jul 14, 2026 · Updated last week
intuit-ai-research / REMem
☆29 · Feb 27, 2026 · Updated 5 months ago
IAAR-Shanghai / NewsBench
[ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Jou…
☆34 · Jun 25, 2024 · Updated 2 years ago
THUIR / MemoryBench
Code for MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems
☆87 · Jun 27, 2026 · Updated last month
xiaowu0162 / LongMemEval
Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)
☆966 · May 11, 2026 · Updated 2 months ago
IAAR-Shanghai / SafeRAG
☆61 · Mar 11, 2025 · Updated last year
snap-research / locomo
☆1,047 · Aug 13, 2024 · Updated last year
wangyu-ustc / Mem-alpha
The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning"
☆218 · Dec 25, 2025 · Updated 7 months ago
zjunlp / LightMem
[ICLR 2026] LightMem: Lightweight and Efficient Memory-Augmented Generation
☆1,033 · Updated this week
MemTensor / MemFactory
☆37 · May 25, 2026 · Updated 2 months ago
AGI-Eval-Official / amemgym
☆40 · Apr 7, 2026 · Updated 3 months ago
nemori-ai / nemori
A minimalist MVP demonstrating a simple yet profound insight: aligning AI memory with human episodic memory granularity. Shows how this s…
☆207 · Apr 16, 2026 · Updated 3 months ago
IAAR-Shanghai / xVerify
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
☆148 · Nov 13, 2025 · Updated 8 months ago
OpenDataBox / MemoryData
A Unified Memory Benchmark Suite for Memory-Augmented Agents
☆120 · Jul 5, 2026 · Updated 3 weeks ago
HUST-AI-HYZ / MemoryAgentBench
Open source code for ICLR 2026 Paper: Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
☆409 · May 21, 2026 · Updated 2 months ago
bowen-upenn / PersonaMem
[COLM 2025] Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
☆175 · Mar 19, 2026 · Updated 4 months ago
IAAR-Shanghai / FastMem
Fast Memorization of Prompt Improves Context Awareness of Large Language Models (Findings of EMNLP 2024)
☆22 · Oct 22, 2024 · Updated last year
FredJiang0324 / Anatomy-of-Agentic-Memory
☆25 · Apr 8, 2026 · Updated 3 months ago
MIT-MI / MEM1
☆325 · Jan 3, 2026 · Updated 6 months ago
IAAR-Shanghai / Grimoire
Grimoire is All You Need for Enhancing Large Language Models
☆120 · Feb 29, 2024 · Updated 2 years ago
AvatarMemory / RealMemBench
☆47 · Apr 7, 2026 · Updated 3 months ago
MingyuJ666 / Disentangling-Memory-and-Reasoning
[ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.
☆87 · Nov 2, 2025 · Updated 8 months ago
MemTensor / skills-vote
SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution
☆293 · Jul 7, 2026 · Updated 2 weeks ago
zjunlp / MemBase
A Comprehensive Benchmarking Framework for Long-Term Conversational Memory Layers
☆43 · Jun 29, 2026 · Updated 3 weeks ago
IAAR-Shanghai / DATG
[ACL 2024] Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs
☆40 · Sep 24, 2024 · Updated last year
zjunlp / Chat2Workflow
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language
☆38 · May 27, 2026 · Updated 2 months ago
MemTensor / MemOS
Self-evolving memory OS for LLM & AI Agents: ultra-persistent memory, hybrid-retrieval, and cross-task skill reuse, with 35.24% token sav…
☆10,385 · Updated this week
BytedTsinghua-SIA / MemAgent
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
☆1,085 · May 12, 2026 · Updated 2 months ago
NeurAI-Lab / SCoMMER
The official PyTorch code for AAAI'23 Paper "Sparse Coding in a Dual Memory System for Lifelong Learning"
☆12 · Feb 15, 2023 · Updated 3 years ago
Ananyaiitbhilai / Text2Triple-LLM-Agent
[ESWC '24] This repo is official implementation for the paper "Towards Harnessing Large Language Models as Autonomous Agents for Semantic…
☆10 · May 25, 2024 · Updated 2 years ago
yanweiyue / Mem-T
Mem-T: Densifying Rewards for Long-Horizon Memory Agents
☆39 · Mar 22, 2026 · Updated 4 months ago
WujiangXu / A-mem
The code for NeurIPS 2025 paper "A-Mem: Agentic Memory for LLM Agents"
☆929 · Mar 5, 2026 · Updated 4 months ago
IAAR-Shanghai / ICSFSurvey
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin…
☆173 · Dec 7, 2024 · Updated last year
agiresearch / A-mem
A-MEM: Agentic Memory for LLM Agents
☆1,125 · Dec 12, 2025 · Updated 7 months ago
bingreeky / MemEvolve
[ICML'26] MemEvolve & EvolveLab
☆255 · May 5, 2026 · Updated 2 months ago
IAAR-Shanghai / CRUD_RAG
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
☆400 · May 20, 2025 · Updated last year
whybe-choi / kovidore-benchmark
[ACL'26 Workshop] KoViDoRe: Korean Visual Document Retrieval Benchmark
☆24 · Jul 2, 2026 · Updated 3 weeks ago