A library for benchmarking the long-term memory and continual learning capabilities of LLM-based agents, with all the tests and code you need to evaluate your own agents. See more in the blogpost:
☆83Dec 17, 2024Updated last year
Alternatives and similar repositories for goodai-ltm-benchmark
Users who are interested in goodai-ltm-benchmark are comparing it to the libraries listed below.
- A Python library for long-term memory in language models. Improve conversational scenarios and create autonomous learning agents with enh…☆42Feb 5, 2025Updated last year
- LMQL implementation of tree of thoughts☆36Jan 31, 2024Updated 2 years ago
- EmotionCircuits-LLM: A complete, reproducible framework for discovering and controlling emotion circuits in large language models.☆25Oct 20, 2025Updated 4 months ago
- ☆10Nov 6, 2024Updated last year
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- A prototype agent with the purpose of evaluating the performance of a Large Language Model within a python terminal.☆13Aug 28, 2023Updated 2 years ago
- ☆12Mar 18, 2024Updated last year
- code for training and using chess embeddings models☆13Jun 9, 2024Updated last year
- Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner☆30Jun 27, 2024Updated last year
- Benchmarking LLM Inference Speeds☆13Feb 4, 2026Updated last month
- POC integration Airbyte+Dagster+Langchain☆13Jun 1, 2023Updated 2 years ago
- Track the progress of LLM context utilisation☆55Apr 14, 2025Updated 10 months ago
- LLM-Powered Data Discovery System for Tabular Data☆24Jul 14, 2025Updated 7 months ago
- Salesforce AI Research's open diffusion language model☆59Oct 29, 2025Updated 4 months ago
- Design a full DOF humanoid robot for research use incrementally, halving the cost until the BOM is $1000.☆21May 30, 2025Updated 9 months ago
- Fork of Flame repo for training of some new stuff in development☆19Updated this week
- Training GPTs to solve interaction nets☆18Aug 14, 2024Updated last year
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.☆19Dec 27, 2024Updated last year
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions.☆20Jun 3, 2024Updated last year
- ☆19Aug 7, 2024Updated last year
- An extensible CLI for integrating LLM models with a flexible scripting system☆22Jul 1, 2024Updated last year
- paris - world's first decentralized trained open-weight diffusion model☆53Oct 7, 2025Updated 4 months ago
- Code repo for CLERC: A Legal Precedent Dataset for Case Retrieval and Retrieval-Augmented Analysis Generation (NAACL 2025)☆25Jan 28, 2025Updated last year
- ☆87Dec 15, 2023Updated 2 years ago
- Because every team needs a townie! Enjoy ChatGPT in your Slack workspace☆22Jan 31, 2024Updated 2 years ago
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling'☆27May 16, 2025Updated 9 months ago
- The next evolution of Agents☆48Feb 9, 2026Updated 3 weeks ago
- Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076.☆25Jan 23, 2024Updated 2 years ago
- ☆63Oct 3, 2024Updated last year
- Code for the 2025 ACL publication "Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs"☆32Jun 25, 2025Updated 8 months ago
- train with kittens!☆63Oct 25, 2024Updated last year
- ☆11Oct 27, 2022Updated 3 years ago
- Verbosity control for AI agents☆66May 23, 2024Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆115Apr 9, 2025Updated 10 months ago
- ☆32Jul 12, 2024Updated last year
- Ace interviews with AI practice. Our agent role-plays personalized interview tailored to your background, listening and replying like a r…☆123Jun 9, 2024Updated last year
- ☆32Jul 8, 2024Updated last year
- ChatBot Plus is a chat interface for large language models which allows you to chat with OpenAI, Cohere, Together, or Local (Alpaca, Llam…☆32Apr 10, 2023Updated 2 years ago
- The repository contains generative AI analytics platform application code.☆29Sep 25, 2025Updated 5 months ago