GoodAI / goodai-ltm-benchmark
A library for benchmarking the long-term memory and continual learning capabilities of LLM-based agents, with all the tests and code you need to evaluate your own agents. See more in the blogpost:
☆64 Updated 2 months ago
Alternatives and similar repositories for goodai-ltm-benchmark:
Users who are interested in goodai-ltm-benchmark are comparing it to the libraries listed below.
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆89 Updated 3 weeks ago
- ☆65 Updated 8 months ago
- ☆48 Updated last year
- ☆48 Updated 3 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 6 months ago
- Easy to use, High Performant Knowledge Distillation for LLMs ☆46 Updated last month
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆71 Updated 4 months ago
- ☆80 Updated last month
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆24 Updated 7 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆58 Updated 7 months ago
- ☆74 Updated last year
- LLM reads a paper and produces a working prototype ☆48 Updated 2 weeks ago
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆37 Updated 9 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆130 Updated this week
- LMQL implementation of tree of thoughts ☆33 Updated last year
- Simple examples using Argilla tools to build AI ☆53 Updated 3 months ago
- Track the progress of LLM context utilisation ☆53 Updated 7 months ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents ☆116 Updated 8 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆64 Updated 7 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆100 Updated 2 months ago
- This repository explains and provides examples for "concept anchoring" in GPT4. ☆72 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆82 Updated 4 months ago
- ☆46 Updated 10 months ago
- Evaluating LLMs with CommonGen-Lite ☆88 Updated 11 months ago
- Code for ExploreTom ☆75 Updated 2 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆61 Updated 2 months ago
- Simple Graph Memory for AI applications ☆81 Updated 6 months ago
- ☆87 Updated last year