GoodAI / goodai-ltm-benchmark
A library for benchmarking the long-term memory and continual learning capabilities of LLM-based agents, with all the tests and code you need to evaluate your own agents. See more in the blogpost:
☆74 · Updated 6 months ago
Alternatives and similar repositories for goodai-ltm-benchmark
Users who are interested in goodai-ltm-benchmark are comparing it to the libraries listed below.
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆84 · Updated 9 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 · Updated 5 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆67 · Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆106 · Updated 7 months ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents ☆124 · Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 · Updated 11 months ago
- Track the progress of LLM context utilisation ☆55 · Updated 3 months ago
- A framework for optimizing DSPy programs with RL ☆89 · Updated this week
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆82 · Updated 3 months ago
- Accompanying material for the sleep-time compute paper ☆97 · Updated 2 months ago
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language. ☆125 · Updated 11 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**. ☆42 · Updated 2 months ago
- EcoAssistant: using LLM assistant more affordably and accurately ☆132 · Updated last year
- Automating enterprise workflows with multimodal agents ☆108 · Updated 9 months ago
- LLM reads a paper and produces a working prototype ☆58 · Updated 3 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆71 · Updated 7 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 · Updated 6 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆62 · Updated 11 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆50 · Updated 2 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆173 · Updated 4 months ago
- Train your own SOTA deductive reasoning model ☆96 · Updated 4 months ago
- A strongly typed Python DSL for developing message passing multi agent systems ☆53 · Updated last year
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆115 · Updated 5 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆58 · Updated 7 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆140 · Updated 4 months ago