Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆357Oct 22, 2024Updated last year
Alternatives and similar repositories for evo-memory
Users who are interested in evo-memory are comparing it to the libraries listed below.
- CycleQD is a framework for parameter space model merging.☆49Feb 1, 2025Updated last year
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated last year
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆196Jun 13, 2024Updated last year
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!☆1,212Jan 30, 2025Updated last year
- Automating the Search for Artificial Life with Foundation Models!☆470Oct 23, 2025Updated 7 months ago
- Official repository of Evolutionary Optimization of Model Merging Recipes☆1,420Nov 29, 2024Updated last year
- Learning to route instances for Human vs AI Feedback (ACL Main '25)☆29Jul 23, 2025Updated 10 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…☆376Dec 12, 2024Updated last year
- Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP☆11Jan 29, 2024Updated 2 years ago
- ☆16Jul 16, 2024Updated last year
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning☆150Feb 25, 2026Updated 3 months ago
- Fluid Language Model Benchmarking☆29Sep 16, 2025Updated 8 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆31Updated this week
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 9 months ago
- Various agents from all of the top agent frameworks to integrate into swarms! Langchain, Griptape, CrewAI, and more!☆18Dec 22, 2025Updated 5 months ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆32Apr 9, 2025Updated last year
- Tools for merging pretrained large language models.☆7,083May 6, 2026Updated 2 weeks ago
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment☆16Aug 6, 2024Updated last year
- An AI character interaction system with emotional modeling and advanced memory management☆17Oct 26, 2024Updated last year
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025).☆76Dec 26, 2024Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated 2 years ago
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression."☆18Dec 13, 2024Updated last year
- Continuous Thought Machines, because thought takes time and reasoning is a process.☆1,874Dec 29, 2025Updated 4 months ago
- DeMo: Decoupled Momentum Optimization☆201Dec 2, 2024Updated last year
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with …☆64Oct 9, 2024Updated last year
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"☆258Jan 31, 2025Updated last year
- OMNI: Open-endedness via Models of human Notions of Interestingness☆65Jan 28, 2025Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.☆286Oct 28, 2025Updated 6 months ago
- Repo to reproduce the First-Explore paper results☆39May 6, 2026Updated 2 weeks ago
- Evaluating major LLMs on the Abstraction and Reasoning Corpus☆17Nov 9, 2023Updated 2 years ago
- ☆92Aug 18, 2024Updated last year
- ☆14Jul 5, 2024Updated last year
- train with kittens!☆66Oct 25, 2024Updated last year
- The code repository of the paper: Competition and Attraction Improve Model Fusion☆171Aug 25, 2025Updated 9 months ago
- ☆15Mar 2, 2025Updated last year
- Entropy Based Sampling and Parallel CoT Decoding☆3,435Nov 13, 2024Updated last year
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆176Jan 16, 2025Updated last year
- HGRN2: Gated Linear RNNs with State Expansion☆57Aug 20, 2024Updated last year