Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆358Oct 22, 2024Updated last year
Alternatives and similar repositories for evo-memory
Users that are interested in evo-memory are comparing it to the libraries listed below.
- CycleQD is a framework for parameter space model merging.☆49Feb 1, 2025Updated last year
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated last year
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆197Jun 13, 2024Updated 2 years ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!☆1,215Jan 30, 2025Updated last year
- Automating the Search for Artificial Life with Foundation Models!☆473Oct 23, 2025Updated 7 months ago
- Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"☆122Oct 6, 2025Updated 8 months ago
- Official repository of Evolutionary Optimization of Model Merging Recipes☆1,425Nov 29, 2024Updated last year
- Learning to route instances for Human vs AI Feedback (ACL Main '25)☆29Jul 23, 2025Updated 10 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…☆377Dec 12, 2024Updated last year
- Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP☆11Jan 29, 2024Updated 2 years ago
- ☆16Jul 16, 2024Updated last year
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning☆149Feb 25, 2026Updated 3 months ago
- Fluid Language Model Benchmarking☆30Sep 16, 2025Updated 8 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆31May 19, 2026Updated 3 weeks ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 9 months ago
- Various agents from all of the top agent frameworks to integrate into swarms! Langchain, Griptape, CrewAI, and more!☆18Dec 22, 2025Updated 5 months ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆32Apr 9, 2025Updated last year
- Tools for merging pretrained large language models.☆7,126May 6, 2026Updated last month
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment☆16Aug 6, 2024Updated last year
- An AI character interaction system with emotional modeling and advanced memory management☆17Oct 26, 2024Updated last year
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025).☆78Dec 26, 2024Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated 2 years ago
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression."☆18Dec 13, 2024Updated last year
- Continuous Thought Machines, because thought takes time and reasoning is a process.☆1,940Dec 29, 2025Updated 5 months ago
- DeMo: Decoupled Momentum Optimization☆201Dec 2, 2024Updated last year
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with …☆64Oct 9, 2024Updated last year
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"☆259Jan 31, 2025Updated last year
- OMNI: Open-endedness via Models of human Notions of Interestingness☆65Jan 28, 2025Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]☆21Feb 27, 2025Updated last year
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.☆287Oct 28, 2025Updated 7 months ago
- Repo to reproduce the First-Explore paper results☆39May 6, 2026Updated last month
- Evaluating major LLMs on the Abstraction and Reasoning Corpus☆17Nov 9, 2023Updated 2 years ago
- ☆93Aug 18, 2024Updated last year
- train with kittens!☆65Oct 25, 2024Updated last year
- ☆15Jul 5, 2024Updated last year
- The code repository of the paper: Competition and Attraction Improve Model Fusion☆171Aug 25, 2025Updated 9 months ago
- ☆15Mar 2, 2025Updated last year
- Entropy Based Sampling and Parallel CoT Decoding☆3,435Nov 13, 2024Updated last year
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆175Jan 16, 2025Updated last year