Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆349 · Oct 22, 2024 · Updated last year
Alternatives and similar repositories for evo-memory
Users who are interested in evo-memory are comparing it to the libraries listed below
- CycleQD is a framework for parameter space model merging. ☆48 · Feb 1, 2025 · Updated last year
- Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models" ☆120 · Oct 6, 2025 · Updated 4 months ago
- Automating the Search for Artificial Life with Foundation Models! ☆450 · Oct 23, 2025 · Updated 4 months ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,189 · Jan 30, 2025 · Updated last year
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism" ☆14 · Apr 30, 2025 · Updated 10 months ago
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models ☆192 · Jun 13, 2024 · Updated last year
- Official repository of Evolutionary Optimization of Model Merging Recipes ☆1,399 · Nov 29, 2024 · Updated last year
- ☆16 · Jul 16, 2024 · Updated last year
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆372 · Dec 12, 2024 · Updated last year
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning ☆140 · Updated this week
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Apr 17, 2024 · Updated last year
- The original Shared Recurrent Memory Transformer implementation ☆33 · Jul 11, 2025 · Updated 7 months ago
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression." ☆18 · Dec 13, 2024 · Updated last year
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment ☆16 · Aug 6, 2024 · Updated last year
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens. ☆280 · Oct 28, 2025 · Updated 4 months ago
- DeMo: Decoupled Momentum Optimization ☆198 · Dec 2, 2024 · Updated last year
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆251 · Jan 31, 2025 · Updated last year
- Various agents from all of the top agent frameworks to integrate into swarms! Langchain, Griptape, CrewAI, and more! ☆18 · Dec 22, 2025 · Updated 2 months ago
- ☆15 · Mar 2, 2025 · Updated last year
- Continuous Thought Machines, because thought takes time and reasoning is a process. ☆1,775 · Dec 29, 2025 · Updated 2 months ago
- This repository is the official implementation of the TRAC optimizer in Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement … ☆32 · May 2, 2025 · Updated 10 months ago
- Learning to route instances for Human vs AI Feedback (ACL Main '25) ☆27 · Jul 23, 2025 · Updated 7 months ago
- Tools for merging pretrained large language models. ☆6,814 · Jan 26, 2026 · Updated last month
- HGRN2: Gated Linear RNNs with State Expansion ☆56 · Aug 20, 2024 · Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆203 · Jul 17, 2024 · Updated last year
- ☆14 · Mar 28, 2024 · Updated last year
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance… ☆156 · Apr 7, 2025 · Updated 10 months ago
- VPTQ, A Flexible and Extreme low-bit quantization algorithm ☆674 · Apr 25, 2025 · Updated 10 months ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆63 · Oct 9, 2024 · Updated last year
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆169 · Aug 25, 2025 · Updated 6 months ago
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆226 · Sep 18, 2025 · Updated 5 months ago
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025). ☆73 · Dec 26, 2024 · Updated last year
- Entropy Based Sampling and Parallel CoT Decoding ☆3,434 · Nov 13, 2024 · Updated last year
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆87 · Sep 12, 2025 · Updated 5 months ago
- Official repository for the paper "ambigram generation by a diffusion model". ☆16 · Aug 9, 2023 · Updated 2 years ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation ☆28 · Aug 19, 2025 · Updated 6 months ago
- An AI character interaction system with emotional modeling and advanced memory management ☆17 · Oct 26, 2024 · Updated last year
- train with kittens! ☆63 · Oct 25, 2024 · Updated last year
- Official repo of paper LM2 ☆47 · Feb 13, 2025 · Updated last year