SakanaAI / evo-memory
Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆311 · Updated 8 months ago
Alternatives and similar repositories for evo-memory
Users who are interested in evo-memory are comparing it to the libraries listed below.
- PyTorch implementation of models from the Zamba2 series. ☆182 · Updated 4 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆337 · Updated 6 months ago
- ☆178 · Updated 6 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆649 · Updated 2 weeks ago
- GRadient-INformed MoE ☆263 · Updated 8 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 · Updated 5 months ago
- ☆114 · Updated 5 months ago
- prime-rl is a codebase for decentralized async RL training at scale ☆341 · Updated this week
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆311 · Updated last month
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,100 · Updated 4 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆198 · Updated 11 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆108 · Updated last month
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆239 · Updated 4 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆140 · Updated 4 months ago
- Code for ExploreTom ☆83 · Updated 6 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆329 · Updated this week
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆486 · Updated last month
- Train your own SOTA deductive reasoning model ☆94 · Updated 3 months ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆498 · Updated this week
- smolLM with Entropix sampler on pytorch ☆150 · Updated 7 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆315 · Updated 7 months ago
- Self-Adapting Language Models ☆430 · Updated this week
- Build your own visual reasoning model ☆381 · Updated this week
- EvaByte: Efficient Byte-level Language Models at Scale ☆101 · Updated 2 months ago
- DeMo: Decoupled Momentum Optimization ☆188 · Updated 6 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆881 · Updated last month
- Getting crystal-like representations with harmonic loss ☆189 · Updated 2 months ago
- ☆132 · Updated 10 months ago
- Fast parallel LLM inference for MLX ☆192 · Updated 11 months ago
- Official repository for "DynaSaur: Large Language Agents Beyond Predefined Actions" ☆343 · Updated 6 months ago