SakanaAI / evo-memory
Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
☆273 · Updated 2 months ago
Alternatives and similar repositories for evo-memory:
Users who are interested in evo-memory are comparing it to the libraries listed below.
- A Self-adaptation Framework that adapts LLMs for unseen tasks in real-time! ☆343 · Updated this week
- PyTorch implementation of models from the Zamba2 series. ☆166 · Updated last month
- smolLM with Entropix sampler on pytorch ☆147 · Updated 2 months ago
- ☆96 · Updated 3 weeks ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆277 · Updated last month
- Fast parallel LLM inference for MLX ☆152 · Updated 6 months ago
- Open weights language model from Google DeepMind, based on Griffin. ☆614 · Updated 6 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆388 · Updated this week
- Visualize the intermediate output of Mistral 7B ☆333 · Updated 11 months ago
- DeMo: Decoupled Momentum Optimization ☆170 · Updated last month
- ☆152 · Updated last month
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆157 · Updated this week
- code for training & evaluating Contextual Document Embedding models ☆160 · Updated this week
- Alice in Wonderland code base for experiments and raw experiments data ☆110 · Updated 3 months ago
- GRadient-INformed MoE ☆261 · Updated 3 months ago
- Code for TrackTheMind ☆67 · Updated last month
- Long context evaluation for large language models ☆195 · Updated this week
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆831 · Updated last month
- Draw more samples ☆182 · Updated 6 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆188 · Updated 6 months ago
- ☆122 · Updated 4 months ago
- ☆115 · Updated this week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆298 · Updated 3 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 · Updated 3 months ago
- A compact LLM pretrained in 9 days by using high-quality data ☆279 · Updated last month
- ☆96 · Updated 3 months ago
- A library for making RepE control vectors ☆528 · Updated last week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆278 · Updated last month
- An Open Source Toolkit For LLM Distillation ☆425 · Updated last week
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆605 · Updated last month