SakanaAI / evo-memoryLinks

Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.

☆325

Alternatives and similar repositories for evo-memory

Users that are interested in evo-memory are comparing it to the libraries listed below

Sorting:

facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…
☆342Updated 10 months ago
jerber / lang-jepa
☆124Updated 10 months ago
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆185Updated 9 months ago
microsoft / GRIN-MoE
GRadient-INformed MoE
☆264Updated last year
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆108Updated 7 months ago
facebookresearch / ExploreToM
Code for ExploreTom
☆86Updated 3 months ago
SakanaAI / text-to-lora
Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input
☆900Updated 4 months ago
google-deepmind / regress-lm
Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r…
☆277Updated this week
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆172Updated 9 months ago
SinatrasC / entropix-smollm
smolLM with Entropix sampler on pytorch
☆150Updated 11 months ago
VsonicV / es-fine-tuning-paper
This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"
☆209Updated this week
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆297Updated 2 months ago
jxmorris12 / cde
code for training & evaluating Contextual Document Embedding models
☆198Updated 5 months ago
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆110Updated 6 months ago
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆202Updated last year
LucasPrietoAl / grokking-at-the-edge-of-numerical-stability
☆102Updated 3 months ago
huggingface / gpt-oss-recipes
Collection of scripts and notebooks for OpenAI's latest GPT OSS models
☆463Updated last month
LAION-AI / AIW
Alice in Wonderland code base for experiments and raw experiments data
☆131Updated last month
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆201Updated last week
adobe-research / dynasaur
Official repository for "DynaSaur: Large Language Agents Beyond Predefined Actions"
☆349Updated 10 months ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆145Updated 8 months ago
ekinakyurek / marc
Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"
☆330Updated 11 months ago
SakanaAI / natural_niches
The code repository of the paper: Competition and Attraction Improve Model Fusion
☆161Updated 2 months ago
alexzhang13 / rlm
Super basic implementation (gist-like) of RLMs with REPL environments.
☆132Updated last week
eqimp / hogwild_llm
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
☆127Updated 2 months ago
Pints-AI / 1.5-Pints
A compact LLM pretrained in 9 days by using high quality data
☆330Updated 6 months ago
SakanaAI / ShinkaEvolve
ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution
☆584Updated last week
QuixiAI / grokadamw
☆136Updated last year
NVlabs / hymba
☆201Updated 10 months ago
tokenbender / avataRL
rl from zero pretrain, can it be done? yes.
☆277Updated 3 weeks ago