facebookresearch / MemoryMosaicsLinks

Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.

☆45

Alternatives and similar repositories for MemoryMosaics

Users that are interested in MemoryMosaics are comparing it to the libraries listed below

Sorting:

berlino / seq_icl
☆53Updated last year
RobertCsordas / moeut
☆82Updated 10 months ago
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆123Updated 7 months ago
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆75Updated 8 months ago
ScalingIntelligence / large_language_monkeys
☆97Updated 9 months ago
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆77Updated 7 months ago
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆147Updated 2 weeks ago
sjelassi / transformers_ssm_copy
☆32Updated last year
mcleish7 / gemstone-scaling-laws
☆27Updated 5 months ago
taufeeque9 / codebook-features
Sparse and discrete interpretability tool for neural networks
☆63Updated last year
apple / ml-planner
☆53Updated last year
dayal-kalra / low-memory-adam
☆11Updated 4 months ago
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆103Updated 2 months ago
katiekang1998 / reasoning_generalization
☆33Updated 6 months ago
dvruette / barrel-rec-pytorch
☆53Updated last year
kyleliang919 / Online-Subspace-Descent
[NeurIPS 2024] Low rank memory efficient optimizer without SVD
☆30Updated 2 weeks ago
ml-jku / EVA
One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation
☆40Updated 9 months ago
Asap7772 / understanding-rlhf
Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…
☆29Updated last year
huyphan168 / PEER
Mixture of A Million Experts
☆46Updated 11 months ago
google-deepmind / spectral_ssm
☆32Updated last year
abhishekpanigrahi1996 / transformer_in_transformer
☆45Updated last year
jopetty / word-problem
Experiments on the impact of depth in transformers and SSMs.
☆32Updated 8 months ago
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30Updated last year
EleutherAI / improved-t5
Experiments for efforts to train a new and improved t5
☆76Updated last year
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆97Updated last year
goombalab / phi-mamba
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…
☆110Updated 10 months ago
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆128Updated 9 months ago
gregorbachmann / Next-Token-Failures
☆87Updated last year
amirzandieh / HyperAttention
Triton Implementation of HyperAttention Algorithm
☆48Updated last year
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆219Updated last month