facebookresearch / MemoryMosaics
Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.
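The description above can be illustrated with a minimal sketch of a single associative-memory lookup: stored (key, value) pairs are recalled by weighting values according to query–key similarity. The function name, the dot-product similarity, and the softmax-style kernel are illustrative assumptions for this sketch, not the repository's actual API.

```python
import numpy as np

def associative_memory_recall(query, keys, values, beta=1.0):
    """Recall from an associative memory of stored (key, value) pairs.

    Values are combined with weights proportional to exp(beta * <query, key>),
    i.e. a softmax over key similarities. The kernel choice is illustrative.
    """
    scores = beta * keys @ query             # similarity of query to each stored key
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ values                  # convex combination of stored values

# Store three key/value pairs, then query near the first key:
keys = np.eye(3)
values = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
out = associative_memory_recall(np.array([5.0, 0.0, 0.0]), keys, values)
# out is dominated by the first stored value, [1.0, 0.0]
```

A network of such memories, each with its own learned keys and values, can be trained jointly on a prediction task.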
☆37 · Updated 3 months ago
Alternatives and similar repositories for MemoryMosaics:
Users interested in MemoryMosaics are comparing it to the repositories listed below.
- The repository contains code for Adaptive Data Optimization ☆21 · Updated last month
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆23 · Updated 4 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 · Updated last month
- Triton implementation of the HyperAttention algorithm ☆46 · Updated last year
- Using FlexAttention to compute attention with different masking patterns ☆40 · Updated 3 months ago
- Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆25 · Updated 9 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs ☆90 · Updated 2 months ago
- Code for the NeurIPS 2024 Spotlight "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆66 · Updated 2 months ago
- Universal Neurons in GPT2 Language Models ☆27 · Updated 7 months ago
- nanoGPT-like codebase for LLM training ☆83 · Updated this week
- Open-source replication of Anthropic's Crosscoders for model diffing ☆28 · Updated 2 months ago
- A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton ☆61 · Updated 5 months ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆28 · Updated last year
- Experiments for efforts to train a new and improved T5 ☆77 · Updated 9 months ago
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) ☆79 · Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆96 · Updated 9 months ago
- Sparse and discrete interpretability tool for neural networks ☆58 · Updated 11 months ago
- Simple and efficient PyTorch-native transformer training and inference (batched) ☆66 · Updated 9 months ago
- Efficient scaling laws and collaborative pretraining ☆13 · Updated 2 months ago