facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense feed-forward layers, providing dedicated capacity to store and retrieve information cheaply.
☆360 · Updated last year
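The description above boils down to a sparse key-value lookup: a token's hidden state is projected to a query, scored against a large table of learned keys, and only the top-k matching values are read and summed back into the residual stream. Below is a minimal PyTorch sketch of that idea; the class name `SimpleMemoryLayer` and all hyperparameters are illustrative assumptions, not the facebookresearch/memory API, and the actual repository uses product-key lookup and optimized kernels rather than the naive full-key scoring shown here.

```python
# Minimal sketch of a trainable key-value memory layer (hypothetical, simplified).
# The real facebookresearch/memory implementation uses product-key lookup and
# custom kernels; this only illustrates the core idea: a sparse top-k read over
# a large learned value table adds parameters without many extra FLOPs.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMemoryLayer(nn.Module):
    def __init__(self, dim: int, num_keys: int = 4096, topk: int = 8):
        super().__init__()
        self.query_proj = nn.Linear(dim, dim)            # hidden state -> query
        self.keys = nn.Parameter(torch.randn(num_keys, dim) * dim ** -0.5)
        self.values = nn.Embedding(num_keys, dim)        # large value table, read sparsely
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        q = self.query_proj(x)                           # (B, S, D)
        scores = q @ self.keys.t()                       # (B, S, num_keys)
        top_scores, top_idx = scores.topk(self.topk, dim=-1)
        weights = F.softmax(top_scores, dim=-1)          # (B, S, k)
        picked = self.values(top_idx)                    # (B, S, k, D): only k value rows per token
        out = (weights.unsqueeze(-1) * picked).sum(dim=-2)
        return x + out                                   # residual add, like a sparse FFN


if __name__ == "__main__":
    layer = SimpleMemoryLayer(dim=64)
    h = torch.randn(2, 10, 64)
    print(layer(h).shape)  # torch.Size([2, 10, 64])
```

Note that in this naive sketch the savings come only from reading k value rows instead of all of them; scoring the query against every key still costs O(num_keys · dim), which is why the actual method factorizes the keys (product keys) to keep the lookup cheap even for very large tables.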
Alternatives and similar repositories for memory
Users interested in memory are comparing it to the libraries listed below
- Tina: Tiny Reasoning Models via LoRA ☆310 · Updated 2 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆582 · Updated last month
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆349 · Updated 7 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆249 · Updated 10 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆408 · Updated last year
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens. ☆276 · Updated last month
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆332 · Updated this week
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆226 · Updated last month
- Parallel Scaling Law for Language Model – Beyond Parameter and Inference Time Scaling ☆463 · Updated 7 months ago
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention… ☆294 · Updated last year
- ☆205 · Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆202 · Updated last year
- Simple & Scalable Pretraining for Neural Architecture Research ☆305 · Updated 2 weeks ago
- Normalized Transformer (nGPT) ☆193 · Updated last year
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 · Updated 11 months ago
- ☆225 · Updated 3 weeks ago
- A project to improve skills of large language models ☆665 · Updated this week
- Code for the paper: "Learning to Reason without External Rewards" ☆383 · Updated 5 months ago
- An extension of the nanoGPT repository for training small MoE models. ☆218 · Updated 9 months ago
- PyTorch implementation of models from the Zamba2 series. ☆186 · Updated 10 months ago
- PyTorch building blocks for the OLMo ecosystem ☆563 · Updated this week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆343 · Updated last year
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆340 · Updated last month
- Reproducible, flexible LLM evaluations ☆305 · Updated last month
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆233 · Updated 2 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆336 · Updated this week
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆136 · Updated 4 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆855 · Updated 2 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆162 · Updated 8 months ago
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning. ☆149 · Updated 10 months ago