facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense feed-forward layers, providing dedicated capacity to store and retrieve information cheaply.
★366 · Updated last year
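To make the mechanism above concrete, here is a minimal sketch of a sparsely activated key-value memory layer in PyTorch. It is illustrative only, not the repo's implementation: the names (`MemoryLayer`, `num_keys`, `topk`) are hypothetical, and the actual memory layers use product-key factorization so that even the key search stays cheap, whereas this naive version scores every key and is sparse only in which values it retrieves.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    """Toy sparsely activated key-value memory (hypothetical API)."""

    def __init__(self, dim: int, num_keys: int = 4096, topk: int = 4):
        super().__init__()
        # Trainable keys and values hold the extra parameters: capacity
        # grows with num_keys, but only topk values are touched per token.
        self.keys = nn.Parameter(torch.randn(num_keys, dim) * dim ** -0.5)
        self.values = nn.Embedding(num_keys, dim)
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        scores = x @ self.keys.t()               # (batch, seq, num_keys)
        w, idx = scores.topk(self.topk, dim=-1)  # keep only the topk matches
        w = F.softmax(w, dim=-1)                 # normalize over selected keys
        v = self.values(idx)                     # (batch, seq, topk, dim)
        return (w.unsqueeze(-1) * v).sum(dim=-2)

layer = MemoryLayer(dim=64)
out = layer(torch.randn(2, 16, 64))  # -> (2, 16, 64)
```

The point of the top-k selection is that adding more key-value slots grows the parameter count without growing the per-token value computation, which is what lets memory layers add capacity without a proportional FLOPs increase.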
Alternatives and similar repositories for memory
Users interested in memory are comparing it to the repositories listed below.
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ★609 · Updated this week
- Tina: Tiny Reasoning Models via LoRA ★312 · Updated 3 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ★411 · Updated last year
- Parallel Scaling Law for Language Model: Beyond Parameter and Inference Time Scaling ★467 · Updated 7 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ★249 · Updated 11 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ★351 · Updated 8 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ★228 · Updated 2 months ago
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention…" ★294 · Updated last year
- A project to improve skills of large language models ★756 · Updated this week
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ★340 · Updated 3 weeks ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ★174 · Updated 11 months ago
- PyTorch building blocks for the OLMo ecosystem ★681 · Updated this week
- LongRoPE is a method that extends the context window of pre-trained LLMs to 2048k tokens. ★276 · Updated 2 months ago
- ★224 · Updated last month
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in PyTorch ★181 · Updated 6 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ★341 · Updated 2 months ago
- ★204 · Updated last year
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ★137 · Updated 4 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ★857 · Updated last week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ★346 · Updated last year
- An extension of the nanoGPT repository for training small MoE models. ★224 · Updated 10 months ago
- Reproducible, flexible LLM evaluations ★316 · Updated last month
- Exploring Applications of GRPO ★251 · Updated 4 months ago
- Code for NeurIPS'24 paper "Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization" ★234 · Updated 5 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ★201 · Updated last year
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs ★198 · Updated last month
- [NeurIPS 2024] Official Repository of "The Mamba in the Llama: Distilling and Accelerating Hybrid Models" ★233 · Updated 2 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ★575 · Updated 3 months ago
- Code for the paper "Learning to Reason without External Rewards" ★385 · Updated 6 months ago
- Minimal hackable GRPO implementation (a sketch of GRPO's core step follows after this list) ★309 · Updated 11 months ago
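Several of the repos above revolve around GRPO. As a rough reference for what they implement, here is a minimal sketch of GRPO's central idea: advantages are computed from group statistics over sampled completions rather than from a learned value baseline. `grpo_advantages` is a hypothetical helper, not code from any listed repo.

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # rewards: (num_prompts, group_size), one scalar reward per sampled
    # completion. GRPO normalizes each reward within its group, so no
    # separate value network is needed to form a baseline.
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

r = torch.tensor([[1.0, 0.0, 0.5, 0.5]])
print(grpo_advantages(r))  # above-mean completions get positive advantage
```

These advantages then weight per-token log-probability ratios in a clipped policy-gradient loss, which is the part the "minimal hackable" implementations above flesh out.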