facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense feed-forward layers, providing dedicated capacity to store and retrieve information cheaply.
☆358 · Updated 11 months ago
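As context for the description above, here is a minimal PyTorch sketch of a sparsely activated key-value memory layer. It is an illustrative toy under assumed names and sizes, not the repo's actual API: each token scores a bank of trainable keys and reads only the top-k value rows. Note that this naive version still scores every key; the actual repo uses a product-key scheme so the key search is also cheap, which is how the layer adds parameters without adding FLOPs.

```python
# Toy sparse key-value memory layer (illustrative sketch; NOT the
# facebookresearch/memory implementation, which uses product-key
# lookup to avoid scoring every key as done below).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMemoryLayer(nn.Module):
    def __init__(self, dim: int, num_slots: int = 16384, topk: int = 32):
        super().__init__()
        # Large trainable key/value banks: this is where the extra
        # parameters live.
        self.keys = nn.Parameter(torch.randn(num_slots, dim) / dim**0.5)
        self.values = nn.Embedding(num_slots, dim)
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        scores = x @ self.keys.t()                # (batch, seq, num_slots)
        w, idx = scores.topk(self.topk, dim=-1)   # keep only the top-k slots
        w = F.softmax(w, dim=-1)
        v = self.values(idx)                      # sparse read: (b, s, k, dim)
        return (w.unsqueeze(-1) * v).sum(dim=-2)  # weighted mix: (b, s, dim)

layer = ToyMemoryLayer(dim=256)
print(layer(torch.randn(2, 8, 256)).shape)  # torch.Size([2, 8, 256])
```

Because only the selected rows of the value table participate in each step, gradients touch just those rows, which is what makes adding memory capacity cheap relative to widening the dense feed-forward layers.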
Alternatives and similar repositories for memory
Users that are interested in memory are comparing it to the libraries listed below
- Tina: Tiny Reasoning Models via LoRA ☆309 · Updated 2 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆573 · Updated last month
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆347 · Updated 7 months ago
- Parallel Scaling Law for Language Models: Beyond Parameter and Inference Time Scaling ☆456 · Updated 6 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆406 · Updated last year
- ☆224 · Updated last week
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆300 · Updated 3 weeks ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆202 · Updated last year
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens. ☆272 · Updated last month
- ☆202 · Updated 11 months ago
- PyTorch building blocks for the OLMo ecosystem ☆482 · Updated this week
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆249 · Updated 10 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆223 · Updated 3 weeks ago
- An extension of the nanoGPT repository for training small MoE models. ☆215 · Updated 8 months ago
- A project to improve the skills of large language models ☆628 · Updated this week
- Normalized Transformer (nGPT) ☆194 · Updated last year
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆315 · Updated last year
- Official PyTorch implementation for "Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache" ☆133 · Updated 3 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆173 · Updated 10 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally applicable memory systems for transformers. ☆329 · Updated last year
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" ☆294 · Updated last year
- PyTorch implementation of models from the Zamba2 series. ☆186 · Updated 10 months ago
- [NeurIPS 2024] Official repository of "The Mamba in the Llama: Distilling and Accelerating Hybrid Models" ☆231 · Updated last month
- Build your own visual reasoning model ☆415 · Updated last week
- Exploring Applications of GRPO ☆249 · Updated 3 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆850 · Updated last month
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆340 · Updated 3 weeks ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in PyTorch ☆180 · Updated 5 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆562 · Updated last month
- A simplified implementation for experimenting with RLVR on GSM8K; this repository provides a starting point for exploring reasoning. ☆145 · Updated 9 months ago