lucidrains / ring-attention-pytorchLinks

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch

☆532

Alternatives and similar repositories for ring-attention-pytorch

Users that are interested in ring-attention-pytorch are comparing it to the libraries listed below

Sorting:

haoliuhl / ringattention
Large Context Attention
☆719Updated 6 months ago
foundation-model-stack / fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…
☆258Updated last week
pytorch-labs / attention-gym
Helpful tools and examples for working with flex-attention
☆904Updated 2 weeks ago
zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
☆828Updated last week
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆565Updated this week
apple / ml-cross-entropy
☆506Updated this week
pytorch-labs / float8_experimental
This repository contains the experimental PyTorch native float8 training UX
☆224Updated last year
hao-ai-lab / Consistency_LLM
[ICML 2024] CLLMs: Consistency Large Language Models
☆397Updated 8 months ago
fla-org / flame
🔥 A minimal training framework for scaling FLA models
☆209Updated last month
zyushun / Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
☆431Updated 2 months ago
HazyResearch / based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆238Updated last month
shawntan / scattermoe
Triton-based implementation of Sparse Mixture of Experts.
☆230Updated 8 months ago
srush / annotated-mamba
Annotated version of the Mamba paper
☆487Updated last year
NVIDIA-NeMo / RL
Scalable toolkit for efficient model reinforcement
☆558Updated this week
Azure / MS-AMP
Microsoft Automatic Mixed Precision Library
☆616Updated 10 months ago
jzhang38 / LongMamba
Some preliminary explorations of Mamba's context scaling.
☆216Updated last year
lucidrains / speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
☆275Updated 7 months ago
NVIDIA / kvpress
LLM KV cache compression made easy
☆560Updated this week
NVIDIA / ngpt
Normalized Transformer (nGPT)
☆185Updated 8 months ago
mlfoundations / open_lm
A repository for research on medium sized language models.
☆506Updated last month
NVIDIA / Megatron-Energon
Megatron's multi-modal data loader
☆230Updated last week
lucidrains / st-moe-pytorch
Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch
☆352Updated last year
dingo-actual / infini-transformer
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention…
☆290Updated last year
jzhang38 / EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
☆739Updated 10 months ago
fla-org / native-sparse-attention
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
☆731Updated 4 months ago
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆221Updated 2 weeks ago
pytorch / PiPPy
Pipeline Parallelism for PyTorch
☆775Updated 11 months ago
HazyResearch / lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
☆244Updated 6 months ago
lucidrains / nGPT-pytorch
Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI
☆288Updated last month
facebookresearch / spdl
Scalable and Performant Data Loading
☆290Updated last week