vmarinowski / infini-attentionLinks

An unofficial pytorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'

☆53

Alternatives and similar repositories for infini-attention

Users that are interested in infini-attention are comparing it to the libraries listed below

Sorting:

TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78Updated last year
jlamprou / Infini-Attention
Efficient Infinite Context Transformers with Infini-attention Pytorch Implementation + QwenMoE Implementation + Training Script + 1M cont…
☆84Updated last year
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆45Updated last year
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆56Updated this week
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆115Updated 8 months ago
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆162Updated 6 months ago
Zyphra / Zyda_processing
☆39Updated last year
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆99Updated last year
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆98Updated 11 months ago
RWKV / ZeroCoT
https://x.com/BlinkDL_AI/status/1884768989743882276
☆28Updated 5 months ago
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆110Updated 6 months ago
nanowell / Q-Sparse-LLM
My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
☆33Updated last year
kyegomez / MultiQueryAttention
This is a simple torch implementation of the high performance Multi-Query Attention
☆15Updated 2 years ago
joey00072 / ohara
Collection of autoregressive model implementation
☆86Updated 6 months ago
lucidrains / PEER-pytorch
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
☆129Updated last year
LegallyCoder / mamba-hf
Implementation of the Mamba SSM with hf_integration.
☆56Updated last year
wdlctc / mini-s
☆52Updated 11 months ago
GAIR-NLP / Entropy-ABF
Official implementation for 'Extending LLMs’ Context Window with 100 Samples'
☆80Updated last year
frankxwang / dpo-prefix-sharing
DPO, but faster 🚀
☆45Updated 10 months ago
kyegomez / LM-Infinite
Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆39Updated 11 months ago
yynil / RWKVInside
☆38Updated 5 months ago
kjslag / spacebyte
A byte-level decoder architecture that matches the performance of tokenized Transformers.
☆66Updated last year
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60Updated last year
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆127Updated last year
RWKV / RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best…
☆53Updated 7 months ago
lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆179Updated 4 months ago
OpenMOSE / RWKV-Infer
A large-scale RWKV v7(World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states(Pseudo MoE). Easy to deploy…
☆45Updated last week
RobertCsordas / moeut
☆86Updated last year
SalesforceAIResearch / GemFilter
☆85Updated 9 months ago
tanaymeh / mamba-train
A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM
☆59Updated last year