vmarinowski / infini-attention
An unofficial PyTorch implementation of 'Efficient Infinite Context Transformers with Infini-attention'
☆52 · Updated 11 months ago
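For context on what these repositories implement: Infini-attention augments ordinary local softmax attention with a compressive memory that is read via a linear-attention lookup and updated segment by segment, so context length can grow without growing the KV cache. Below is a minimal single-head sketch of that mechanism following the paper's linear-update variant; all names (`infini_attention_segment`, `elu_p1`, `beta`) are illustrative and not taken from any of the listed repositories.

```python
# Minimal sketch of Infini-attention's compressive-memory path (linear update).
import torch
import torch.nn.functional as F

def elu_p1(x):
    # Feature map sigma(x) = ELU(x) + 1; keeps activations positive.
    return F.elu(x) + 1.0

def infini_attention_segment(q, k, v, mem, z, beta):
    """Process one segment.
    q, k, v: (seq, d); mem: (d, d) compressive memory; z: (d,) normalizer;
    beta: scalar gate logit (would be a learnable nn.Parameter in practice).
    """
    d = q.size(-1)
    # 1) Retrieve from the memory built over all past segments.
    sq = elu_p1(q)                                            # (seq, d)
    a_mem = (sq @ mem) / (sq @ z).clamp_min(1e-6).unsqueeze(-1)
    # 2) Standard causal dot-product attention within the segment.
    scores = (q @ k.T) / d ** 0.5
    mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), 1)
    a_dot = torch.softmax(scores.masked_fill(mask, float("-inf")), -1) @ v
    # 3) Learned gate blends long-term (memory) and local context.
    g = torch.sigmoid(beta)
    out = g * a_mem + (1 - g) * a_dot
    # 4) Fold this segment's keys/values into the memory for later segments.
    sk = elu_p1(k)
    mem = mem + sk.T @ v
    z = z + sk.sum(0)
    return out, mem, z

# Usage: initialize mem = torch.zeros(d, d) and z = torch.zeros(d) per head,
# then carry (mem, z) across segments of a long sequence.
```

In the full model this runs per head inside each attention layer, with the gate letting each head learn how much to rely on compressed long-range context versus the local window.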
Alternatives and similar repositories for infini-attention
Users interested in infini-attention are comparing it to the libraries listed below.
- This is the official repository for Inheritune. ☆112 · Updated 5 months ago
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆56 · Updated last week
- A repository for research on medium-sized language models. ☆78 · Updated last year
- Efficient Infinite Context Transformers with Infini-attention PyTorch Implementation + QwenMoE Implementation + Training Script + 1M cont… ☆83 · Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆160 · Updated 3 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆98 · Updated 10 months ago
- GoldFinch and other hybrid transformer components ☆46 · Updated last year
- ☆37 · Updated last year
- Implementation of the Mamba SSM with hf_integration. ☆56 · Updated 11 months ago
- ☆83 · Updated 11 months ago
- A simple torch implementation of high-performance Multi-Query Attention ☆16 · Updated last year
- A byte-level decoder architecture that matches the performance of tokenized Transformers. ☆65 · Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 · Updated 11 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · Updated 3 months ago
- RWKV-7: Surpassing GPT ☆94 · Updated 8 months ago
- PyTorch implementation of "Compressed Context Memory for Online Language Model Interaction" (ICLR'24) ☆61 · Updated last year
- Lightweight toolkit to train and fine-tune 1.58-bit language models ☆82 · Updated 2 months ago
- ☆51 · Updated 9 months ago
- ☆83 · Updated 6 months ago
- Collection of autoregressive model implementations ☆86 · Updated 3 months ago
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed-up with better task performance… ☆151 · Updated 3 months ago
- Code for Zero-Shot Tokenizer Transfer ☆133 · Updated 6 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in PyTorch ☆178 · Updated last month
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆51 · Updated 4 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆103 · Updated 3 months ago
- ☆53 · Updated 8 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆47 · Updated 3 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆56 · Updated last year
- DPO, but faster 🚀 ☆43 · Updated 8 months ago
- The official code repo and data hub of the top_nsigma sampling strategy for LLMs. ☆26 · Updated 5 months ago