🔥 A minimal training framework for scaling FLA models
⭐ 380 · Apr 22, 2026 · Updated last week
Alternatives and similar repositories for flame
Users interested in flame are comparing it to the libraries listed below.
- ⭐ 32 · Dec 31, 2025 · Updated 3 months ago
- 🚀 Efficient implementations for emerging model architectures · ⭐ 4,999 · Updated this week
- ⭐ 135 · Jun 6, 2025 · Updated 10 months ago
- Here we will test various linear attention designs. · ⭐ 62 · Apr 25, 2024 · Updated 2 years ago
- Flash-Muon: An Efficient Implementation of Muon Optimizer · ⭐ 248 · Jun 15, 2025 · Updated 10 months ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… · ⭐ 68 · Apr 24, 2024 · Updated 2 years ago
- 🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" · ⭐ 989 · Feb 5, 2026 · Updated 2 months ago
- Triton implementation of bi-directional (non-causal) linear attention · ⭐ 75 · Mar 1, 2026 · Updated last month
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning · ⭐ 150 · Feb 25, 2026 · Updated 2 months ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights · ⭐ 19 · Oct 9, 2022 · Updated 3 years ago
- ⭐ 70 · Jul 8, 2025 · Updated 9 months ago
- Awesome Triton Resources · ⭐ 40 · Apr 27, 2025 · Updated last year
- Linear Attention Sequence Parallelism (LASP) · ⭐ 88 · Jun 4, 2024 · Updated last year
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on … · ⭐ 16 · Sep 18, 2025 · Updated 7 months ago
- ⭐ 59 · Jul 9, 2024 · Updated last year
- Experiments on the impact of depth in transformers and SSMs. · ⭐ 40 · Oct 23, 2025 · Updated 6 months ago
- ⭐ 12 · Jan 29, 2021 · Updated 5 years ago
- Helpful tools and examples for working with flex-attention · ⭐ 1,179 · Apr 13, 2026 · Updated 2 weeks ago
- HGRN2: Gated Linear RNNs with State Expansion · ⭐ 57 · Aug 20, 2024 · Updated last year
- Official PyTorch Implementation of the Longhorn Deep State Space Model · ⭐ 57 · Dec 4, 2024 · Updated last year
- [ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule · ⭐ 555 · Mar 13, 2026 · Updated last month
- Expanding linear RNN state-transition matrix eigenvalues to include negatives improves state-tracking tasks and language modeling without… · ⭐ 21 · Mar 15, 2025 · Updated last year
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l… · ⭐ 57 · Mar 31, 2026 · Updated last month
- ⭐ 52 · May 19, 2025 · Updated 11 months ago
- A PyTorch native platform for training generative AI models · ⭐ 5,258 · Apr 23, 2026 · Updated last week
- ⭐ 130 · Feb 4, 2026 · Updated 2 months ago
- Flash-Linear-Attention models beyond language · ⭐ 21 · Aug 28, 2025 · Updated 8 months ago
- Ring attention implementation with flash attention · ⭐ 1,014 · Sep 10, 2025 · Updated 7 months ago
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion in our EMNLP 2023 paper - Accelerating Toeplitz… · ⭐ 14 · Oct 17, 2023 · Updated 2 years ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" · ⭐ 255 · Jan 31, 2025 · Updated last year
- ⭐ 45 · Nov 1, 2025 · Updated 5 months ago
- Stick-breaking attention · ⭐ 63 · Jul 1, 2025 · Updated 9 months ago
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral) · ⭐ 36 · Jan 18, 2025 · Updated last year
- FlexAttention w/ FlashAttention3 Support · ⭐ 27 · Oct 5, 2024 · Updated last year
- Fork of the Flame repo for training some new in-development work · ⭐ 19 · Updated this week
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer · ⭐ 64 · Jul 30, 2023 · Updated 2 years ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling · ⭐ 40 · Dec 2, 2023 · Updated 2 years ago
- Muon fsdp 2 · ⭐ 55 · Aug 8, 2025 · Updated 8 months ago
- Official Implementation of ACL 2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span… · ⭐ 14 · Aug 25, 2023 · Updated 2 years ago