Train, tune, and infer Bamba model
☆137 · Jun 4, 2025 · Updated 8 months ago
Alternatives and similar repositories for bamba
Users who are interested in bamba are comparing it to the repositories listed below.
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod… ☆14 · Feb 9, 2026 · Updated 3 weeks ago
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 · Aug 28, 2024 · Updated last year
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆121 · Sep 13, 2024 · Updated last year
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆21 · Feb 23, 2026 · Updated last week
- 🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆13 · Jan 30, 2026 · Updated last month
- ☆32 · May 26, 2024 · Updated last year
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on … ☆14 · Sep 18, 2025 · Updated 5 months ago
- PyTorch implementation of models from the Zamba2 series. ☆187 · Jan 23, 2025 · Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 · Feb 23, 2026 · Updated last week
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆57 · Nov 20, 2024 · Updated last year
- Quantized Attention on GPU ☆44 · Nov 22, 2024 · Updated last year
- ☆30 · Aug 21, 2025 · Updated 6 months ago
- ☆36 · Feb 26, 2024 · Updated 2 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆56 · Dec 4, 2024 · Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆56 · Aug 20, 2024 · Updated last year
- ☆129 · Jun 6, 2025 · Updated 8 months ago
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 · Oct 13, 2025 · Updated 4 months ago
- ☆20 · Dec 24, 2024 · Updated last year
- Code repo for efficient quantized MoE inference with mixture of low-rank compensators ☆31 · Apr 14, 2025 · Updated 10 months ago
- Generative Modeling with Bayesian Sample Inference ☆24 · May 17, 2025 · Updated 9 months ago
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆86 · Feb 18, 2026 · Updated last week
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆68 · Apr 11, 2025 · Updated 10 months ago
- ☆20 · Jan 6, 2023 · Updated 3 years ago
- ☆45 · Nov 10, 2023 · Updated 2 years ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆215 · Jan 30, 2026 · Updated last month
- GRadient-INformed MoE ☆264 · Sep 25, 2024 · Updated last year
- Cray-LM unified training and inference stack. ☆22 · Jan 30, 2025 · Updated last year
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆91 · Jul 17, 2025 · Updated 7 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆102 · Sep 30, 2024 · Updated last year
- ☆27 · Dec 13, 2024 · Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆24 · Jun 6, 2024 · Updated last year
- decontamination ☆26 · Dec 3, 2025 · Updated 2 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆17 · Nov 4, 2025 · Updated 3 months ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism" ☆14 · Apr 30, 2025 · Updated 10 months ago
- A set of Jupyter notebooks and codes used for visualizing and processing micro tomography data. This code serves as supplemental material… ☆10 · Sep 17, 2025 · Updated 5 months ago
- For ACL25 paper "WAFFLE: Multi-Modal Model for Automated Front-End Development" - by Shanchao Liang and Nan Jiang and Shangshu Qian and L… ☆11 · May 28, 2025 · Updated 9 months ago
- Run all the tests at the same time with modal.com ☆11 · Mar 2, 2024 · Updated last year
- FlexiTokens ☆18 · Dec 27, 2025 · Updated 2 months ago