Train, tune, and infer Bamba model
☆138 · Jun 4, 2025 · Updated 10 months ago
Alternatives and similar repositories for bamba
Users that are interested in bamba are comparing it to the libraries listed below.
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆21 · Apr 3, 2026 · Updated last week
- Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆14 · Jan 30, 2026 · Updated 2 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆123 · Sep 13, 2024 · Updated last year
- ☆33 · May 26, 2024 · Updated last year
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod… ☆14 · Mar 30, 2026 · Updated last week
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 · Aug 28, 2024 · Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 · Updated this week
- ☆36 · Feb 26, 2024 · Updated 2 years ago
- PyTorch implementation of models from the Zamba2 series. ☆193 · Jan 23, 2025 · Updated last year
- ☆17 · Dec 19, 2024 · Updated last year
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on … ☆16 · Sep 18, 2025 · Updated 6 months ago
- Code repo for efficient quantized MoE inference with mixture of low-rank compensators ☆36 · Apr 14, 2025 · Updated 11 months ago
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆57 · Dec 4, 2024 · Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆34 · Aug 14, 2024 · Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 · Oct 13, 2025 · Updated 5 months ago
- Quantized Attention on GPU ☆44 · Nov 22, 2024 · Updated last year
- ☆51 · Jan 28, 2024 · Updated 2 years ago
- Triton implementation of bi-directional (non-causal) linear attention ☆74 · Mar 1, 2026 · Updated last month
- HGRN2: Gated Linear RNNs with State Expansion ☆57 · Aug 20, 2024 · Updated last year
- Python tools ☆14 · Oct 22, 2023 · Updated 2 years ago
- Cray-LM unified training and inference stack. ☆22 · Jan 30, 2025 · Updated last year
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆68 · Apr 11, 2025 · Updated last year
- ☆134 · Jun 6, 2025 · Updated 10 months ago
- ☆26 · Dec 13, 2024 · Updated last year
- ANE accelerated embedding models! ☆20 · Dec 11, 2024 · Updated last year
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆16 · Mar 30, 2026 · Updated last week
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆58 · Nov 20, 2024 · Updated last year
- ☆19 · Dec 24, 2024 · Updated last year
- ☆46 · Nov 10, 2023 · Updated 2 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- ☆155 · Mar 4, 2025 · Updated last year
- FlexiTokens ☆19 · Dec 27, 2025 · Updated 3 months ago
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆88 · Mar 27, 2026 · Updated 2 weeks ago
- ☆129 · Feb 4, 2026 · Updated 2 months ago
- ☆211 · Dec 11, 2024 · Updated last year
- A specialized RWKV-7 model for Othello (a.k.a. Reversi) that predicts legal moves, evaluates positions, and performs in-context search. It… ☆44 · Jan 25, 2025 · Updated last year
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆22 · Oct 14, 2025 · Updated 5 months ago
- ☆52 · Feb 5, 2025 · Updated last year
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆32 · Apr 9, 2025 · Updated last year