Train, tune, and infer Bamba model
☆137 · Jun 4, 2025 · Updated 10 months ago
Alternatives and similar repositories for bamba
Users that are interested in bamba are comparing it to the libraries listed below.
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆21 · Apr 3, 2026 · Updated 3 weeks ago
- Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆14 · Jan 30, 2026 · Updated 3 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆123 · Sep 13, 2024 · Updated last year
- ☆33 · May 26, 2024 · Updated last year
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod… ☆14 · Apr 20, 2026 · Updated last week
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 · Aug 28, 2024 · Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆16 · Apr 13, 2026 · Updated 2 weeks ago
- ☆36 · Feb 26, 2024 · Updated 2 years ago
- PyTorch implementation of models from the Zamba2 series. ☆193 · Jan 23, 2025 · Updated last year
- ☆17 · Dec 19, 2024 · Updated last year
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on … ☆16 · Sep 18, 2025 · Updated 7 months ago
- Code repo for efficient quantized MoE inference with mixture of low-rank compensators ☆36 · Apr 14, 2025 · Updated last year
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆57 · Dec 4, 2024 · Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆34 · Aug 14, 2024 · Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 · Oct 13, 2025 · Updated 6 months ago
- Quantized Attention on GPU ☆44 · Nov 22, 2024 · Updated last year
- ☆52 · Jan 28, 2024 · Updated 2 years ago
- Python tools ☆14 · Oct 22, 2023 · Updated 2 years ago
- Triton implement of bi-directional (non-causal) linear attention ☆75 · Mar 1, 2026 · Updated 2 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆57 · Aug 20, 2024 · Updated last year
- Cray-LM unified training and inference stack. ☆22 · Jan 30, 2025 · Updated last year
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆68 · Apr 11, 2025 · Updated last year
- See vLLM official support: https://github.com/vllm-project/vllm-ascend ☆11 · Feb 5, 2025 · Updated last year
- ☆135 · Jun 6, 2025 · Updated 10 months ago
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆18 · Apr 25, 2026 · Updated last week
- ☆26 · Dec 13, 2024 · Updated last year
- ANE accelerated embedding models! ☆20 · Dec 11, 2024 · Updated last year
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆60 · Nov 20, 2024 · Updated last year
- ☆20 · Dec 24, 2024 · Updated last year
- ☆47 · Nov 10, 2023 · Updated 2 years ago
- This is the implementation for paper: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020). ☆14 · May 16, 2021 · Updated 4 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- GRadient-INformed MoE ☆264 · Sep 25, 2024 · Updated last year
- A self-hosted version of WaterCrawl, a powerful web crawling and data extraction platform. ☆13 · Jul 27, 2025 · Updated 9 months ago
- ☆157 · Mar 4, 2025 · Updated last year
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆89 · Mar 27, 2026 · Updated last month
- ☆130 · Feb 4, 2026 · Updated 2 months ago
- ☆212 · Dec 11, 2024 · Updated last year
- ☆52 · Feb 5, 2025 · Updated last year