Train, tune, and infer Bamba model
☆137 · May 15, 2026 · Updated this week
Alternatives and similar repositories for bamba
Users that are interested in bamba are comparing it to the libraries listed below.
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆21 · May 12, 2026 · Updated last week
- Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆14 · Jan 30, 2026 · Updated 3 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆124 · Sep 13, 2024 · Updated last year
- ☆33 · May 26, 2024 · Updated last year
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod… ☆14 · May 15, 2026 · Updated last week
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 · Aug 28, 2024 · Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆16 · May 12, 2026 · Updated last week
- PyTorch implementation of models from the Zamba2 series. ☆192 · Jan 23, 2025 · Updated last year
- ☆17 · Dec 19, 2024 · Updated last year
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on … ☆16 · Sep 18, 2025 · Updated 8 months ago
- Code repo for efficient quantized MoE inference with mixture of low-rank compensators ☆36 · Apr 14, 2025 · Updated last year
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆57 · Dec 4, 2024 · Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆37 · Aug 14, 2024 · Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 · Oct 13, 2025 · Updated 7 months ago
- Quantized Attention on GPU ☆44 · Nov 22, 2024 · Updated last year
- ☆52 · Jan 28, 2024 · Updated 2 years ago
- Python tools ☆14 · Oct 22, 2023 · Updated 2 years ago
- Triton implement of bi-directional (non-causal) linear attention ☆75 · Mar 1, 2026 · Updated 2 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆57 · Aug 20, 2024 · Updated last year
- Cray-LM unified training and inference stack. ☆22 · Jan 30, 2025 · Updated last year
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆69 · Apr 11, 2025 · Updated last year
- ☆137 · Jun 6, 2025 · Updated 11 months ago
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆19 · May 11, 2026 · Updated last week
- ☆26 · Dec 13, 2024 · Updated last year
- ANE accelerated embedding models! ☆19 · Dec 11, 2024 · Updated last year
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆60 · Nov 20, 2024 · Updated last year
- ☆20 · Dec 24, 2024 · Updated last year
- ☆47 · Nov 10, 2023 · Updated 2 years ago
- Google Research ☆47 · Oct 29, 2022 · Updated 3 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- A self-hosted version of WaterCrawl, a powerful web crawling and data extraction platform. ☆13 · Jul 27, 2025 · Updated 9 months ago
- ☆157 · Mar 4, 2025 · Updated last year
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆89 · Mar 27, 2026 · Updated last month
- ☆131 · Feb 4, 2026 · Updated 3 months ago
- ☆213 · Dec 11, 2024 · Updated last year
- A specialized RWKV-7 model for Othello(a.k.a. Reversi) that predicts legal moves, evaluates positions, and performs in-context search. It… ☆44 · Jan 25, 2025 · Updated last year
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆22 · Oct 14, 2025 · Updated 7 months ago
- ☆52 · Feb 5, 2025 · Updated last year
- From Dataset Labeling, Entity Extraction to production Knowledge Graph Deployment: The Power of NLP and LLMs Combined. ☆12 · May 14, 2024 · Updated 2 years ago