foundation-model-stack/bamba

Train, tune, and infer Bamba model

☆138

Alternatives and similar repositories for bamba

Users who are interested in bamba are comparing it to the libraries listed below.

goombalab / phi-mamba
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…
☆126 · Sep 13, 2024 · Updated last year
BBuf / flash-rwkv
☆32 · May 26, 2024 · Updated 2 years ago
foundation-model-stack / fms-acceleration
🚀 Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models.
☆14 · Jan 30, 2026 · Updated 5 months ago
BaichuanSEED / BaichuanSEED.github.io
Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…
☆18 · Aug 28, 2024 · Updated last year
kyegomez / OmniByteFormer
OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi…
☆15 · Updated this week
AlirezaMorsali / MLP-Attention
☆17 · Dec 19, 2024 · Updated last year
IBM / selective-dense-state-space-model
Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on …
☆16 · Sep 18, 2025 · Updated 10 months ago
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆193 · Jan 23, 2025 · Updated last year
Agora-Lab-AI / OmegaViT
OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod…
☆15 · Updated this week
Cranial-XIX / longhorn
Official PyTorch Implementation of the Longhorn Deep State Space Model
☆57 · Dec 4, 2024 · Updated last year
foundation-model-stack / fms-guardrails-orchestrator
🚀 Guardrails orchestration server for application of various detections on text generation input and output.
☆35 · Jul 6, 2026 · Updated 2 weeks ago
The-Swarm-Corporation / OmniParse
Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale …
☆20 · Oct 13, 2025 · Updated 9 months ago
feifeibear / ChituAttention
Quantized Attention on GPU
☆45 · Nov 22, 2024 · Updated last year
RiddleHe / nanochat
The best ChatGPT that $100 can buy.
☆54 · Updated this week
sustcsonglin / mamba-triton
☆52 · Jan 28, 2024 · Updated 2 years ago
stas00 / python-tools
Python tools
☆14 · Oct 22, 2023 · Updated 2 years ago
cray-lm / cray-lm
Cray-LM unified training and inference stack.
☆22 · Jan 30, 2025 · Updated last year
OpenSparseLLMs / Linear-MoE
☆139 · Jun 6, 2025 · Updated last year
automl / unlocking_state_tracking
Expanding linear RNN state-transition matrix eigenvalues to include negatives improves state-tracking tasks and language modeling without…
☆22 · Mar 15, 2025 · Updated last year
efficientscaling / Z1
[EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code"
☆69 · Apr 11, 2025 · Updated last year
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆58 · Aug 20, 2024 · Updated last year
kyegomez / Hedgehog
Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"
☆16 · Mar 11, 2024 · Updated 2 years ago
0xWelt / VibeRL
VibeRL is a Reinforcement Learning framework built essentially through vibe coding with Kimi K2.
☆17 · Updated this week
fla-org / flash-bidirectional-linear-attention
Triton implementation of bi-directional (non-causal) linear attention
☆78 · Mar 1, 2026 · Updated 4 months ago
kyegomez / dev-swarm
A swarm of LLM agents that will help you test, document, and productionize your code!
☆19 · Updated this week
cosdt / vllm-ascend
See vLLM official support: https://github.com/vllm-project/vllm-ascend
☆11 · Feb 5, 2025 · Updated last year
Narsil / hf-chat
☆25 · Dec 13, 2024 · Updated last year
nanowell / Q-Sparse-LLM
My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
☆37 · Aug 14, 2024 · Updated last year
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18 · Mar 15, 2024 · Updated 2 years ago
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆340 · Mar 31, 2026 · Updated 3 months ago
NonvolatileMemory / flash_tree_attn
☆20 · Dec 24, 2024 · Updated last year
ariG23498 / mmdp
☆38 · Jul 8, 2025 · Updated last year
Violet24K / Eywa
Heterogeneous Scientific Foundation Model Collaboration
☆23 · May 1, 2026 · Updated 2 months ago
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
limenghao / AdaTune
This is the implementation for paper: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020).
☆14 · May 16, 2021 · Updated 5 years ago
madsys-dev / deepseekv2-profile
☆156 · Mar 4, 2025 · Updated last year
watercrawl / self-hosted
A self-hosted version of WaterCrawl, a powerful web crawling and data extraction platform.
☆13 · Jul 27, 2025 · Updated 11 months ago
NX-AI / mlstm_kernels
Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.
☆90 · Jul 6, 2026 · Updated 2 weeks ago
NVlabs / hymba
☆214 · Dec 11, 2024 · Updated last year