Train, tune, and infer Bamba model
☆137 Jun 4, 2025 Updated 9 months ago
Alternatives and similar repositories for bamba
Users who are interested in bamba are comparing it to the libraries listed below.
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆21 Updated this week
- Collection of libraries used with fms-hf-tuning to accelerate fine-tuning and training of large models. ☆13 Jan 30, 2026 Updated last month
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆122 Sep 13, 2024 Updated last year
- ☆32 May 26, 2024 Updated last year
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod… ☆14 Updated this week
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 Aug 28, 2024 Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 Updated this week
- PyTorch implementation of models from the Zamba2 series. ☆189 Jan 23, 2025 Updated last year
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on … ☆16 Sep 18, 2025 Updated 6 months ago
- ☆17 Dec 19, 2024 Updated last year
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆57 Dec 4, 2024 Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆34 Aug 14, 2024 Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 Oct 13, 2025 Updated 5 months ago
- Quantized Attention on GPU ☆44 Nov 22, 2024 Updated last year
- ☆51 Jan 28, 2024 Updated 2 years ago
- Triton implement of bi-directional (non-causal) linear attention ☆71 Mar 1, 2026 Updated 3 weeks ago
- HGRN2: Gated Linear RNNs with State Expansion ☆56 Aug 20, 2024 Updated last year
- Cray-LM unified training and inference stack. ☆22 Jan 30, 2025 Updated last year
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆68 Apr 11, 2025 Updated 11 months ago
- ☆133 Jun 6, 2025 Updated 9 months ago
- ☆27 Dec 13, 2024 Updated last year
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆57 Nov 20, 2024 Updated last year
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆16 Feb 16, 2026 Updated last month
- ANE accelerated embedding models! ☆20 Dec 11, 2024 Updated last year
- ☆19 Dec 24, 2024 Updated last year
- ☆46 Nov 10, 2023 Updated 2 years ago
- decontamination ☆27 Mar 4, 2026 Updated 2 weeks ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 Jun 12, 2024 Updated last year
- GRadient-INformed MoE ☆264 Sep 25, 2024 Updated last year
- ☆155 Mar 4, 2025 Updated last year
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆87 Mar 1, 2026 Updated 3 weeks ago
- A self-hosted version of WaterCrawl, a powerful web crawling and data extraction platform. ☆13 Jul 27, 2025 Updated 7 months ago
- ☆125 Feb 4, 2026 Updated last month
- ☆209 Dec 11, 2024 Updated last year
- A specialized RWKV-7 model for Othello(a.k.a. Reversi) that predicts legal moves, evaluates positions, and performs in-context search. It… ☆44 Jan 25, 2025 Updated last year
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆22 Oct 14, 2025 Updated 5 months ago
- ☆52 Feb 5, 2025 Updated last year
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆32 Apr 9, 2025 Updated 11 months ago
- From Dataset Labeling, Entity Extraction to production Knowledge Graph Deployment: The Power of NLP and LLMs Combined. ☆12 May 14, 2024 Updated last year