Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
★943 · Mar 3, 2024 · Updated 2 years ago
Alternatives and similar repositories for mamba-chat
Users that are interested in mamba-chat are comparing it to the libraries listed below.
- Mamba SSM architecture ★18,481 · Jun 15, 2026 · Updated 2 weeks ago
- Simple, minimal implementation of the Mamba SSM in one file of PyTorch. ★2,957 · Mar 8, 2024 · Updated 2 years ago
- Implementation of the Mamba SSM with hf_integration. ★55 · Aug 31, 2024 · Updated last year
- Inference of Mamba, Mamba2 and Mamba3 models in pure C ★202 · Mar 18, 2026 · Updated 3 months ago
- Some preliminary explorations of Mamba's context scaling. ★221 · Feb 8, 2024 · Updated 2 years ago
- A simple and efficient Mamba implementation in pure PyTorch and MLX. ★1,468 · May 3, 2026 · Updated last month
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ★965 · Nov 16, 2025 · Updated 7 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ★8,999 · May 3, 2024 · Updated 2 years ago
- Code repository for Black Mamba ★265 · Feb 8, 2024 · Updated 2 years ago
- Annotated version of the Mamba paper ★501 · Feb 27, 2024 · Updated 2 years ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ★1,410 · Apr 11, 2024 · Updated 2 years ago
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… ★14,572 · Jun 13, 2026 · Updated 2 weeks ago
- A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance Multi-Modal Model. Powered by Zeta, the simplest… ★473 · Jun 22, 2026 · Updated last week
- Tools for merging pretrained large language models. ★7,173 · Jun 17, 2026 · Updated last week
- Repository for StripedHyena, a state-of-the-art beyond Transformer architecture ★434 · Mar 7, 2024 · Updated 2 years ago
- Awesome list of papers that extend Mamba to various applications. ★141 · Jun 4, 2026 · Updated 3 weeks ago
- YaRN: Efficient Context Window Extension of Large Language Models ★1,729 · Apr 17, 2024 · Updated 2 years ago
- Go ahead and axolotl questions ★12,082 · Updated this week
- Implementation of MambaByte in "MambaByte: Token-free Selective State Space Model" in Pytorch and Zeta ★127 · Jun 22, 2026 · Updated last week
- ★68 · Dec 8, 2023 · Updated 2 years ago
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ★7,233 · Jul 11, 2024 · Updated last year
- ★212 · Jun 17, 2026 · Updated last week
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ★225 · Jun 22, 2026 · Updated last week
- Structured state space sequence models ★2,909 · Jul 17, 2024 · Updated last year
- Robust recipes to align language models with human and AI preferences ★5,614 · May 26, 2026 · Updated last month
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ★2,177 · Oct 8, 2024 · Updated last year
- Training LLMs with QLoRA + FSDP ★1,549 · Nov 9, 2024 · Updated last year
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ★219 · Jun 22, 2026 · Updated last week
- PyTorch native post-training library ★5,777 · Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ★13,430 · Jun 18, 2026 · Updated last week
- The official implementation of Self-Play Fine-Tuning (SPIN) ★1,245 · May 8, 2024 · Updated 2 years ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ★62 · Apr 8, 2024 · Updated 2 years ago
- an implementation of Self-Extend, to expand the context window via grouped attention ★119 · Jan 7, 2024 · Updated 2 years ago
- Using multiple LLMs for ensemble Forecasting ★16 · Jan 17, 2024 · Updated 2 years ago
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ★1,860 · Jul 10, 2024 · Updated last year
- Reading list for research topics in state-space models ★365 · May 18, 2026 · Updated last month
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ★1,692 · Mar 8, 2024 · Updated 2 years ago
- ★36 · Nov 22, 2024 · Updated last year
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath ★9,485 · Jun 7, 2025 · Updated last year