flawedmatrix / mamba-ssm

Implementation of Mamba in Rust

☆88

Alternatives and similar repositories for mamba-ssm

Users who are interested in mamba-ssm are comparing it to the libraries listed below.

kroggen / mamba.c
Inference of Mamba models in pure C
☆189 Updated last year
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41 Updated last year
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆94 Updated 8 months ago
rafacelente / bllama
1.58-bit LLaMa model
☆81 Updated last year
QuixiAI / grokadamw
☆134 Updated 11 months ago
tanaymeh / mamba-train
A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM
☆56 Updated last year
VatsaDev / NanoPhi-alpha
GPT-2 small trained on phi-like data
☆67 Updated last year
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆184 Updated 6 months ago
astramind-ai / BitMat
An efficient implementation of the method proposed in "The Era of 1-bit LLMs"
☆154 Updated 9 months ago
kyegomez / MambaByte
Implementation of MambaByte from "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta
☆120 Updated 2 weeks ago
Zyphra / BlackMamba
Code repository for Black Mamba
☆252 Updated last year
LegallyCoder / mamba-hf
Implementation of the Mamba SSM with hf_integration.
☆56 Updated 11 months ago
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31 Updated last year
bloc97 / DeMo
DeMo: Decoupled Momentum Optimization
☆190 Updated 8 months ago
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆198 Updated last year
kotak-ai / 1.58BitNet
Experimental BitNet Implementation
☆69 Updated last month
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆94 Updated last year
lucidrains / grokfast-pytorch
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"
☆101 Updated 7 months ago
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆128 Updated 9 months ago
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆175 Updated last year
serp-ai / unsloth
5X faster 60% less memory QLoRA finetuning
☆21 Updated last year
AlpinDale / QuIP-for-Llama
Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models
☆38 Updated 2 years ago
keeeeenw / MicroLlama
Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget
☆153 Updated 2 weeks ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆82 Updated 2 months ago
BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆191 Updated last year
euclaise / supertrainer2000
☆49 Updated last year
lucidrains / HRM
Exploration into the proposed architecture from Sapient Intelligence of Singapore 🇸🇬
☆38 Updated this week
QuixiAI / kraken
☆66 Updated last year
joey00072 / ohara
Collection of autoregressive model implementations
☆86 Updated 3 months ago
HazyResearch / lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
☆244 Updated 6 months ago