geronimi73 / mambaLinks

☆31

Alternatives and similar repositories for mamba

Users that are interested in mamba are comparing it to the libraries listed below

Sorting:

LegallyCoder / mamba-hf
Implementation of the Mamba SSM with hf_integration.
☆56Updated 10 months ago
Oxen-AI / mamba-dive
This is the code that went into our practical dive using mamba as information extraction
☆53Updated last year
kyegomez / Kosmos-X
The Next Generation Multi-Modality Superintelligence
☆70Updated 10 months ago
CERC-AAI / Robin
☆63Updated 10 months ago
joey00072 / ohara
Collection of autoregressive model implementation
☆86Updated 3 months ago
devvrit / matformer
MatFormer repo
☆53Updated 7 months ago
tanaymeh / mamba-train
A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM
☆55Updated last year
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆112Updated 5 months ago
geronimi73 / phi2-finetune
☆87Updated last year
Zyphra / BlackMamba
Code repository for Black Mamba
☆250Updated last year
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆81Updated 2 months ago
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78Updated last year
samchaineau / llm_slerp_generation
Repo hosting codes and materials related to speeding LLMs' inference using token merging.
☆36Updated last year
lucidrains / llama-qrlhf
Implementation of the Llama architecture with RLHF + Q-learning
☆165Updated 5 months ago
Pleias / Various-Finetuning
Set of scripts to finetune LLMs
☆37Updated last year
HanGuo97 / lq-lora
☆127Updated last year
RobertCsordas / moe_attention
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
☆98Updated 9 months ago
kyegomez / MambaByte
Implementation of MambaByte in "MambaByte: Token-free Selective State Space Model" in Pytorch and Zeta
☆120Updated 3 months ago
Digitous / LLM-SLERP-Merge
Spherical Merge Pytorch/HF format Language Models with minimal feature loss.
☆133Updated last year
Moocember / Optimization-by-PROmpting
☆78Updated last year
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆46Updated last year
kyegomez / OpenStrawberry
An open source replication of the stawberry method that leverages Monte Carlo Search with PPO and or DPO
☆30Updated last week
arcee-ai / DAM
☆53Updated 8 months ago
kyegomez / Paper-Implementation-Template
A simple reproducible template to implement AI research papers
☆23Updated 10 months ago
IBM / ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp…
☆223Updated last year
schwartz-lab-NLP / TOVA
Token Omission Via Attention
☆128Updated 9 months ago
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆159Updated 3 months ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆63Updated 2 years ago
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆130Updated last month
jlamprou / Infini-Attention
Efficient Infinite Context Transformers with Infini-attention Pytorch Implementation + QwenMoE Implementation + Training Script + 1M cont…
☆83Updated last year