kyegomez / MultiModalMamba

A novel implementation that fuses ViT with Mamba into a fast, agile, and high-performance Multi-Modal Model. Powered by Zeta, the simplest AI framework ever.

☆460

Alternatives and similar repositories for MultiModalMamba

Users who are interested in MultiModalMamba compare it to the repositories listed below.

Zyphra / BlackMamba
Code repository for Black Mamba
☆259 Updated last year
kyegomez / zeta
Build high-performance AI models with modular building blocks
☆567 Updated 2 weeks ago
SkunkworksAI / BakLLaVA
☆713 Updated last year
SkalskiP / awesome-foundation-and-multimodal-models
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
☆637 Updated last year
sshh12 / multi_token
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
☆186 Updated last year
ContextualAI / lens
This is the official repository for the LENS (Large Language Models Enhanced to See) system.
☆355 Updated 4 months ago
redotvideo / mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
☆935 Updated last year
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆927 Updated this week
dingo-actual / infini-transformer
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention…
☆292 Updated last year
HyperGAI / HPT
HPT - Open Multimodal LLMs from HyperGAI
☆315 Updated last year
sumo43 / loopvlm
run paligemma in real time
☆133 Updated last year
penghao-wu / vstar
PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"
☆681 Updated last year
HazyResearch / m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
☆561 Updated 10 months ago
mistralai-sf24 / hackathon
☆446 Updated last year
kyegomez / Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
☆195 Updated 3 weeks ago
LLaVA-VL / LLaVA-Plus-Codebase
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
☆760 Updated last year
NousResearch / Obsidian
Maybe the new state of the art vision model? we'll see 🤷‍♂️
☆165 Updated last year
catid / dora
Implementation of DoRA
☆307 Updated last year
AviSoori1x / seemore
From scratch implementation of a vision language model in pure PyTorch
☆250 Updated last year
LLaVA-VL / LLaVA-Interactive-Demo
LLaVA-Interactive-Demo
☆378 Updated last year
groundlight / r1_vlm
Build your own visual reasoning model
☆414 Updated last month
gaasher / I-JEPA
Implementation of I-JEPA from "Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture"
☆276 Updated 10 months ago
bfshi / TOAST
Official code for "TOAST: Transfer Learning via Attention Steering"
☆186 Updated 2 years ago
kyegomez / Python-Package-Template
An easy, reliable, fluid template for python packages complete with docs, testing suites, readme's, github workflows, linting and much muc…
☆194 Updated last month
adithya-s-k / YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…
☆84 Updated last year
apple / ml-veclip
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
☆246 Updated 10 months ago
kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆210 Updated last month
apple / ml-aim
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
☆1,383 Updated 3 months ago
SkalskiP / SoM
Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️
☆87 Updated 2 years ago
lamini-ai / Lamini-Memory-Tuning
Banishing LLM Hallucinations Requires Rethinking Generalization
☆275 Updated last year