DLYuanGod / TinyGPT-VLinks

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

☆1,297

Alternatives and similar repositories for TinyGPT-V

Users that are interested in TinyGPT-V are comparing it to the libraries listed below

Sorting:

PKU-YuanGroup / Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
☆3,330Updated 8 months ago
PKU-YuanGroup / MoE-LLaVA
【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models
☆2,205Updated 3 weeks ago
mlfoundations / MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
☆819Updated last year
myshell-ai / JetMoE
Reaching LLaMA2 Performance with 0.1M Dollars
☆989Updated last year
baaivision / Emu
Emu Series: Generative Multimodal Models from BAAI
☆1,741Updated 10 months ago
LLaVA-VL / LLaVA-Plus-Codebase
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
☆755Updated last year
Alpha-VLLM / LLaMA2-Accessory
An Open-source Toolkit for LLM Development
☆2,787Updated 6 months ago
lucidrains / self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
☆1,394Updated last year
facebookresearch / MetaCLIP
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert…
☆1,624Updated this week
BAAI-DCAI / Bunny
A family of lightweight multimodal models.
☆1,024Updated 8 months ago
eric-ai-lab / MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
☆861Updated 3 months ago
mbzuai-oryx / MobiLlama
[ICLR-2025-SLLM Spotlight 🔥]MobiLlama : Small Language Model tailored for edge devices
☆653Updated 2 months ago
Meituan-AutoML / MobileVLM
Strong and Open Vision Language Assistant for Mobile Devices
☆1,250Updated last year
SkunkworksAI / BakLLaVA
☆712Updated last year
penghao-wu / vstar
PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"
☆656Updated last year
mbzuai-oryx / LLaVA-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
☆839Updated this week
LLaVA-VL / LLaVA-Interactive-Demo
LLaVA-Interactive-Demo
☆376Updated last year
facebookresearch / MobileLLM
MobileLLM Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
☆1,315Updated 3 months ago
apple / ml-aim
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
☆1,341Updated 3 months ago
dvlab-research / MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
☆3,301Updated last year
XueFuzhao / OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
☆1,571Updated last year
uclaml / SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
☆1,185Updated last year
redotvideo / mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
☆928Updated last year
lichao-sun / Mora
Mora: More like Sora for Generalist Video Generation
☆1,565Updated 9 months ago
AviSoori1x / makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
☆732Updated 9 months ago
allenai / mmc4
MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
☆936Updated 4 months ago
princeton-nlp / MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
☆1,119Updated last year
facebookresearch / chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
☆2,040Updated last year
MDK8888 / GPTFast
Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch.
☆685Updated 11 months ago
XuezheMax / megalodon
Reference implementation of Megalodon 7B model
☆524Updated 2 months ago