shufangxun/LLaVA-MoD

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/shufangxun/LLaVA-MoD)

shufangxun / LLaVA-MoD

[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation

☆227

Alternatives and similar repositories for LLaVA-MoD

Users that are interested in LLaVA-MoD are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Fantasyele / LLaVA-KD
View on GitHub
[ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models
☆134Oct 14, 2025Updated 9 months ago
fqhank / CVPR2025_Align-KD
View on GitHub
☆39Jun 2, 2026Updated last month
ZhangAIPI / YOPO_MLLM_Pruning
View on GitHub
Pruning the VLLMs
☆106Dec 9, 2024Updated last year
tiremoscode / dw-grupo58
View on GitHub
☆20Nov 28, 2024Updated last year
buyi-Yang / getQzonehistory
View on GitHub
☆12Nov 13, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
abduvalimurodullayev1 / boilerplate_Drf
View on GitHub
This is the boilerplate for django project. There are so many settings configurations
☆10Nov 7, 2025Updated 8 months ago
dongyh20 / Insight-V
View on GitHub
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
☆240Nov 7, 2025Updated 8 months ago
KarineAyrs / knowledge-distillation-semantic-search
View on GitHub
KDSS is the framework for knowledge distillation from LLMs
☆12Nov 5, 2025Updated 8 months ago
TinyLLaVA / TinyLLaVA_Factory
View on GitHub
A Framework of Small-scale Large Multimodal Models
☆992Updated this week
asudahkzj / Wnet
View on GitHub
Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks
☆24Sep 6, 2022Updated 3 years ago
swordlidev / Efficient-Multimodal-LLMs-Survey
View on GitHub
Efficient Multimodal Large Language Models: A Survey
☆386Apr 29, 2025Updated last year
PKU-YuanGroup / MoE-LLaVA
View on GitHub
【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models
☆2,322Jul 15, 2025Updated last year
ludc506 / InternVL-X
View on GitHub
☆16Mar 26, 2025Updated last year
daixiangzi / Awesome-Token-Compress
View on GitHub
A paper list of some recent works about Token Compress for Vit and VLM
☆939Updated this week
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
wzhuang-xmu / LoSA
View on GitHub
[ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".
☆25Mar 16, 2025Updated last year
TobyYang7 / Llava_Qwen2
View on GitHub
Visual Instruction Tuning for Qwen2 Base Model
☆43Jun 29, 2024Updated 2 years ago
Cooperx521 / PyramidDrop
View on GitHub
(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
☆151Mar 6, 2025Updated last year
shoaibahmed / llm_depth_pruning
View on GitHub
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15Jul 24, 2024Updated last year
thunlp / LLaVA-UHD
View on GitHub
LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs
☆423Jul 6, 2026Updated 2 weeks ago
TideDra / lmm-r1
View on GitHub
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
☆847May 14, 2025Updated last year
AlignGPT-VL / AlignGPT
View on GitHub
Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"
☆34Jul 12, 2024Updated 2 years ago
maifoundations / GCoT
View on GitHub
Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation
☆15Aug 11, 2025Updated 11 months ago
LLaVA-VL / LLaVA-NeXT
View on GitHub
☆4,710Jun 15, 2026Updated last month
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
spyflying / CMPC-Refseg
View on GitHub
Code for Referring Image Segmentation via Cross-Modal Progressive Comprehension, CVPR2020.
☆64Feb 2, 2021Updated 5 years ago
MILVLG / imp
View on GitHub
a family of highly capabale yet efficient large multimodal models
☆194Aug 23, 2024Updated last year
EvolvingLMMs-Lab / LLaVA-OneVision-1.5-RL
View on GitHub
Fully Open Framework for Democratized Multimodal Reinforcement Learning.
☆51Dec 19, 2025Updated 7 months ago
cambrian-mllm / cambrian
View on GitHub
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
☆2,008Nov 7, 2025Updated 8 months ago
percent4 / yi_vl_experiment
View on GitHub
本项目是关于Yi的多模态系列模型，如Yi-VL-6B/34B等的实验与应用。
☆14Jan 25, 2024Updated 2 years ago
zjunlp / Deco
View on GitHub
[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
☆146Sep 11, 2025Updated 10 months ago
FreedomIntelligence / ALLaVA
View on GitHub
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
☆281Jun 25, 2024Updated 2 years ago
westlake-baichuan-mllm / bc-omni
View on GitHub
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
☆273Jan 27, 2025Updated last year
aladinD / SafeMERGE
View on GitHub
Code for SafeMERGE (ICLR 2025).
☆15Apr 1, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
RLHF-V / RLAIF-V
View on GitHub
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
☆456May 14, 2025Updated last year
Sreyan88 / VDGD
View on GitHub
Code for ICLR 2025 Paper: Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
☆25May 7, 2025Updated last year
zhoujiahuan1991 / ICML2025-TCPA
View on GitHub
☆23May 8, 2025Updated last year
JIA-Lab-research / VisionZip
View on GitHub
Official repository for VisionZip (CVPR 2025)
☆443Jul 21, 2025Updated last year
CircleRadon / TokenPacker
View on GitHub
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025
☆279May 26, 2025Updated last year
yuyq96 / TextHawk
View on GitHub
Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
☆68Nov 1, 2024Updated last year
WisconsinAIVision / ViP-LLaVA
View on GitHub
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
☆338Jul 17, 2024Updated 2 years ago