Fantasyele/LLaVA-KD

[ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

☆134

Alternatives and similar repositories for LLaVA-KD

Users that are interested in LLaVA-KD are comparing it to the libraries listed below.

zhangzjn / EMOv2
[T-PAMI 2025] EMOv2: Pushing 5M Vision Model Frontier
☆54 · Dec 30, 2024 · Updated last year
lchen1019 / Align-TI
[ICML 2026] Beyond Next-Token Alignment: Distilling Multimodal Large Language Models via Token Interactions
☆25 · Feb 11, 2026 · Updated 5 months ago
shufangxun / LLaVA-MoD
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
☆227 · Mar 31, 2025 · Updated last year
fqhank / CVPR2025_Align-KD
☆39 · Jun 2, 2026 · Updated last month
sjtuplayer / SaRA
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
☆122 · Oct 18, 2024 · Updated last year
ZichenWen1 / EPIC
(NeurIPS 2025 🔥) Official implementation for "Efficient Multi-modal Large Language Models via Progressive Consistency Distillation"
☆49 · Feb 11, 2026 · Updated 5 months ago
zhuyjan / WikiSeeker
[ACL 2026] WikiSeeker: Rethinking the Role of Vision-Language Models in Knowledge-Based Visual Question Answering.
☆15 · Updated this week
xzc-zju / UltraVideo
[NeurIPS 2025] UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions
☆93 · Jul 14, 2025 · Updated last year
yang3121099 / LLM-Neo
The code for paper "LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models"
☆15 · Mar 2, 2025 · Updated last year
jsjangAI / VL2Lite
This repository contains the **official implementation** of the paper: "VL2Lite: Task-Specific Knowledge Distillation from Large Vision-…
☆20 · Mar 23, 2025 · Updated last year
hithqd / DynamicControl
☆41 · Jan 10, 2025 · Updated last year
winggan / adeval
Evaluation Tool for Anomaly Detection Research
☆17 · May 9, 2024 · Updated 2 years ago
zhangzjn / Soul
[CVPR 2026] Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation
☆64 · Dec 16, 2025 · Updated 7 months ago
jongwooko / distillm-2
Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)
☆71 · Jun 27, 2025 · Updated last year
winycg / CLIP-KD
[CVPR-2024] Official implementations of CLIP-KD: An Empirical Study of CLIP Model Distillation
☆150 · Aug 22, 2025 · Updated 11 months ago
lewandofskee / MobileMamba
[CVPR25] Official implementation of "MobileMamba: Lightweight Multi-Receptive Visual Mamba Network."
☆368 · Mar 20, 2025 · Updated last year
HaojunChen663 / PixVerve-95K
Official repository for the paper "PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset"
☆31 · Jul 10, 2026 · Updated 2 weeks ago
MikeWangWZHL / dymu
☆29 · May 13, 2025 · Updated last year
zhangzjn / T3-Video
[ICML 2026] Transform Trained Transformer for Accelerating Native 4K Video Generation
☆41 · Dec 16, 2025 · Updated 7 months ago
MILVLG / twigvlm
Implementation of ICCV 2025 paper "Growing a Twig to Accelerate Large Vision-Language Models".
☆30 · May 23, 2026 · Updated 2 months ago
hey-cjj / MoVE-KD
[CVPR 2025] Official implementation of paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders".
☆53 · Jun 7, 2025 · Updated last year
Vision-CAIR / Infinibench
Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows
☆20 · Nov 4, 2025 · Updated 8 months ago
WePOINTS / WePOINTS
☆190 · Mar 13, 2026 · Updated 4 months ago
mrwu-mac / ControlMLLM
[NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"
☆211 · Jul 17, 2025 · Updated last year
AIGC-Explorer / TIMotion
☆50 · Jan 15, 2026 · Updated 6 months ago
songmzhang / DSKD
Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…
☆63 · Mar 21, 2026 · Updated 4 months ago
lizhou-cs / mglmm
☆32 · Jun 14, 2026 · Updated last month
XenoZLH / Shuffle-R1
Official code repository of Shuffle-R1
☆26 · Feb 23, 2026 · Updated 5 months ago
alexlioralexli / attention-transfer
☆23 · Nov 19, 2024 · Updated last year
zhuyjan / MER2025-MRAC25
[ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment.
☆25 · Nov 25, 2025 · Updated 8 months ago
lose4578 / CircleRoPE
☆15 · Sep 1, 2025 · Updated 10 months ago
YuHengsss / SD-RPN
[ICLR2026] Catching the Details: Self-Distilled RoI Predictors for Fine-Grained MLLM Perception
☆17 · Jan 26, 2026 · Updated 6 months ago
zju-SWJ / RLD
Official implementation for "Knowledge Distillation with Refined Logits".
☆23 · Aug 26, 2024 · Updated last year
zamling / PSALM
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
☆269 · Dec 30, 2024 · Updated last year
deepglint / RWKV-CLIP
[EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner
☆151 · Dec 14, 2025 · Updated 7 months ago
IDEA-Research / ChatRex
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
☆216 · Oct 15, 2025 · Updated 9 months ago
TinyLLaVA / TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
☆995 · Updated this week
Cooperx521 / PyramidDrop
(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
☆151 · Mar 6, 2025 · Updated last year
MCG-NJU / Sora2-mini
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions
☆57 · Dec 16, 2025 · Updated 7 months ago