TobyYang7/Llava_Qwen2

Visual Instruction Tuning for Qwen2 Base Model

☆43

Alternatives and similar repositories for Llava_Qwen2

Users who are interested in Llava_Qwen2 are comparing it to the repositories listed below.

TobyYang7 / cuhksz_report_template
CUHKSZ LaTeX Report Template
☆18 · Jan 4, 2025 · Updated last year
Niujunbo2002 / NativeRes-LLaVA
Official code repo for our work "Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models"
☆55 · Jun 17, 2025 · Updated last year
DoubtedSteam / DyVTE
The official implementation of "Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings"
☆18 · Dec 5, 2024 · Updated last year
zhangguanghao523 / CMMCoT
[AAAI'26] Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augm…
☆11 · Dec 5, 2025 · Updated 7 months ago
hasanar1f / HiRED
[AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…
☆58 · Apr 18, 2025 · Updated last year
dmis-lab / ETHIC
[NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
☆16 · Sep 2, 2025 · Updated 10 months ago
Z1zs / MMNeuron
Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our co…
☆26 · Dec 20, 2024 · Updated last year
HKUST-LongGroup / DyME
[ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆18 · Mar 18, 2026 · Updated 4 months ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆88 · Nov 10, 2024 · Updated last year
xiaomi-research / timeviper
[CVPR'26] TimeViper: A Hybrid Mamba-Transformer Vision-Language Model for Efficient Long Video Understanding
☆25 · Jan 4, 2026 · Updated 6 months ago
LALBJ / PAI
[ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
☆171 · Nov 6, 2024 · Updated last year
aimagelab / LLaVA-MORE
[ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning
☆160 · Aug 8, 2025 · Updated 11 months ago
XMUDeepLIT / AVG-LLaVA
Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity"
☆33 · Oct 12, 2024 · Updated last year
saccharomycetes / mllms_know
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
☆381 · Apr 20, 2025 · Updated last year
HJYao00 / DenseConnector
【NeurIPS 2024】Dense Connector for MLLMs
☆183 · Oct 14, 2024 · Updated last year
Wang-Xiaodong1899 / Awesome-Multimodal-Large-Language-Models
🔥Awesome Multimodal Large Language Models Paper List
☆154 · Mar 12, 2025 · Updated last year
Liuziyu77 / MIA-DPO
Official implementation of MIA-DPO
☆69 · Jan 23, 2025 · Updated last year
TinyLLaVA / TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
☆995 · Updated this week
djm209 / HSTGODE
HSTGODE code
☆11 · Nov 26, 2023 · Updated 2 years ago
jindongwang / BUAA-Recommend-Graduate-Test
Beihang University (BUAA) 2013 computer science summer camp programming test: 2 problems written in C, 2013
☆10 · Jul 21, 2015 · Updated 11 years ago
1zhou-Wang / MemVR
[ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in…
☆171 · Sep 25, 2025 · Updated 10 months ago
zsgvivo / VideoZoomer
☆34 · Feb 12, 2026 · Updated 5 months ago
IemProg / MiMi
🔥 🔥 [WACV2024] Mini but Mighty: Finetuning ViTs with Mini Adapters
☆20 · Jul 5, 2024 · Updated 2 years ago
shufangxun / LLaVA-MoD
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
☆227 · Mar 31, 2025 · Updated last year
yuecao0119 / MMFuser
The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …
☆63 · Nov 5, 2024 · Updated last year
Yaxin9Luo / Gamma-MOD
[ICLR 2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models
☆45 · Oct 28, 2025 · Updated 9 months ago
xuyang-liu16 / GlobalCom2
[AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
☆42 · Jan 27, 2026 · Updated 6 months ago
ywh187 / FitPrune
☆68 · Jan 23, 2026 · Updated 6 months ago
opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆104 · Jan 30, 2024 · Updated 2 years ago
fusiming3 / MARS
Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
☆86 · Jul 16, 2024 · Updated 2 years ago
chencn2020 / TeacherIQA
☆14 · Aug 12, 2025 · Updated 11 months ago
senorfy / Kinect
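Uses Kinect 2.0 to read depth and other image information and segment out the hand region. HOG is used to extract features from the hand image, and an SVM is then trained on them. The goal is gesture recognition.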
☆10 · Jan 8, 2020 · Updated 6 years ago
MCG-NJU / p-MoD
[ICCV 2025] p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay
☆44 · Jun 26, 2025 · Updated last year
xuyang-liu16 / V2Drop
[CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models
☆34 · May 27, 2026 · Updated 2 months ago
nianfd / RWKV-VG
☆10 · Dec 3, 2024 · Updated last year
yuanc3 / DATE
Use 2 lines to empower absolute time awareness for Qwen2.5VL's MRoPE
☆29 · Sep 20, 2025 · Updated 10 months ago
HKUST-LongGroup / Diff-II
[CVPR 2025] PyTorch implementation of Diff-II
☆29 · Feb 27, 2025 · Updated last year
ali-vilab / matrix
☆34 · Apr 8, 2025 · Updated last year
shannany0606 / CCMP
Learning Cross-View Object Correspondence via Cycle-Consistent Mask Prediction (CVPR 2026)
☆15 · Feb 27, 2026 · Updated 5 months ago