Visual Instruction Tuning for Qwen2 Base Model
☆41Jun 29, 2024Updated last year
Alternatives and similar repositories for Llava_Qwen2
Users who are interested in Llava_Qwen2 are comparing it to the libraries listed below
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…☆44Apr 18, 2025Updated 10 months ago
- The official implementation of "Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings"☆18Dec 5, 2024Updated last year
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆47Nov 10, 2024Updated last year
- ☆11Updated this week
- Official implementation of "Efficient 3D Recognition with Event-driven Spike Sparse Convolution" (AAAI2025)☆27Jul 7, 2025Updated 8 months ago
- Train a DeepSeek R1-like reasoning LLM with ease☆18Feb 15, 2025Updated last year
- Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our co…☆25Dec 20, 2024Updated last year
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'☆349Apr 20, 2025Updated 10 months ago
- The first decoder-only multimodal state space model☆100May 19, 2025Updated 9 months ago
- ☆24Dec 26, 2024Updated last year
- ☆29Aug 25, 2024Updated last year
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"☆66Dec 8, 2025Updated 3 months ago
- The llava-Qwen2-7B-Instruct-Chinese-CLIP model improves recognition of Chinese text and of the implied meaning of memes, approaching the recognition level of gpt4o and claude-3.5-sonnet!☆27Jul 23, 2024Updated last year
- ☆29Feb 27, 2025Updated last year
- [ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in…☆172Sep 25, 2025Updated 5 months ago
- [ICCV 2025] ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models☆49Jul 7, 2025Updated 8 months ago
- A simple WeChat Official Account layout tool based on Dify☆17Jun 27, 2025Updated 8 months ago
- [ICRA 2024] WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection☆12Feb 6, 2024Updated 2 years ago
- A complete introduction to building generative AI apps with Dify☆17May 25, 2025Updated 9 months ago
- ☆26Feb 28, 2026Updated last week
- [ICLR 2025] Official PyTorch Implementation for CPE: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Ga…☆12Apr 7, 2025Updated 11 months ago
- Train a LLaVA model with better Chinese support, and open-source the training code and data.☆79Sep 6, 2024Updated last year
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg…☆40Mar 2, 2026Updated last week
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆38Jan 27, 2026Updated last month
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity"☆33Oct 12, 2024Updated last year
- InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models☆91Feb 2, 2026Updated last month
- [ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning☆160Aug 8, 2025Updated 7 months ago
- Write the database metadata into the dify knowledge☆12Dec 30, 2025Updated 2 months ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte…☆28Feb 13, 2026Updated 3 weeks ago
- Commemorating a national first prize in Python Group A at the 15th Lanqiao Cup National Finals, 2024 (problems + exam-room code)☆13Jan 10, 2025Updated last year
- ☆11Aug 29, 2025Updated 6 months ago
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆29Jan 13, 2026Updated last month
- Implementation of the CVPR2025 paper LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty.☆17Sep 10, 2025Updated 6 months ago
- Workflow automation, but you just describe what you want and it happens.☆27Nov 22, 2025Updated 3 months ago
- ☆28Dec 4, 2025Updated 3 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆86Nov 10, 2024Updated last year
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs☆163Nov 6, 2024Updated last year
- A Framework of Small-scale Large Multimodal Models☆963Feb 7, 2026Updated last month
- [ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos☆119Dec 12, 2025Updated 2 months ago