Victorwz / LLaVA-Llama-3
Reproduction of LLaVA-v1.5 based on Llama-3-8b LLM backbone.
☆65Oct 25, 2024Updated last year
Alternatives and similar repositories for LLaVA-Llama-3
Users who are interested in LLaVA-Llama-3 are comparing it to the repositories listed below.
- [ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning☆159Aug 8, 2025Updated 6 months ago
- [ECCV’24] Official repository for "BEAF: Observing Before-AFter Changes to Evaluate Hallucination in Vision-language Models"☆21Mar 26, 2025Updated 10 months ago
- ☆12Dec 20, 2024Updated last year
- A multimodal large model for X-ray recognition, fine-tuned from LLaVA 1.6☆10Oct 22, 2024Updated last year
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)☆849Aug 5, 2025Updated 6 months ago
- [CVPR 2024] DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model☆19Apr 16, 2024Updated last year
- Official Repo for FoodieQA paper (EMNLP 2024)☆19Jun 26, 2025Updated 7 months ago
- [ICLR2025] LLaVA-HR: High-Resolution Large Language-Vision Assistant☆246Aug 14, 2024Updated last year
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler];☆40Jan 4, 2024Updated 2 years ago
- The official implementation of Freeze-Omni.☆15Jul 10, 2025Updated 7 months ago
- A "universe roaming guide" chatbot built with PaddlePaddle and the wechaty framework☆17Aug 3, 2021Updated 4 years ago
- ☆16Oct 21, 2024Updated last year
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…☆44Apr 18, 2025Updated 9 months ago
- ☆15Oct 27, 2023Updated 2 years ago
- PyTorch reimplementation of "LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators" published in ICLR 2019☆18Sep 13, 2021Updated 4 years ago
- code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning☆20Jul 16, 2024Updated last year
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context☆173Sep 25, 2024Updated last year
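- [WIP@Oct 13] Zhiheng benchmark (Q-Bench in Chinese), including Chinese versions of the low-level visual question answering and low-level visual description datasets, plus image quality assessment under Chinese prompts. We will release Q-Bench in more languages in the futu…☆24Jan 7, 2024Updated 2 years ago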
- The efficient tuning method for VLMs☆80Mar 10, 2024Updated last year
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆20May 27, 2025Updated 8 months ago
- ☆20Apr 8, 2025Updated 10 months ago
- ☆21Aug 27, 2025Updated 5 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆138May 8, 2025Updated 9 months ago
- Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Track: third-place solution in the preliminary round☆50Aug 16, 2023Updated 2 years ago
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model☆281Jun 25, 2024Updated last year
- FreeDA: Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation (CVPR 2024)☆49Aug 28, 2024Updated last year
- Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance☆23Jan 20, 2025Updated last year
- [CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models☆18Jul 22, 2024Updated last year
- [EMNLP'23] The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''☆106Aug 21, 2025Updated 5 months ago
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs☆98Jan 16, 2025Updated last year
- Official Implementation (Pytorch) of "DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Represe…☆27Jun 24, 2024Updated last year
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing☆32Dec 9, 2025Updated 2 months ago
- Future version of the AnyBody Managed Model Repository with a full thoracic spine model.☆18Feb 2, 2026Updated 2 weeks ago
- ☆29May 6, 2020Updated 5 years ago
- [NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression☆67Feb 19, 2025Updated 11 months ago
- WhatsApp Message Sender is a simple Python application that allows users to send WhatsApp messages programmatically using the pywhatkit li…☆14Nov 17, 2023Updated 2 years ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents☆317Apr 16, 2024Updated last year
- [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration☆76Nov 20, 2025Updated 2 months ago
- A simple Web AI model deployment tool using JavaScript based on OpenCV.js and ONNXRuntime☆58Jul 8, 2024Updated last year