LLaVA-NeXT-Image-Llama3-Lora, Modified from https://github.com/arielnlee/LLaVA-1.6-ft
☆47 Jul 17, 2024 Updated last year
Alternatives and similar repositories for LLaVA-NeXT-Image-Llama3-Lora
Users who are interested in LLaVA-NeXT-Image-Llama3-Lora are comparing it to the repositories listed below.
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆43 Apr 2, 2024 Updated 2 years ago
- ☆158 Oct 31, 2024 Updated last year
- Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection ☆18 Mar 19, 2024 Updated 2 years ago
- ☆13 Dec 17, 2022 Updated 3 years ago
- Official repository for SuperCATs: Cost Aggregation with Transformers for Sparse Correspondence (ICCE-Asia'22) ☆18 Dec 31, 2022 Updated 3 years ago
- Universal-Noise Annotation ☆25 Dec 23, 2023 Updated 2 years ago
- ☆27 Dec 26, 2023 Updated 2 years ago
- MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models ☆32 Jan 22, 2025 Updated last year
- The Official Code Repo for EgoOrientBench [CVPR25] ☆17 Nov 24, 2025 Updated 7 months ago
- Official implementation of "ControlFace: Harnessing Facial Parametric Control for Face Rigging". ☆44 Mar 5, 2025 Updated last year
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ☆124 Jul 1, 2024 Updated last year
- [NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation ☆14 Oct 7, 2023 Updated 2 years ago
- A fork to add multimodal model training to open-r1 ☆1,576 Feb 8, 2025 Updated last year
- [ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer ☆37 Oct 18, 2023 Updated 2 years ago
- ☆46 Jul 3, 2024 Updated last year
- Training code for CLIP-FlanT5 ☆31 Jul 29, 2024 Updated last year
- Matryoshka Multimodal Models ☆123 Jan 22, 2025 Updated last year
- Pseudo-code Instructions dataset ☆27 Dec 18, 2023 Updated 2 years ago
- ☆27 Jan 25, 2024 Updated 2 years ago
- This is the official code of "Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation, NeurIPS 23" ☆27 Dec 7, 2023 Updated 2 years ago
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning ☆48 Jul 17, 2025 Updated 11 months ago
- Threestudio extension of the paper "Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation". ☆47 Mar 12, 2024 Updated 2 years ago
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities ☆44 Jun 7, 2025 Updated last year
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization ☆104 Jan 30, 2024 Updated 2 years ago
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆105 Nov 30, 2025 Updated 7 months ago
- The download methods of Vision-language Continual Pretraining Dataset P9D. ☆12 Jan 3, 2025 Updated last year
- Dataset for GAN-Generated Images Detection ☆10 Apr 25, 2024 Updated 2 years ago
- [CVPR'24] MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding ☆18 Dec 13, 2024 Updated last year
- Official implementation of "AM-Adapter: Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis in-the-Wild" (ICCV 2025) ☆26 Jul 8, 2025 Updated 11 months ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions ☆26 May 30, 2024 Updated 2 years ago
- ☆4,695 Jun 15, 2026 Updated 2 weeks ago
- CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs (CVPR2024) ☆17 Jun 14, 2024 Updated 2 years ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆87 Oct 26, 2025 Updated 8 months ago
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3) ☆843 Aug 5, 2025 Updated 10 months ago
- Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25) ☆23 Nov 25, 2025 Updated 7 months ago
- CVPR 2025 (Highlight): Official implementation of "Cross-View Completion Models are Zero-shot Correspondence Estimators" ☆69 Jun 23, 2025 Updated last year
- ☆60 Updated this week
- Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs" ☆48 Sep 3, 2025 Updated 9 months ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs ☆26 Jan 14, 2025 Updated last year