Farzad-R / Finetune-LLAVA-NEXT
This repository contains code for fine-tuning the LLAVA-1.6-7b-mistral multimodal LLM.
☆40Updated last year
Alternatives and similar repositories for Finetune-LLAVA-NEXT
Users who are interested in Finetune-LLAVA-NEXT are comparing it to the repositories listed below:
- An implementation of fine-tuning the BLIP model for Visual Question Answering☆83Updated 2 years ago
- An open-source implementation for fine-tuning the Llama3.2-Vision series by Meta.☆175Updated 3 months ago
- An open-source implementation for fine-tuning SmolVLM.☆62Updated 5 months ago
- Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-lan…☆150Updated last year
- [EMNLP'23] ClimateGPT: a specialized LLM for conversations related to Climate Change and Sustainability topics in both English and Arabi…☆79Updated last year
- LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning☆195Updated last year
- AIN - The First Arabic Inclusive Large Multimodal Model. It is a versatile bilingual LMM excelling in visual and contextual understanding…☆51Updated 10 months ago
- Fine-tuning OpenAI's CLIP model on the Indian Fashion Dataset☆52Updated 2 years ago
- Vision-language model fine-tuning notebooks & use cases (Medgemma - paligemma - florence .....)☆61Updated 4 months ago
- Fine-tuning Qwen2.5-VL for vision-language tasks | Optimized for Vision understanding | LoRA & PEFT support.☆151Updated last year
- Bio-Medical EXpert LMM with English and Arabic Language Capabilities☆73Updated 3 months ago
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context☆173Updated last year
- [CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts☆336Updated last year
- LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation☆38Updated last year
- An open-source implementation for fine-tuning Phi3-Vision and Phi3.5-Vision by Microsoft.☆98Updated 4 months ago
- Odd-One-Out: Anomaly Detection by Comparing with Neighbors (CVPR25)☆54Updated 3 weeks ago
- Official implementation of "Delving into CLIP latent space for Video Anomaly Recognition", CVIU 2024☆93Updated 4 months ago
- (CVPR 2024) Point, Segment and Count: A Generalized Framework for Object Counting☆122Updated last year
- [BMVC 2025] Official Implementation of the paper "PerSense: Personalized Instance Segmentation in Dense Images"☆28Updated last month
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation☆92Updated last year
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆98Updated last year
- An easy way to apply LoRA to CLIP. Implementation of the paper "Low-Rank Few-Shot Adaptation of Vision-Language Models" (CLIP-LoRA) [CVPR…☆283Updated 8 months ago
- GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)☆72Updated 2 years ago
- Image Instance Segmentation - Zero Shot - OpenAI's CLIP + Meta's SAM☆74Updated 2 years ago
- NoLA Codebase☆28Updated 9 months ago