Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024)
★123 · Mar 26, 2025 · Updated last year
Alternatives and similar repositories for YoLLaVA
Users that are interested in YoLLaVA are comparing it to the libraries listed below.
- Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024) ★187 · Jul 5, 2024 · Updated last year
- Official Repository of Personalized Visual Instruct Tuning ★34 · Mar 6, 2025 · Updated last year
- [CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models" ★26 · Jun 8, 2025 · Updated 10 months ago
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors". ★12 · Oct 11, 2024 · Updated last year
- A curated list of Awesome Personalized Large Multimodal Models resources ★57 · Mar 26, 2026 · Updated last month
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their… ★22 · Jan 11, 2026 · Updated 3 months ago
- Streaming Video Diffusion: Online Video Editing with Diffusion Models ★18 · Jun 3, 2024 · Updated last year
- A collection of Vietnamese women who are currently working in the field of Computer Science. ★16 · Updated this week
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ★44 · Mar 11, 2025 · Updated last year
- [MICCAI 2025] Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology ★12 · Jun 17, 2025 · Updated 10 months ago
- Official implementation of MC-LLaVA. ★141 · Mar 17, 2026 · Updated last month
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ★16 · Nov 22, 2024 · Updated last year
- Code and Dataset for our ACL 2023 paper: "MPCHAT: Towards Multimodal Persona-Grounded Conversation" ★22 · Sep 5, 2023 · Updated 2 years ago
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning ★24 · Sep 9, 2024 · Updated last year
- Matryoshka Multimodal Models ★123 · Jan 22, 2025 · Updated last year
- MR. Video: MapReduce is the Principle for Long Video Understanding ★31 · Apr 23, 2025 · Updated last year
- Compress conventional Vision-Language Pre-training data ★52 · Sep 22, 2023 · Updated 2 years ago
- (ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator ★114 · Mar 21, 2025 · Updated last year
- ★11 · Jun 21, 2025 · Updated 10 months ago
- [ICLR 2024] Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond ★22 · Apr 29, 2024 · Updated 2 years ago
- [COLM'25] Official implementation of the Law of Vision Representation in MLLMs ★176 · Oct 6, 2025 · Updated 7 months ago
- [CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding" ★52 · Jun 16, 2025 · Updated 10 months ago
- Visual Instruction Inversion: Image Editing via Visual Prompting (NeurIPS 2023) ★97 · Dec 19, 2023 · Updated 2 years ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding ★50 · Jan 14, 2025 · Updated last year
- [CVPR 2023] Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection ★31 · Jun 21, 2023 · Updated 2 years ago
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, … ★20 · Mar 13, 2026 · Updated last month
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models ★40 · Nov 10, 2024 · Updated last year
- Edit One for All: Interactive Batch Image Editing (CVPR 2024) ★69 · Aug 8, 2024 · Updated last year
- [CVPR 2025 Highlight] Official Implementation of the paper STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing fro… ★30 · Apr 22, 2025 · Updated last year
- Official pytorch implementation of "Interpreting the Second-Order Effects of Neurons in CLIP" ★42 · Nov 15, 2024 · Updated last year
- Official implementation of CVPR 2024 paper "Prompt Learning via Meta-Regularization". ★32 · Mar 10, 2025 · Updated last year
- Coloring lips and drawing glasses on faces in custom images or live webcam ★11 · Sep 10, 2019 · Updated 6 years ago
- ★21 · Mar 18, 2026 · Updated last month
- [NeurIPS 2024] Dense Connector for MLLMs ★183 · Oct 14, 2024 · Updated last year
- Pioneering in Vietnamese Multimodal Large Language Model ★53 · Jan 23, 2025 · Updated last year
- Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples ★39 · Nov 27, 2024 · Updated last year
- [CVPR 2023] Official repository of paper titled "CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent … ★102 · Mar 25, 2024 · Updated 2 years ago
- Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn… ★19 · Dec 27, 2024 · Updated last year
- ★11 · Oct 2, 2024 · Updated last year