ShyFoo / Nemesis
Official implementation of Nemesis: Normalizing the Soft-prompt Vectors of Vision-Language Models (ICLR 2024 Spotlight)
☆13Updated 5 months ago
Alternatives and similar repositories for Nemesis
Users who are interested in Nemesis are comparing it to the repositories listed below
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".☆16Updated 3 weeks ago
- The official repo of our work "Pensieve: Retrospect-then-Compare mitigates Visual Hallucination"☆16Updated last year
- [NeurIPS-2024] The official Implementation of "Instruction-Guided Visual Masking"☆35Updated 6 months ago
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models☆29Updated 6 months ago
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models☆85Updated 9 months ago
- Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)☆58Updated last year
- ☆21Updated 7 months ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding☆58Updated 2 months ago
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models☆19Updated 4 months ago
- ☆12Updated 4 months ago
- ☆43Updated 5 months ago
- Official implementation of MC-LLaVA.☆28Updated this week
- ☆18Updated last month
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key☆59Updated this week
- ☆36Updated last month
- The official repo for "Where do Large Vision-Language Models Look at when Answering Questions?"☆37Updated 2 weeks ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension.☆68Updated last year
- ☆31Updated this week
- Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs"☆49Updated 2 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization☆88Updated last year
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆36Updated 2 weeks ago
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio☆46Updated 3 weeks ago
- Official Repository of Personalized Visual Instruct Tuning☆28Updated 3 months ago
- [CVPR2025] BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding☆20Updated 2 months ago
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆84Updated last year
- ☆36Updated 2 weeks ago
- ☆37Updated 10 months ago
- Fast-Slow Thinking for Large Vision-Language Model Reasoning☆14Updated last month
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆50Updated 7 months ago
- Official code for paper "GRIT: Teaching MLLMs to Think with Images"☆64Updated this week