pritamqu / HALVA
[ICLR 2025] Data-Augmented Phrase-Level Alignment for Mitigating Object Hallucination
☆19 · Jan 27, 2025 · Updated last year
Alternatives and similar repositories for HALVA
Users who are interested in HALVA are comparing it to the repositories listed below:
- Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images ☆18 · Jun 4, 2025 · Updated 8 months ago
- [ICLR 2025] This repo is the official implementation of our paper "Learning Fine-Grained Representations through Textual Token Disentangl… ☆22 · Jul 28, 2025 · Updated 6 months ago
- (ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing' ☆58 · Jan 26, 2026 · Updated 2 weeks ago
- Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25) ☆23 · Nov 25, 2025 · Updated 2 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization ☆100 · Jan 30, 2024 · Updated 2 years ago
- Preference Learning for LLaVA ☆59 · Nov 9, 2024 · Updated last year
- Official Implementation (PyTorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti… ☆23 · Jan 26, 2025 · Updated last year
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" ☆57 · Jan 23, 2026 · Updated 3 weeks ago
- The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding. (CVPR 2025 Highlight)" ☆42 · Apr 27, 2025 · Updated 9 months ago
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- CVPR 2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning ☆38 · Mar 21, 2025 · Updated 10 months ago
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models ☆75 · May 31, 2025 · Updated 8 months ago
- ☆13 · Aug 28, 2024 · Updated last year
- The code of CVPR 2024 "S^2MVTC: a Simple yet Efficient Scalable Multi-View Tensor Clustering" ☆11 · Apr 3, 2024 · Updated last year
- ☆11 · Mar 23, 2024 · Updated last year
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P… ☆64 · Jan 27, 2026 · Updated 2 weeks ago
- Agentic Keyframe Search for Video Question Answering ☆15 · Apr 7, 2025 · Updated 10 months ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 · Dec 15, 2024 · Updated last year
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight) ☆20 · Aug 1, 2025 · Updated 6 months ago
- PyTorch implementation for "Robust Multi-View Clustering with Noisy Correspondence" (TKDE 2024) ☆11 · Aug 2, 2024 · Updated last year
- quagga ☆10 · Apr 7, 2020 · Updated 5 years ago
- Self-supervised adversarial masking for point clouds ☆11 · Jul 12, 2023 · Updated 2 years ago
- 🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".