Stevetich / EventHallusionLinks
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
☆31Updated 2 months ago
Alternatives and similar repositories for EventHallusion
Users that are interested in EventHallusion are comparing it to the libraries listed below
Sorting:
- [NeurIPS 2024] Lumen: a Large multimodal model with versatile vision-centric capabilities☆24Updated 9 months ago
- ☆21Updated 5 months ago
- MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models☆40Updated 2 months ago
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency☆42Updated 3 weeks ago
- Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025☆17Updated 2 months ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention☆36Updated 11 months ago
- ☆24Updated 8 months ago
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models☆17Updated 11 months ago
- VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)☆32Updated 2 months ago
- The official repository for ACL2025 paper "PruneVid: Visual Token Pruning for Efficient Video Large Language Models".☆46Updated last month
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models☆66Updated 11 months ago
- Official repository for CoMM Dataset☆36Updated 5 months ago
- Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning☆49Updated last month
- ☆86Updated 3 months ago
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆18Updated 3 months ago
- This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehens…☆71Updated last month
- [ICLR 2025] TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning☆33Updated 2 months ago
- Official implement of MIA-DPO☆58Updated 5 months ago
- Code for DeCo: Decoupling token compression from semanchc abstraction in multimodal large language models☆38Updated 11 months ago
- [CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att…☆23Updated 4 months ago
- [CVPRW 2025] UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inpu…☆86Updated 2 months ago
- TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos☆51Updated last week
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)☆45Updated 11 months ago
- ☆122Updated 4 months ago
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs"☆19Updated 9 months ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning☆31Updated 2 months ago
- [LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18)☆29Updated last month
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?☆31Updated 7 months ago
- Offical repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal Prompting☆40Updated last week
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency☆111Updated last month