Stevetich / EventHallusion
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
☆24 · Updated last month
Related projects
Alternatives and complementary repositories for EventHallusion
- [NeurIPS 2024] Lumen: a Large multimodal model with versatile vision-centric capabilities ☆22 · Updated last month
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models ☆15 · Updated 3 months ago
- Task Residual for Tuning Vision-Language Models (CVPR 2023) ☆65 · Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆48 · Updated 5 months ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆40 · Updated 4 months ago
- Official code for paper: Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language ☆21 · Updated 4 months ago
- This repository is the code of paper 'DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark'. ☆63 · Updated this week
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights ☆28 · Updated 3 weeks ago
- [ICCV 2023] Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting ☆15 · Updated 11 months ago
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆18 · Updated last month
- The official implementation of RAR ☆72 · Updated 7 months ago
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos ☆34 · Updated 6 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation ☆25 · Updated 2 months ago
- [ACM MM 2024] ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack ☆12 · Updated 3 months ago
- Official implementation of Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input ☆54 · Updated 2 months ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding ☆28 · Updated 3 weeks ago
- [NeurIPS 2024] The official code of paper "Automated Multi-level Preference for MLLMs" ☆17 · Updated last month
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models" ☆43 · Updated 2 months ago
- Official github repo for ICCV2023 paper 'Multi-event Video-Text Retrieval' ☆18 · Updated 8 months ago
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral] ☆39 · Updated 7 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt… ☆32 · Updated 4 months ago
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23] ☆94 · Updated last year