zjr2000 / REVERIE
[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
☆15Updated 4 months ago
Related projects ⓘ
Alternatives and complementary repositories for REVERIE
- A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability☆35Updated 2 weeks ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆49Updated 5 months ago
- Composed Video Retrieval☆46Updated 6 months ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆42Updated 4 months ago
- ☆36Updated 7 months ago
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs"☆16Updated last month
- Code for paper "AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention"☆16Updated 4 months ago
- ☆55Updated 3 weeks ago
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos☆34Updated 6 months ago
- Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs".☆43Updated 3 months ago
- [NeurlPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆52Updated this week
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"☆78Updated 8 months ago
- Task Residual for Tuning Vision-Language Models (CVPR 2023)☆66Updated last year
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights☆30Updated last month
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆64Updated last month
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval☆44Updated last year
- ☆54Updated 4 months ago
- This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H…☆17Updated 6 months ago
- Official PyTorch implementation of the paper "Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner"☆15Updated last year
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval☆68Updated 7 months ago
- Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]☆39Updated 7 months ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding☆32Updated last week
- HallE-Control: Controlling Object Hallucination in LMMs☆28Updated 7 months ago
- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models☆64Updated last week
- Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024.☆52Updated 2 months ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM☆58Updated 3 weeks ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆106Updated 3 weeks ago
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer☆34Updated 2 months ago
- Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal …☆27Updated this week