taco-group / Re-Align
A novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models.
☆39 Updated last month
Alternatives and similar repositories for Re-Align
Users who are interested in Re-Align are comparing it to the repositories listed below:
- AutoTrust, a groundbreaking benchmark designed to assess the trustworthiness of DriveVLMs. This work aims to enhance public safety by ens… ☆44 Updated 3 months ago
- Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing ☆33 Updated 3 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆65 Updated 10 months ago
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? ☆27 Updated 5 months ago
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs' ☆122 Updated last week
- [NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking" ☆33 Updated 4 months ago
- ☆71 Updated 3 months ago
- Code for "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More" ☆31 Updated last week
- ☆50 Updated 5 months ago
- ☆38 Updated 3 months ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆81 Updated 6 months ago
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster. ☆65 Updated 3 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆70 Updated 10 months ago
- ☆59 Updated 2 weeks ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio… ☆28 Updated 2 months ago
- Official code for ICLR 2024 paper "Do Generated Data Always Help Contrastive Learning?" ☆30 Updated last year
- ☆44 Updated this week
- ☆16 Updated 4 months ago
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key ☆45 Updated this week
- VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues ☆40 Updated 3 weeks ago
- This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024). ☆37 Updated 5 months ago
- LEO: A powerful Hybrid Multimodal LLM ☆17 Updated 2 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆65 Updated last month
- AutoHallusion Codebase (EMNLP 2024) ☆19 Updated 4 months ago
- CLIP-MoE: Mixture of Experts for CLIP ☆29 Updated 6 months ago
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction ☆85 Updated last month
- ☆11 Updated 5 months ago
- [LLaVA-Video-R1] ✨ First Adaptation of R1 to LLaVA-Video (2025-03-18) ☆27 Updated 3 weeks ago
- ☆59 Updated this week
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆82 Updated last year