HKUST-LongGroup / Awesome-MLLM-Benchmarks
☆75 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for Awesome-MLLM-Benchmarks
- ☆24 Updated 4 months ago
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024) ☆41 Updated 4 months ago
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models ☆75 Updated 2 months ago
- Official implementation of "Why are Visually-Grounded Language Models Bad at Image Classification?" (NeurIPS 2024) ☆51 Updated last month
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos ☆34 Updated 6 months ago
- This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World"… ☆44 Updated 8 months ago
- [AAAI2023] Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task (Oral) ☆38 Updated 7 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆48 Updated 5 months ago
- Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024. ☆51 Updated 2 months ago
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ☆96 Updated last week
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding" ☆68 Updated 6 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆43 Updated 5 months ago
- [ICCV2023] CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation ☆29 Updated last month
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆73 Updated 7 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat… ☆72 Updated 7 months ago
- Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization" ☆47 Updated 2 months ago
- Official repository for CoMM Dataset ☆24 Updated 2 months ago
- NegCLIP. ☆26 Updated last year
- A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability ☆33 Updated 2 weeks ago
- LLaVA-NeXT-Image-Llama3-Lora, Modified from https://github.com/arielnlee/LLaVA-1.6-ft ☆39 Updated 4 months ago
- The official implementation of "MLLMs-Augmented Visual-Language Representation Learning" ☆31 Updated 8 months ago
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU ☆41 Updated last year
- Task Residual for Tuning Vision-Language Models (CVPR 2023) ☆66 Updated last year
- ☆17 Updated last year
- VisualGPTScore for visio-linguistic reasoning ☆26 Updated last year
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆64 Updated last month
- [NeurIPS 2024] The official code of the paper "Automated Multi-level Preference for MLLMs" ☆16 Updated last month
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23] ☆94 Updated last year
- [ICCV 2023] Prompt-aligned Gradient for Prompt Tuning ☆151 Updated last year
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt… ☆32 Updated 4 months ago