mbzuai-oryx / CVRR-Evaluation-Suite
Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs".
☆39 Updated 3 weeks ago
Related projects:
- Composed Video Retrieval ☆42 Updated 4 months ago
- [NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization ☆93 Updated 7 months ago
- Official implementation of the paper "STEREO: Towards Adversarially Robust Concept Erasing from Text-to-Image Generation Models" ☆15 Updated last week
- [CVPRW 2024] Official repository of paper titled "Learning to Prompt with Text Only Supervision for Vision-Language Models". ☆82 Updated 3 weeks ago
- Contains code and documentation for our VANE-Bench paper. ☆10 Updated 3 months ago
- [CVPR' 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆35 Updated last month
- Task Residual for Tuning Vision-Language Models (CVPR 2023) ☆65 Updated last year
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆85 Updated last week
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆23 Updated 6 months ago
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23] ☆91 Updated last year
- Official PyTorch code of "Grounded Question-Answering in Long Egocentric Videos", accepted by CVPR 2024. ☆49 Updated this week
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆75 Updated 5 months ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval ☆38 Updated 3 months ago
- Towards Evaluating the Robustness of Visual State Space Models ☆21 Updated this week
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆42 Updated 3 months ago
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024) ☆62 Updated 7 months ago
- [ECCV 2024] - Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation ☆39 Updated this week
- Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (accepted by ICML 2024) ☆41 Updated 3 months ago
- Official code repository of paper titled "Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Visio… ☆17 Updated last month
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval ☆42 Updated last year
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆31 Updated last month
- LLaVA-NeXT-Image-Llama3-Lora, Modified from https://github.com/arielnlee/LLaVA-1.6-ft ☆37 Updated 2 months ago
- [ECCVW 2024] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors". ☆12 Updated 3 weeks ago
- [InterSpeech 2024] Official code repository of paper titled "Bird Whisperer: Leveraging Large Pre-trained Acoustic Model for Bird Call Cl… ☆27 Updated last week
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant ☆47 Updated last week
- Official Implementation of "Read-only Prompt Optimization for Vision-Language Few-shot Learning", ICCV 2023 ☆48 Updated last year
- Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023] ☆104 Updated last year
- Distribution-Aware Prompt Tuning for Vision-Language Models (ICCV 2023) ☆36 Updated 9 months ago
- Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ☆44 Updated 3 weeks ago
- Code for paper "AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention" ☆13 Updated 2 months ago