[ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer
☆37Oct 18, 2023Updated 2 years ago
Alternatives and similar repositories for Tem-adapter
Users that are interested in Tem-adapter are comparing it to the libraries listed below
Sorting:
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆16Jun 20, 2023Updated 2 years ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆25Nov 23, 2024Updated last year
- LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections (NeurIPS 2023)☆29Dec 27, 2023Updated 2 years ago
- Learning Situation Hyper-Graphs for Video Question Answering☆22Feb 16, 2024Updated 2 years ago
- A curated list of zero-shot captioning papers☆24Aug 26, 2023Updated 2 years ago
- Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions".☆118Oct 9, 2025Updated 4 months ago
- ☆12Dec 15, 2023Updated 2 years ago
- [ICLR 2023] CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding☆46Jun 9, 2025Updated 8 months ago
- Counterfactual Reasoning VQA Dataset☆28Nov 23, 2023Updated 2 years ago
- [ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"☆53Sep 21, 2023Updated 2 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆116Sep 15, 2022Updated 3 years ago
- ☆15Nov 30, 2023Updated 2 years ago
- Retrieval-augmented Image Captioning☆13Feb 16, 2023Updated 3 years ago
- [NeurIPS 2023] The official implementation of paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" acce…☆27May 14, 2024Updated last year
- Code for Static and Dynamic Concepts for Self-supervised Video Representation Learning.☆11Jul 28, 2022Updated 3 years ago
- [CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events☆65Feb 9, 2026Updated 3 weeks ago
- Codes for our ACM MM 2019 paper: "Exploiting Temporal Relationships in Video Moment Localization with Natural Language"☆16Oct 22, 2022Updated 3 years ago
- [ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts☆14Jan 13, 2025Updated last year
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models☆32Nov 27, 2025Updated 3 months ago
- Official PyTorch code of GroundVQA (CVPR'24)☆64Sep 13, 2024Updated last year
- ☆101Oct 19, 2022Updated 3 years ago
- Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".☆71Feb 28, 2024Updated 2 years ago
- ☆37Sep 16, 2024Updated last year
- [CVPR2023] All in One: Exploring Unified Video-Language Pre-training☆281Mar 25, 2023Updated 2 years ago
- [ICCV2023 Oral] Implicit Temporal Modeling with Learnable Alignment for Video Recognition☆41Nov 29, 2023Updated 2 years ago
- [IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering☆20Jul 6, 2023Updated 2 years ago
- ☆17Jun 14, 2025Updated 8 months ago
- An official implementation for MS-DETR in ACL'23☆17Jun 3, 2023Updated 2 years ago
- [CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval☆38Feb 28, 2023Updated 3 years ago
- [ECCV 2022] Official Pytorch Implementation of the paper : " Zero-Shot Temporal Action Detection via Vision-Language Prompting "☆112Aug 3, 2023Updated 2 years ago
- Evaluate robustness of adaptation methods on large vision-language models☆19Aug 23, 2023Updated 2 years ago
- LAVIS - A One-stop Library for Language-Vision Intelligence☆48Aug 5, 2024Updated last year
- Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"☆75Sep 20, 2023Updated 2 years ago
- Implementation for the journal paper "DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering" (Jianyu et al., IEEE Tran…☆18Jun 22, 2021Updated 4 years ago
- Understanding Self-Supervised Learning in a non-IID Setting☆21Oct 21, 2022Updated 3 years ago
- LLaVA-NeXT-Image-Llama3-Lora, Modified from https://github.com/arielnlee/LLaVA-1.6-ft☆46Jul 17, 2024Updated last year
- ☆20Feb 27, 2023Updated 3 years ago
- Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text☆24Aug 15, 2022Updated 3 years ago
- ☆19Dec 6, 2023Updated 2 years ago