MikeWangWZHL / VidILView external linksLinks
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆116Sep 15, 2022Updated 3 years ago
Alternatives and similar repositories for VidIL
Users that are interested in VidIL are comparing it to the libraries listed below
Sorting:
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" Neurips 23 Spotlight☆37May 23, 2023Updated 2 years ago
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"☆26Oct 20, 2022Updated 3 years ago
- ☆12Mar 12, 2023Updated 2 years ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆136May 5, 2023Updated 2 years ago
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)☆374Jul 29, 2023Updated 2 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 9 months ago
- Data Release for VALUE Benchmark☆30Feb 16, 2022Updated 3 years ago
- ☆11Nov 5, 2024Updated last year
- [ECCV-2022]Grounding Visual Representations with Texts for Domain Generalization☆31Apr 7, 2023Updated 2 years ago
- The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding☆64Mar 9, 2022Updated 3 years ago
- ☆181Aug 20, 2022Updated 3 years ago
- [NeurIPS 2022] Egocentric Video-Language Pretraining☆254May 9, 2024Updated last year
- MERLOT: Multimodal Neural Script Knowledge Models☆225Mar 15, 2022Updated 3 years ago
- [EMNLP'22] Weakly-Supervised Temporal Article Grounding☆14Nov 25, 2023Updated 2 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆209Dec 18, 2022Updated 3 years ago
- [NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models☆158Dec 9, 2024Updated last year
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering☆195Jan 14, 2024Updated 2 years ago
- A PyTorch implementation of VIOLET☆140Dec 17, 2023Updated 2 years ago
- A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning.☆41Jun 29, 2022Updated 3 years ago
- Official Code of ECCV 2022 paper MS-CLIP☆91Jul 27, 2022Updated 3 years ago
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆16Jun 20, 2023Updated 2 years ago
- ☆193Oct 22, 2022Updated 3 years ago
- Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"☆146Jun 1, 2022Updated 3 years ago
- [ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383☆421Oct 28, 2022Updated 3 years ago
- ☆47Apr 29, 2024Updated last year
- Official repo for CVPR 2022 (Oral) paper: Revisiting the "Video" in Video-Language Understanding. Contains code for the Atemporal Probe (…☆51May 29, 2024Updated last year
- A Unified Framework for Video-Language Understanding☆61Jun 17, 2023Updated 2 years ago
- METER: A Multimodal End-to-end TransformER Framework☆375Nov 16, 2022Updated 3 years ago
- [CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers☆192Sep 24, 2023Updated 2 years ago
- Language Models Can See: Plugging Visual Controls in Text Generation☆259Jun 1, 2022Updated 3 years ago
- Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]☆380May 19, 2022Updated 3 years ago
- [CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.☆119Oct 9, 2023Updated 2 years ago
- CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training☆34Nov 9, 2021Updated 4 years ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆39Mar 4, 2024Updated last year
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆42Mar 11, 2025Updated 11 months ago
- Starter Code for VALUE benchmark☆80Aug 23, 2022Updated 3 years ago
- A pytorch implemetation of data augmentation method for visual question answering☆21May 25, 2023Updated 2 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)☆17Jan 12, 2023Updated 3 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval☆27Aug 20, 2022Updated 3 years ago