Use CLIP to represent video for Retrieval Task
☆70Mar 1, 2021Updated 4 years ago
Alternatives and similar repositories for CLIP_Video_Representation
Users that are interested in CLIP_Video_Representation are comparing it to the libraries listed below
Sorting:
- MDMMT: Multidomain Multimodal Transformer for Video Retrieval☆26Jun 28, 2021Updated 4 years ago
- ☆259Dec 10, 2022Updated 3 years ago
- The top conferences on video retrieval libraries in recent years, synchronized with my blog.☆14Nov 27, 2021Updated 4 years ago
- [SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval.☆134May 4, 2022Updated 3 years ago
- A Simple Framwork for CV Pre-training Model (SOCO, VirTex, BEiT)☆15Oct 18, 2021Updated 4 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 9 months ago
- An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"☆1,025Apr 12, 2024Updated last year
- Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]☆380May 19, 2022Updated 3 years ago
- [CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…☆723Aug 8, 2023Updated 2 years ago
- Video-Text Representation Learning via Differentiable Weak Temporal Alignment (PyTorch implementation for the CVPR 2022 paper)☆11Oct 12, 2022Updated 3 years ago
- Video embeddings for retrieval with natural language queries☆342Feb 15, 2023Updated 3 years ago
- An official implementation for " UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"☆364Jul 25, 2024Updated last year
- Code and benchmarks for the Semantic Video Retrieval Task☆53Oct 18, 2022Updated 3 years ago
- Dataset for Bilingual VLN☆11Dec 5, 2020Updated 5 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval☆27Aug 20, 2022Updated 3 years ago
- PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models☆34Jan 14, 2026Updated last month
- Pytorch implementation for our NeurIPS 2019 paper "TAB-VCR: Tags and Attributes based VCR Baselines" https://arxiv.org/abs/1910.14671☆18May 6, 2021Updated 4 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)☆17Jan 12, 2023Updated 3 years ago
- Pytorch version of VidLanKD: Improving Language Understanding viaVideo-Distilled Knowledge Transfer (NeurIPS 2021))☆56Feb 6, 2023Updated 3 years ago
- Vision Longformer For Object Detection☆34May 17, 2021Updated 4 years ago
- [CVPR2019] Dual Encoding for Zero-Example Video Retrieval☆153Jan 10, 2023Updated 3 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆209Dec 18, 2022Updated 3 years ago
- [arXiv22] Disentangled Representation Learning for Text-Video Retrieval☆98Apr 7, 2022Updated 3 years ago
- COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning☆291Sep 6, 2022Updated 3 years ago
- CLIP-It! Language-Guided Video Summarization☆75Jun 21, 2021Updated 4 years ago
- Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".☆211Jun 12, 2020Updated 5 years ago
- Multi-Modal Transformer for Video Retrieval☆265Oct 9, 2024Updated last year
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & it's caption derrived f…☆16Apr 22, 2021Updated 4 years ago
- [CVPR2023] All in One: Exploring Unified Video-Language Pre-training☆281Mar 25, 2023Updated 2 years ago
- [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations☆565Aug 22, 2025Updated 6 months ago
- The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models☆12Oct 28, 2024Updated last year
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆136May 5, 2023Updated 2 years ago
- ☆47Apr 29, 2024Updated last year
- ☆82Jul 31, 2023Updated 2 years ago
- Code release for SLIP Self-supervision meets Language-Image Pre-training☆787Feb 9, 2023Updated 3 years ago
- A PyTorch implementation of VIOLET☆140Dec 17, 2023Updated 2 years ago
- PyTorch GPU distributed training code for MIL-NCE HowTo100M☆219Jul 5, 2022Updated 3 years ago
- W2VV++: A fully deep learning solution for ad-hoc video search☆29Jul 25, 2024Updated last year
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago