starmemda / CAMoE
☆98 · Updated 3 years ago
Alternatives and similar repositories for CAMoE
Users who are interested in CAMoE are comparing it to the repositories listed below:
- 💐Kaleido-BERT: Vision-Language Pre-training on Fashion Domain ☆264 · Updated 2 years ago
- Starter Code for VALUE benchmark ☆80 · Updated 2 years ago
- [arXiv22] Disentangled Representation Learning for Text-Video Retrieval ☆94 · Updated 2 years ago
- [SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval. Also, a text-video retrieval toolbox based on CLIP + fast p… ☆128 · Updated 2 years ago
- Official PyTorch implementation of the “Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation” (ICCV 2021) ☆57 · Updated 3 years ago
- Graph Contrastive Clustering (ICCV2021) ☆88 · Updated 2 years ago
- Code and benchmarks for the Semantic Video Retrieval Task ☆54 · Updated 2 years ago
- Multi-Scale Aligned Distillation for Low-Resolution Detection (CVPR2021) ☆128 · Updated 3 years ago
- Improving One-stage Visual Grounding by Recursive Sub-query Construction, ECCV 2020 ☆84 · Updated 3 years ago
- An optimized re-implementation for 2D-TAN: Learning 2D Temporal Localization Networks for Moment Localization with Natural Language (AAAI… ☆126 · Updated last year
- A PyTorch implementation of VIOLET ☆137 · Updated last year
- [ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos ☆118 · Updated last year
- Cross Modal Retrieval with Querybank Normalisation ☆55 · Updated last year
- IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning ☆79 · Updated 4 years ago
- [CVPR2023] All in One: Exploring Unified Video-Language Pre-training ☆280 · Updated last year
- [CVPR 2022] The code for our paper 《Object-aware Video-language Pre-training for Retrieval》 ☆62 · Updated 2 years ago
- Official codebase for "Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding" ☆21 · Updated 4 years ago
- https://layer6ai-labs.github.io/xpool/ ☆118 · Updated last year
- The HC-STVG Dataset ☆55 · Updated last year
- Learning phrase grounding from captioned images through InfoNCE bound on mutual information ☆72 · Updated 4 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions" ☆42 · Updated 2 years ago
- Learning Spatiotemporal Features via Video and Text Pair Discrimination ☆59 · Updated 4 years ago
- Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training" ☆232 · Updated 3 years ago
- Official Pytorch implementations of MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition (ICCV 2023) ☆54 · Updated last year
- An easy PyTorch implement of SlowFast-Network ☆98 · Updated 5 years ago
- Code for the paper "Zero-shot Natural Language Video Localization" (ICCV2021, Oral). ☆47 · Updated last year
- Official Tensorflow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P… ☆60 · Updated last year
- Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos ☆86 · Updated 4 years ago
- Video Corpus Moment Retrieval with Contrastive Learning (SIGIR 2021) ☆55 · Updated 3 years ago