dres-dev / DRES
Distributed Retrieval Evaluation Server
☆14Updated 3 months ago
Alternatives and similar repositories for DRES:
Users who are interested in DRES are comparing it to the repositories listed below
- Archive of Tasks and Results of the Video Browser Showdown☆11Updated 3 weeks ago
- Open-source release of the SOMHunter video retrieval tool☆21Updated last year
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆35Updated 6 months ago
- Use CLIP to represent video for Retrieval Task☆69Updated 3 years ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆132Updated last year
- [CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features☆78Updated 3 months ago
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆24Updated last year
- Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answer…☆53Updated 3 months ago
- A paper list of image captioning.☆22Updated 2 years ago
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity"☆56Updated last year
- A PyTorch implementation of VIOLET☆137Updated last year
- A Unified Framework for Video-Language Understanding☆56Updated last year
- An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"☆149Updated 10 months ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)☆28Updated last year
- Code for the Video Similarity Challenge.☆77Updated last year
- Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).☆138Updated 2 years ago
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆52Updated 2 months ago
- [ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features☆172Updated last year
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]☆54Updated last year
- [arXiv22] Disentangled Representation Learning for Text-Video Retrieval☆94Updated 2 years ago
- The 3rd Place Solution of the Meta AI Video Similarity Challenge : Descriptor Track and Matching Track.☆20Updated last year
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆134Updated last year
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆187Updated 2 years ago
- An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral)☆85Updated 2 years ago
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆190Updated last year
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆188Updated last year
- An implementation of "CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model".☆130Updated last month