dres-dev / DRES
Distributed Retrieval Evaluation Server
☆16 Updated last year
Alternatives and similar repositories for DRES
Users who are interested in DRES compare it to the repositories listed below.
- Open-source release of the SOMHunter video retrieval tool ☆24 Updated 2 years ago
- Archive of Tasks and Results of the Video Browser Showdown ☆13 Updated 9 months ago
- [ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features ☆189 Updated 2 years ago
- An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval" ☆180 Updated last year
- [NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale ☆201 Updated 2 years ago
- [CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features ☆83 Updated last year
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners". ☆301 Updated last year
- mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023) ☆228 Updated 2 years ago
- [NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset ☆296 Updated last year
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion ☆194 Updated 5 months ago
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆28 Updated 2 years ago
- PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models ☆261 Updated 5 months ago
- Towards Video Text Visual Question Answering: Benchmark and Baseline ☆40 Updated last year
- [TMM 2023] VideoXum: Cross-modal Visual and Textural Summarization of Videos ☆53 Updated last year
- Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER" ☆146 Updated this week
- Official repository of "Chatting Makes Perfect: Chat-based Image Retrieval" ☆30 Updated 11 months ago
- [NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering ☆194 Updated last year
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding ☆121 Updated last year
- GRiT: A Generative Region-to-text Transformer for Object Understanding (ECCV2024) ☆338 Updated 2 years ago
- EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties ☆131 Updated last year
- Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding ☆291 Updated 5 months ago
- Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding" ☆269 Updated last year
- Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs" ☆91 Updated 9 months ago
- [NeurIPS 2021] Moment-DETR code and QVHighlights dataset ☆338 Updated last year
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023) ☆66 Updated last year