unitaryai / VTC

VTC: Improving Video-Text Retrieval with User Comments

☆11

Alternatives and similar repositories for VTC

Users who are interested in VTC are comparing it to the repositories listed below.

facebookresearch / genecis
Code and models for "GeneCIS: A Benchmark for General Conditional Image Similarity"
☆58 Updated last year
Computer-Vision-in-the-Wild / Elevater_Toolkit_IC
Toolkit for Elevater Benchmark
☆72 Updated last year
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆24 Updated 6 months ago
redcaps-dataset / redcaps-downloader
Command-line tool for downloading and extending the RedCaps dataset.
☆47 Updated last year
naver-ai / augsub
[CVPR 2025] Official PyTorch implementation of MaskSub "Masking meets Supervision: A Strong Learning Alliance"
☆38 Updated 2 months ago
NVlabs / PALAVRA
☆50 Updated 2 years ago
codezakh / LilT
[ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
☆39 Updated last year
UCSC-VLAA / CLIPS
An Enhanced CLIP Framework for Learning with Synthetic Captions
☆34 Updated last month
jeykigung / HiCLIP
☆29 Updated 2 years ago
showlab / datacentric.vlp
Compress conventional Vision-Language Pre-training data
☆51 Updated last year
naver-ai / prolip
☆50 Updated 2 months ago
microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆57 Updated last year
naver-ai / seit
[ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT
☆55 Updated 9 months ago
Hritikbansal / videocon
☆57 Updated last year
navervision / lincir
Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)
☆134 Updated 10 months ago
miccunifi / CIRCO
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset
☆66 Updated 9 months ago
kakaobrain / noc
☆46 Updated last year
naver-ai / eccv-caption
Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)
☆56 Updated last year
facebookresearch / diht
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
☆137 Updated 2 years ago
joslefaure / HERMES
[ECCVW'24] Long-form Video Understanding by Bridging Episodic Memory and Semantic Knowledge
☆27 Updated 8 months ago
RAIVNLab / CREPE
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆32 Updated 2 years ago
hammoudhasan / SynthCLIP
Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.
☆100 Updated 2 months ago
Cuberick-Orion / CIRR
Official repository of ICCV 2021 - Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models
☆112 Updated 2 weeks ago
postBG / CosMo.pytorch
Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback presented in CVPR 2021.
☆66 Updated 2 years ago
mcahny / rovit
RO-ViT CVPR 2023 "Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers"
☆18 Updated last year
aimagelab / pacscore
[CVPR 2023] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
☆61 Updated 3 months ago
mshukor / ViCHA
[BMVC22] Official Implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment"
☆55 Updated 2 years ago
mlfoundations / imagenet-captions
Release of ImageNet-Captions
☆48 Updated 2 years ago
SivanDoveh / TSVLC
Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models
☆46 Updated last year
fmthoker / SEVERE-BENCHMARK
☆26 Updated last year