facebookresearch / HierVL

[CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings

☆46

Alternatives and similar repositories for HierVL

Users who are interested in HierVL compare it to the repositories listed below.

facebookresearch / EgoT2
Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)
☆33 Updated 2 years ago
facebookresearch / EgoVLPv2
Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023]
☆99 Updated last year
showlab / cosmo
☆72 Updated last year
amazon-science / QA-ViT
☆69 Updated last year
twelvelabs-io / video-embeddings-evaluation-framework
PyTorch implementation of Twelve Labs' Video Foundation Model evaluation framework & open embeddings
☆28 Updated 11 months ago
Hritikbansal / videocon
☆58 Updated last year
QUVA-Lab / PIN
Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
☆26 Updated 6 months ago
yukw777 / EILEV
EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties
☆128 Updated 8 months ago
md-mohaiminul / ViS4mer
☆55 Updated 3 years ago
facebookresearch / maws
Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496
☆91 Updated 3 months ago
jmerullo / limber
https://arxiv.org/abs/2209.15162
☆50 Updated 2 years ago
ilkerkesen / ViLMA
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
☆16 Updated last year
facebookresearch / ProcedureVRL
[CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"
☆54 Updated 2 years ago
google / video-localized-narratives
☆59 Updated last year
kkahatapitiya / LangRepo
Language Repository for Long Video Understanding
☆32 Updated last year
facebookresearch / CiT
Code for the paper "CiT: Curation in Training for Effective Vision-Language Data"
☆78 Updated 2 years ago
salesforce / paprika
Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"
☆50 Updated 6 months ago
facebookresearch / VLaMP
Code for “Pretrained Language Models as Visual Planners for Human Assistance”
☆61 Updated 2 years ago
Nicous20 / FunQA
FunQA benchmarks funny, creative, and magic videos for challenging tasks including timestamp localization, video description, reasoning, …
☆102 Updated 8 months ago
OliverRensu / D-iGPT
[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea…
☆98 Updated last year
md-mohaiminul / BIMBA
☆20 Updated 2 weeks ago
facebookresearch / genecis
Code and models for "GeneCIS: A Benchmark for General Conditional Image Similarity"
☆60 Updated 2 years ago
Understanding-Visual-Datasets / VisDiff
Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral)
☆120 Updated last year
yangbang18 / MultiCapCLIP
(ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
☆35 Updated last year
shashankvkt / DoRA_ICLR24
This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long …
☆90 Updated last year
cliangyu / Cola
[NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"
☆105 Updated last year
zhaoyue-zephyrus / AVION
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"
☆133 Updated last year
j-min / VPGen
Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)
☆56 Updated 2 years ago
ninatu / howtocaption
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale" (ECCV 2024)
☆55 Updated 10 months ago
yukw777 / VideoBLIP
Supercharged BLIP-2 that can handle videos
☆120 Updated last year