Hritikbansal/videocon

Hritikbansal / videocon

☆58

Alternatives and similar repositories for videocon

Users that are interested in videocon are comparing it to the libraries listed below.

amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
HanSolo9682 / CounterCurate
This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.
☆19 · Jun 27, 2024 · Updated 2 years ago
RAIVNLab / CREPE
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆35 · Apr 27, 2023 · Updated 3 years ago
linzhiqiu / visual_gpt_score
VisualGPTScore for visio-linguistic reasoning
☆27 · Oct 7, 2023 · Updated 2 years ago
renjie-liang / HUAL
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
☆15 · Dec 12, 2023 · Updated 2 years ago
aszala / VPEval
VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)
☆45 · Nov 29, 2023 · Updated 2 years ago
Heidelberg-NLP / VALSE
Data repository for the VALSE benchmark.
☆40 · Feb 15, 2024 · Updated 2 years ago
bpiyush / TestOfTime
Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time
☆46 · Jun 11, 2024 · Updated 2 years ago
minjoong507 / BM-DETR
[WACV 2025] Official PyTorch code for "Background-aware Moment Detection for Video Moment Retrieval"
☆16 · Feb 24, 2025 · Updated last year
HuiGuanLab / RaTSG
This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"
☆13 · Aug 22, 2025 · Updated 11 months ago
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25 · May 14, 2026 · Updated 2 months ago
minjoong507 / Consistency-of-Video-LLM
[CVPR 2025] Official Repository of the paper "On the Consistency of Video Large Language Models in Temporal Comprehension"
☆16 · Oct 13, 2025 · Updated 9 months ago
microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆62 · Jun 17, 2023 · Updated 3 years ago
adobe-research / llava-score
☆11 · Oct 2, 2024 · Updated last year
MikeWangWZHL / Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 23 Spotlight)
☆38 · May 23, 2023 · Updated 3 years ago
HJYao00 / Side4Video
☆42 · Apr 7, 2024 · Updated 2 years ago
princetonvisualai / merv
Unifying Specialized Visual Encoders for Video Language Models
☆25 · Nov 22, 2025 · Updated 8 months ago
bladewaltz1 / PromptSwitch
☆30 · Aug 14, 2023 · Updated 2 years ago
YYJMJC / Compositional-Temporal-Grounding
☆31 · Mar 24, 2022 · Updated 4 years ago
StelaBou / Diffusion-Act
☆25 · Sep 5, 2025 · Updated 10 months ago
frostinassiky / bsp
Placeholder for code of BSP.
☆11 · Aug 13, 2021 · Updated 4 years ago
fmthoker / SEVERE-BENCHMARK
☆26 · Aug 31, 2023 · Updated 2 years ago
alibaba-mmai-research / HiCo
CVPR 2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency
☆18 · Aug 10, 2022 · Updated 3 years ago
adxcreative / COPE
☆15 · Dec 20, 2024 · Updated last year
ytaek-oh / retriever
☆11 · Sep 15, 2023 · Updated 2 years ago
lucas-ventura / CoVR
Official PyTorch implementation of the paper "CoVR: Learning Composed Video Retrieval from Web Video Captions".
☆119 · Apr 21, 2026 · Updated 3 months ago
google / video-localized-narratives
☆60 · Aug 10, 2023 · Updated 2 years ago
Fodark / anonygan
Official PyTorch implementation of the paper "Graph-based Generative Face Anonymisation with Pose Preservation" in ICIAP 2021
☆14 · Dec 13, 2021 · Updated 4 years ago
WangFei-2019 / SNARE
Project for SNARE benchmark
☆11 · Jun 5, 2024 · Updated 2 years ago
zihuixue / seeAoT
Code and data release for the paper "Seeing the Arrow of Time in Large Multimodal Models"
☆16 · Oct 2, 2025 · Updated 9 months ago
RAIVNLab / sugar-crepe
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
☆93 · Feb 13, 2024 · Updated 2 years ago
zhaoyue-zephyrus / AVION
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"
☆138 · Aug 23, 2025 · Updated 11 months ago
ruc-aimc-lab / TeachCLIP
[CVPR 2024] TeachCLIP for Text-to-Video Retrieval
☆42 · May 7, 2025 · Updated last year
nishadsinghi / CleanCLIP
Official PyTorch implementation of "CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning" @ ICCV 2023
☆40 · Oct 16, 2025 · Updated 9 months ago
cliangyu / Cola
[NeurIPS 2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"
☆106 · Nov 9, 2023 · Updated 2 years ago
mertyg / vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …
☆294 · Jun 7, 2023 · Updated 3 years ago
adxcreative / D-M
The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…
☆10 · Feb 9, 2025 · Updated last year
cambridgeltl / visual-spatial-reasoning
[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.
☆149 · Mar 25, 2023 · Updated 3 years ago
Zi-hao-Wei / Efficient-Vision-Language-Pre-training-by-Cluster-Masking
[CVPR 2024] Improving language-visual pretraining efficiency by performing cluster-based masking on images.
☆33 · May 16, 2024 · Updated 2 years ago