tsujuifu/pytorch_violet

tsujuifu / pytorch_violet

A PyTorch implementation of VIOLET

☆138

Alternatives and similar repositories for pytorch_violet

Users who are interested in pytorch_violet are comparing it to the repositories listed below.

salesforce / ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
☆188 · May 1, 2025 · Updated last year
microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆62 · Jun 17, 2023 · Updated 3 years ago
antoyang / just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
☆127 · Sep 29, 2023 · Updated 2 years ago
rowanz / merlot
MERLOT: Multimodal Neural Script Knowledge Models
☆226 · Mar 15, 2022 · Updated 4 years ago
showlab / DemoVLP
[Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training
☆22 · Mar 19, 2022 · Updated 4 years ago
showlab / all-in-one
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
☆281 · Mar 25, 2023 · Updated 3 years ago
showlab / Region_Learner
The Pytorch implementation for "Video-Text Pre-training with Learned Regions"
☆43 · Jul 15, 2022 · Updated 4 years ago
tsujuifu / pytorch_empirical-mvm
A PyTorch implementation of EmpiricalMVM
☆41 · Dec 18, 2023 · Updated 2 years ago
VALUE-Leaderboard / StarterCode
Starter Code for VALUE benchmark
☆79 · Aug 23, 2022 · Updated 3 years ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆136 · May 5, 2023 · Updated 3 years ago
linjieli222 / HERO
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
☆235 · Sep 16, 2021 · Updated 4 years ago
jayleicn / ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…
☆730 · Aug 8, 2023 · Updated 2 years ago
m-bain / frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
☆376 · May 19, 2022 · Updated 4 years ago
microsoft / UniVL
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
☆366 · Jul 25, 2024 · Updated 2 years ago
InterDigitalInc / DialogSummary-VideoQA
☆10 · Mar 30, 2022 · Updated 4 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆30 · Feb 16, 2022 · Updated 4 years ago
FingerRec / OA-Transformer
[CVPR 2022] The code for our paper 《Object-aware Video-language Pre-training for Retrieval》
☆61 · May 25, 2022 · Updated 4 years ago
jayleicn / TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
☆132 · Oct 25, 2022 · Updated 3 years ago
noagarcia / knowit-rock
ROCK model for Knowledge-Based VQA in Videos
☆31 · Oct 19, 2020 · Updated 5 years ago
antoine77340 / MIL-NCE_HowTo100M
PyTorch GPU distributed training code for MIL-NCE HowTo100M
☆221 · Jul 5, 2022 · Updated 4 years ago
jayleicn / mTVRetrieval
[ACL 2021] mTVR: Multilingual Video Moment Retrieval
☆27 · Aug 20, 2022 · Updated 3 years ago
LandyGuo / Download_HowTo100M
Code for downloading videos from HowTo100M dataset
☆18 · May 13, 2021 · Updated 5 years ago
thaolmk54 / hcrn-videoqa
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
☆135 · Jul 25, 2024 · Updated 2 years ago
ych133 / How2R-and-How2QA
A video retrieval dataset How2R and a video QA dataset How2QA
☆24 · Oct 15, 2020 · Updated 5 years ago
TencentARC / MCQ
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
☆141 · Jul 20, 2022 · Updated 4 years ago
zinengtang / Perceiver_VL
PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023)
☆34 · Feb 5, 2023 · Updated 3 years ago
ylsung / VL_adapter
PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)
☆212 · Dec 18, 2022 · Updated 3 years ago
Trunpm / TPT-for-VideoQA
☆19 · Nov 25, 2022 · Updated 3 years ago
jayleicn / moment_detr
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
☆349 · Mar 9, 2026 · Updated 4 months ago
antoine77340 / howto100m
Code for the HowTo100M paper
☆304 · Mar 10, 2020 · Updated 6 years ago
MikeWangWZHL / VidIL
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆117 · Sep 15, 2022 · Updated 3 years ago
ArrowLuo / CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
☆1,030 · Apr 12, 2024 · Updated 2 years ago
ioanacroi / qb-norm
Cross Modal Retrieval with Querybank Normalisation
☆57 · Nov 21, 2023 · Updated 2 years ago
rowanz / merlot_reserve
Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"
☆146 · Jun 1, 2022 · Updated 4 years ago
NJUPT-MCC / DualVGR-VideoQA
Implementation for the journal paper "DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering" (Jianyu et al., IEEE Tran…
☆18 · Jun 22, 2021 · Updated 5 years ago
airsplay / vimpac
☆73 · Jun 3, 2022 · Updated 4 years ago
hyounghk / VideoQADenseCapFrameGate-ACL2020
Code for ACL 2020 paper "Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA." Hyounghun Kim, Zineng T…
☆34 · May 14, 2020 · Updated 6 years ago
albanie / collaborative-experts
Video embeddings for retrieval with natural language queries
☆344 · Feb 15, 2023 · Updated 3 years ago
tsujuifu / pytorch_tvc
A PyTorch implementation of TVC
☆24 · Dec 18, 2023 · Updated 2 years ago