guilk / VLCLinks

Research code for "Training Vision-Language Transformers from Captions Alone"

☆34

Alternatives and similar repositories for VLC

Users that are interested in VLC are comparing it to the libraries listed below

Sorting:

microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆60Updated 2 years ago
microsoft / UniTAB
UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation)
☆89Updated 2 years ago
zerovl / ZeroVL
[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources
☆45Updated 3 years ago
princetonvisualai / pointingqa
Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"
☆19Updated 3 years ago
zhjohnchan / SK-VG
[CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.
☆32Updated 2 years ago
airsplay / vimpac
☆73Updated 3 years ago
easonnie / mlp-vil
MLPs for Vision and Langauge Modeling (Coming Soon)
☆27Updated 3 years ago
Deferf / CLIP_Video_Representation
Use CLIP to represent video for Retrieval Task
☆70Updated 4 years ago
yonatanbitton / data_efficient_masked_language_modeling_for_vision_and_language
Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".
☆18Updated 4 years ago
allenai / gpv2
☆32Updated 3 years ago
allenai / grit_official
Official repository for the General Robust Image Task (GRIT) Benchmark
☆54Updated 2 years ago
MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆56Updated 2 years ago
intersun / LightningDOT
source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT
☆72Updated 3 years ago
allenai / gpv-1
A task-agnostic vision-language architecture as a step towards General Purpose Vision
☆92Updated 4 years ago
facebookresearch / diht
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
☆138Updated 2 years ago
zmykevin / UVLP
CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
☆22Updated 3 years ago
TXH-mercury / COSA
[ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
☆43Updated 10 months ago
zinengtang / DeCEMBERT
Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17Updated 2 years ago
showlab / Region_Learner
The Pytorch implementation for "Video-Text Pre-training with Learned Regions"
☆42Updated 3 years ago
LisaAnne / TemporalLanguageRelease
☆43Updated 4 years ago
facebookresearch / CiT
Code for the paper titled "CiT Curation in Training for Effective Vision-Language Data".
☆78Updated 2 years ago
redcaps-dataset / redcaps-downloader
Command-line tool for downloading and extending the RedCaps dataset.
☆50Updated last year
ChenDelong1999 / polite-flamingo
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
☆64Updated last year
medhini / clip_it
CLIP-It! Language-Guided Video Summarization
☆75Updated 4 years ago
mshukor / ViCHA
[BMVC22] Official Implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment"
☆55Updated 3 years ago
StanLei52 / TQVSR
[Findings of EMNLP 2022] AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant
☆23Updated 2 years ago
facebookresearch / video-distant-supervision
This is an official pytorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. In this repository, w…
☆43Updated 2 years ago
lupantech / IconQA
Data and code for NeurIPS 2021 Paper "IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning".
☆53Updated last year
TencentARC / GVT
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
☆58Updated 2 years ago
MikeWangWZHL / VidIL
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆115Updated 3 years ago