Research code for "Training Vision-Language Transformers from Captions Alone"
☆33 · Jul 15, 2022 · Updated 3 years ago
Alternatives and similar repositories for VLC
Users interested in VLC compare it to the repositories listed below.
- Tools for the Parse-27k Dataset - evaluation routines and some simple scripts to get started... ☆10 · Jul 16, 2016 · Updated 9 years ago
- ☆14 · May 3, 2022 · Updated 3 years ago
- This is an official implementation of GRIT-VLP ☆20 · Aug 8, 2022 · Updated 3 years ago
- Stochastic Optimization for Global Contrastive Learning without Large Mini-batches ☆20 · Mar 31, 2023 · Updated 2 years ago
- ☆45 · Oct 11, 2021 · Updated 4 years ago
- Code for the Ask4Help project ☆22 · Nov 24, 2022 · Updated 3 years ago
- ☆20 · Mar 14, 2021 · Updated 4 years ago
- Official Code of ECCV 2022 paper MS-CLIP ☆91 · Jul 27, 2022 · Updated 3 years ago
- [ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages" ☆49 · Dec 7, 2022 · Updated 3 years ago
- ☆25 · May 11, 2022 · Updated 3 years ago
- X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022) ☆492 · Nov 25, 2022 · Updated 3 years ago
- A Unified Framework for Video-Language Understanding ☆61 · Jun 17, 2023 · Updated 2 years ago
- Visual Question Reasoning on General Dependency Tree ☆30 · Updated this week
- This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described … ☆71 · Dec 20, 2021 · Updated 4 years ago
- An interactive demo based on Segment-Anything for stroke-based painting which enables human-like painting. ☆35 · Apr 16, 2023 · Updated 2 years ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022) ☆198 · May 9, 2023 · Updated 2 years ago
- ☆131 · Dec 10, 2022 · Updated 3 years ago
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm ☆675 · Sep 19, 2022 · Updated 3 years ago
- [CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》 ☆151 · Jun 7, 2023 · Updated 2 years ago
- Extract information from XBRL files in the ESEF format ☆13 · Jan 3, 2026 · Updated last month
- Implementation of paper "Improving Image Captioning with Better Use of Caption" ☆33 · Sep 15, 2020 · Updated 5 years ago
- PyTorch implementation of I3D model for video classification, mixed with the CRF smoothing layer for multi-label classification. ☆31 · Jul 29, 2018 · Updated 7 years ago
- UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation) ☆89 · Jun 12, 2023 · Updated 2 years ago
- ☆39 · May 12, 2020 · Updated 5 years ago
- Localized Narratives ☆86 · Sep 9, 2021 · Updated 4 years ago
- Phonetic Analysis ToolKIT - PATKIT - Python package for analysing phonetic data ☆11 · Updated this week
- ☆11 · Dec 8, 2022 · Updated 3 years ago
- ☆39 · May 25, 2021 · Updated 4 years ago
- Walks through building different HTML5 layouts for AV systems ☆12 · Oct 15, 2021 · Updated 4 years ago
- ☆10 · Dec 16, 2023 · Updated 2 years ago
- Matplotlib Image labeller for classifying images ☆11 · Jan 5, 2026 · Updated last month
- Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision" ☆1,527 · Apr 3, 2024 · Updated last year
- Code release for SLIP Self-supervision meets Language-Image Pre-training ☆787 · Feb 9, 2023 · Updated 3 years ago
- Official Code Release for Container : Context Aggregation Network ☆46 · Oct 17, 2021 · Updated 4 years ago
- Object-aware Contrastive Learning for Debiased Scene Representation (NeurIPS 2021) ☆45 · Oct 25, 2021 · Updated 4 years ago
- Official PyTorch implementation of "Large-scale Bilingual Language-Image Contrastive Learning" (ICLRW 2022) ☆96 · Apr 13, 2022 · Updated 3 years ago
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021) ☆374 · Jul 29, 2023 · Updated 2 years ago
- Code for ALBEF: a new vision-language pre-training method ☆1,754 · Sep 20, 2022 · Updated 3 years ago
- ECCV 2022, Bootstrapped Masked Autoencoders for Vision BERT Pretraining ☆97 · Nov 2, 2022 · Updated 3 years ago