yonatanbitton / data_efficient_masked_language_modeling_for_vision_and_language
Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".
☆18Sep 17, 2021Updated 4 years ago
Alternatives and similar repositories for data_efficient_masked_language_modeling_for_vision_and_language
Users that are interested in data_efficient_masked_language_modeling_for_vision_and_language are comparing it to the libraries listed below
- Funny Application of Neural Head Reenactment to Naver Webtoon☆10Mar 22, 2021Updated 4 years ago
- Official repository for Fourier model that can generate periodic signals☆10Mar 10, 2022Updated 3 years ago
- A Simple Framework for CV Pre-training Model (SOCO, VirTex, BEiT)☆15Oct 18, 2021Updated 4 years ago
- ☆47Apr 29, 2024Updated last year
- Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"☆12Oct 19, 2021Updated 4 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning☆12Aug 23, 2025Updated 5 months ago
- Implementation of "Structured Multi-Hashing for Model Compression" (CVPR 2020)☆12Feb 18, 2021Updated 4 years ago
- Code for the paper "Controllable Video Captioning with an Exemplar Sentence"☆12Apr 14, 2021Updated 4 years ago
- Optimized code based on M2 for faster image captioning training☆21Nov 18, 2022Updated 3 years ago
- A general purpose web app for connecting participants to engage in realtime conversations based on generated prompts.☆20Jun 21, 2023Updated 2 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)☆17Jan 12, 2023Updated 3 years ago
- Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)☆164Aug 24, 2025Updated 5 months ago
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)☆40Feb 15, 2023Updated 2 years ago
- ☆19Nov 22, 2022Updated 3 years ago
- "Describing Textures using Natural Language" code and data, ECCV 2020 Oral.☆17Aug 6, 2020Updated 5 years ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities☆80Jan 7, 2026Updated last month
- A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal"☆84Feb 25, 2022Updated 3 years ago
- This is an official implementation of GRIT-VLP☆20Aug 8, 2022Updated 3 years ago
- PyTorch implementation of paper "Flat Metric Minimization with Applications in Generative Modeling"☆19May 14, 2019Updated 6 years ago
- Stochastic Optimization for Global Contrastive Learning without Large Mini-batches☆20Mar 31, 2023Updated 2 years ago
- ☆19Oct 3, 2023Updated 2 years ago
- Release of ImageNet-Captions☆51Jan 20, 2023Updated 3 years ago
- PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)☆56Feb 6, 2023Updated 3 years ago
- [NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers☆21May 16, 2023Updated 2 years ago
- Paper Today I Read☆27Jan 27, 2026Updated 2 weeks ago
- This is the official repo for "MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment"☆17May 27, 2019Updated 6 years ago
- [CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》☆151Jun 7, 2023Updated 2 years ago
- https://arxiv.org/abs/2209.15162☆53Jan 24, 2023Updated 3 years ago
- [CVPRW'23] The official PyTorch implementation of NamedMask☆23Jun 12, 2023Updated 2 years ago
- PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision☆46Jul 29, 2020Updated 5 years ago
- ☆55Feb 9, 2023Updated 3 years ago
- Code for "Recognizing Scenes from Novel Viewpoints"☆29Sep 16, 2022Updated 3 years ago
- Official Pytorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024)☆60May 26, 2024Updated last year
- Code for "The Box Size Confidence Bias Harms Your Object Detector" (https://arxiv.org/abs/2112.01901)☆27Mar 27, 2023Updated 2 years ago
- Code for Greedy Gradient Ensemble for Visual Question Answering (ICCV 2021, Oral)☆27Mar 28, 2022Updated 3 years ago
- Code for T-MARS data filtering☆35Aug 23, 2023Updated 2 years ago
- Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers☆26Apr 12, 2022Updated 3 years ago
- Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature.☆34Feb 13, 2025Updated last year
- PyTorch implementation of the Region Mutual Information Loss for Semantic Segmentation.☆26Oct 26, 2023Updated 2 years ago