yonatanbitton/data_efficient_masked_language_modeling_for_vision_and_language

yonatanbitton / data_efficient_masked_language_modeling_for_vision_and_language

Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".

☆18

Alternatives and similar repositories for data_efficient_masked_language_modeling_for_vision_and_language

Users interested in data_efficient_masked_language_modeling_for_vision_and_language are comparing it to the repositories listed below.

luo3300612 / Transformer-Captioning
Optimized code based on M2 for faster image captioning training
☆21 · Nov 18, 2022 · Updated 3 years ago
LeeYN-43 / Clover
Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR 2023)
☆39 · Feb 15, 2023 · Updated 3 years ago
kakaobrain / noc
☆47 · Apr 29, 2024 · Updated 2 years ago
antoyang / just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
☆127 · Sep 29, 2023 · Updated 2 years ago
LooperXX / ManagerTower
Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
☆12 · Aug 23, 2025 · Updated 11 months ago
jiyounglee-0523 / FourierDecoder
Official repository for Fourier model that can generate periodic signals
☆10 · Mar 10, 2022 · Updated 4 years ago
princetonvisualai / SPICE-U
☆11 · Sep 7, 2020 · Updated 5 years ago
princetonvisualai / imagecaptioning-bias
Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"
☆12 · Mar 26, 2026 · Updated 3 months ago
zinengtang / VidLanKD
PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)
☆56 · Feb 6, 2023 · Updated 3 years ago
UCSB-AI / Mitigate-Gender-Bias-in-Image-Search
Code for the EMNLP 2021 Oral paper "Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search" https://arx…
☆12 · Feb 6, 2023 · Updated 3 years ago
fenglinliu98 / MIA
Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)
☆65 · Oct 19, 2020 · Updated 5 years ago
multimodal / multimodal
A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal"
☆83 · Feb 25, 2022 · Updated 4 years ago
feifeibear / PyTorchMemTracer
Depict GPU memory footprint during DNN training of PyTorch
☆11 · Nov 17, 2022 · Updated 3 years ago
ShiYaya / emscore
Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"
☆26 · Oct 20, 2022 · Updated 3 years ago
woodfrog / vse_infty
Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)
☆165 · Aug 24, 2025 · Updated 11 months ago
yxuansu / TaCL
[NAACL'22] TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning
☆94 · Jun 8, 2022 · Updated 4 years ago
facebookresearch / conversational-voice-capture
A general-purpose web app for connecting participants to engage in real-time conversations based on generated prompts.
☆21 · Jun 21, 2023 · Updated 3 years ago
phelps-matthew / FeatherMap
Implementation of "Structured Multi-Hashing for Model Compression" (CVPR 2020)
☆12 · Feb 18, 2021 · Updated 5 years ago
Sense-GVT / BigPretrain
A Simple Framework for CV Pre-training Models (SOCO, VirTex, BEiT)
☆15 · Oct 18, 2021 · Updated 4 years ago
yytzsy / SMCG
Code for the paper "Controllable Video Captioning with an Exemplar Sentence"
☆12 · Apr 14, 2021 · Updated 5 years ago
jaeseokbyun / GRIT-VLP
This is an official implementation of GRIT-VLP
☆20 · Aug 8, 2022 · Updated 3 years ago
H-TayyarMadabushi / SemEval_2022_Task2-idiomaticity
Data and preprocessing scripts for SemEval 2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding
☆16 · Feb 3, 2022 · Updated 4 years ago
kangyeolk / Naver-Webtoon-Talking-Head
Funny Application of Neural Head Reenactment to Naver Webtoon
☆10 · Mar 22, 2021 · Updated 5 years ago
VMatrixTeam / open-matrix
Open-source version of the Matrix online programming learning system
☆13 · Dec 15, 2016 · Updated 9 years ago
rentainhe / TRAR-VQA
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
☆68 · Oct 11, 2021 · Updated 4 years ago
Optimization-AI / SogCLR
Stochastic Optimization for Global Contrastive Learning without Large Mini-batches
☆20 · Mar 31, 2023 · Updated 3 years ago
om-ai-lab / VL-CheckList
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
☆138 · Apr 10, 2026 · Updated 3 months ago
VITA-Group / AsViT
[ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa…
☆76 · Feb 21, 2022 · Updated 4 years ago
ChenyuHeidiZhang / VL-commonsense
☆14 · May 23, 2022 · Updated 4 years ago
juletx / spatial-reasoning
Grounding Language Models for Compositional and Spatial Reasoning
☆18 · Oct 26, 2022 · Updated 3 years ago
sail-sg / ptp
[CVPR 2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
☆150 · Jun 7, 2023 · Updated 3 years ago
TerminologyHub / termhub-in-5-minutes
Developer project for getting basic API integrations working in under 5 minutes
☆11 · May 22, 2026 · Updated 2 months ago
mhh0318 / UniD3
☆55 · Feb 9, 2023 · Updated 3 years ago
snu-mllab / Co-Mixup
Official PyTorch implementation of "Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity" (ICLR'21 Oral)
☆107 · Dec 2, 2021 · Updated 4 years ago
antoine77340 / MIL-NCE_HowTo100M
PyTorch GPU distributed training code for MIL-NCE HowTo100M
☆221 · Jul 5, 2022 · Updated 4 years ago
zinengtang / DeCEMBERT
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 · Jan 12, 2023 · Updated 3 years ago
facebookresearch / reliable_vqa
Implementation for the paper "Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly" (ECCV 2022: https://arxiv.org/abs…
☆41 · May 19, 2023 · Updated 3 years ago
naver-ai / seit
[ECCV 2024][ICCV 2023] Official PyTorch implementation of SeiT++ and SeiT
☆56 · Aug 12, 2024 · Updated last year
facebookresearch / data2vec_vision
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆81 · Jan 7, 2026 · Updated 6 months ago