furkanbiten / object-bias

Let there be clock in the beach - WACV 2022

☆15

Alternatives and similar repositories for object-bias

Users who are interested in object-bias are comparing it to the repositories listed below.

sushizixin / CLIP4IDC
CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022)
☆36 Updated 3 years ago
zinengtang / DeCEMBERT
Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 Updated 2 years ago
zmykevin / UVLP
CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
☆22 Updated 3 years ago
showlab / Region_Learner
The Pytorch implementation for "Video-Text Pre-training with Learned Regions"
☆42 Updated 3 years ago
mad-red / VSR-guided-CIC
Human-like Controllable Image Captioning with Verb-specific Semantic Roles.
☆36 Updated 3 years ago
CuthbertCai / Ask-Confirm
Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021)
☆20 Updated 3 years ago
baaaad / ECE
[ECCV'22 Poster] Explicit Image Caption Editing
☆22 Updated 3 years ago
yashkant / sam-textvqa
Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.
☆65 Updated 4 years ago
MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆56 Updated 2 years ago
visinf / cos-cvae
Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)
☆37 Updated 3 years ago
SpencerWhitehead / novelvqa
☆27 Updated 4 years ago
guanghuixu / CRN_tvqa
☆15 Updated 5 years ago
google-deepmind / svo_probes
The SVO-Probes Dataset for Verb Understanding
☆31 Updated 3 years ago
thunlp / VisualDS
☆25 Updated 3 years ago
jwehrmann / lavse
Language-Agnostic Visual-Semantic Embeddings (ICCV'19)
☆22 Updated 6 years ago
adobe-research / vaw_dataset
This repository provides data for the VAW dataset as described in the CVPR 2021 paper titled "Learning to Predict Visual Attributes in th…
☆68 Updated 3 years ago
guanghuixu / AnchorCaptioner
☆31 Updated 4 years ago
Cuberick-Orion / CIRPLANT
Official implementation of the Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) | ICCV 2021 - Image Retrieval o…
☆40 Updated last year
chenxy99 / SD-FSIC
Official code for the paper "Self-Distillation for Few-Shot Image Captioning"
☆15 Updated 4 years ago
YuanEZhou / Grounded-Image-Captioning
☆64 Updated 3 years ago
guilk / VLC
Research code for "Training Vision-Language Transformers from Captions Alone"
☆34 Updated 3 years ago
ShiYaya / emscore
Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"
☆26 Updated 3 years ago
wzk1015 / CNMT
[AAAI 2021] Confidence-aware Non-repetitive Multimodal Transformers for TextCaps
☆24 Updated 2 years ago
easonnie / mlp-vil
MLPs for Vision and Language Modeling (Coming Soon)
☆27 Updated 3 years ago
microsoft / UniTAB
UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation)
☆89 Updated 2 years ago
princetonvisualai / pointingqa
Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"
☆19 Updated 3 years ago
szzexpoi / POEM
Official Implementation for CVPR 2023 paper "Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasonin…
☆10 Updated last year
Kien085 / SG2Caps
☆23 Updated 4 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆30 Updated 3 years ago
AndresPMD / StacMR
Scene Text Aware Cross Modal Retrieval (StacMR)
☆24 Updated 4 years ago