Abhiram4572 / mi_bart
☆12 Updated 8 months ago
Alternatives and similar repositories for mi_bart
Users who are interested in mi_bart are comparing it to the repositories listed below
- ☆92 Updated 2 years ago
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021) ☆371 Updated last year
- image scene graph generation benchmark ☆396 Updated 2 years ago
- [NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models ☆157 Updated 7 months ago
- Recent Advances in Vision and Language Pre-training (VLP) ☆292 Updated 2 years ago
- Code accompanying paper "Fine-Grained Visual Entailment" [ECCV 2022]. ☆10 Updated 2 years ago
- project page for VinVL ☆356 Updated last year
- Moment Detection in Long Tutorial Videos ☆20 Updated last year
- MERLOT: Multimodal Neural Script Knowledge Models ☆224 Updated 3 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners ☆115 Updated 2 years ago
- A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic representations). ☆584 Updated last year
- [ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383 ☆413 Updated 2 years ago
- ☆18 Updated last year
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning" ☆135 Updated 2 years ago
- This is the code of ECCV 2022 (Oral) paper "Fine-Grained Scene Graph Generation with Data Transfer". ☆102 Updated 2 years ago
- [CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias ☆123 Updated 3 years ago
- Reliably download millions of images efficiently ☆116 Updated 4 years ago
- PyTorch bottom-up attention with Detectron2 ☆233 Updated 3 years ago
- ☆40 Updated 2 years ago
- Official repository for the A-OKVQA dataset ☆95 Updated last year
- A PyTorch implementation for the paper: Fully Convolutional Scene Graph Generation, CVPR 2021 ☆29 Updated 2 years ago
- An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral) ☆85 Updated 3 years ago
- ☆38 Updated 2 years ago
- Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Pe… ☆126 Updated last year
- A collection of papers about VQA-CP datasets and their results ☆38 Updated 3 years ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ☆161 Updated 11 months ago
- ☆24 Updated 4 years ago
- Grid features pre-training code for visual question answering ☆269 Updated 3 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts ☆188 Updated 2 months ago
- MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions ☆166 Updated last year