Update 2020
☆74Mar 21, 2022Updated 4 years ago
Alternatives and similar repositories for VQA_to_multimodal_survey
Users that are interested in VQA_to_multimodal_survey are comparing it to the libraries listed below.
- Codebase of 'From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model'☆45May 10, 2026Updated last month
- Shows visual grounding methods can be right for the wrong reasons! (ACL 2020)☆23Jun 26, 2020Updated 5 years ago
- A collection of papers about VQA-CP datasets and their results☆42Mar 18, 2022Updated 4 years ago
- Code for "Time-Aware Auto White Balance in Mobile Photography"☆28Jan 25, 2026Updated 4 months ago
- [Paper][ISWC 2021] Zero-shot Visual Question Answering using Knowledge Graph☆72Feb 9, 2024Updated 2 years ago
- CMIVQA☆18Jun 3, 2024Updated 2 years ago
- PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)☆27Oct 13, 2022Updated 3 years ago
- CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning☆38Mar 21, 2025Updated last year
- Code for our ACL2021 paper: "Check It Again: Progressive Visual Question Answering via Visual Entailment"☆31Nov 24, 2021Updated 4 years ago
- ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration☆57Jun 13, 2023Updated 3 years ago
- ☆22Aug 10, 2020Updated 5 years ago
- Unofficial reimplementation of Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering☆18Oct 30, 2019Updated 6 years ago
- MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering☆101Mar 30, 2023Updated 3 years ago
- Experiment task scheduling made easy.☆33Updated this week
- Statistics and Visualization of acceptance rate, main keyword of NeurIPS 2020 accepted papers☆16Oct 12, 2020Updated 5 years ago
- A PyTorch implementation of data augmentation method for visual question answering☆21May 25, 2023Updated 3 years ago
- A comparison of human attention with computational attention mechanisms☆12Jul 3, 2020Updated 5 years ago
- An awesome YAML-based CV that works with your existing Jekyll site☆11Jan 10, 2019Updated 7 years ago
- Code for COLING 2022 paper: Modeling Intra- and Inter-Modal Relations: Hierarchical Graph Contrastive Learning for Multimodal Sentiment A…☆11May 28, 2023Updated 3 years ago
- [CVPR'24 Highlight] Implementation of "Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models"☆17Sep 12, 2024Updated last year
- [ICCV 2023 Workshop] The Official Implementation of The First Prize Solution for RVOS Competition☆14Jan 1, 2024Updated 2 years ago
- Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"☆24Nov 1, 2025Updated 7 months ago
- Weakly Supervised Grounding for VQA in Vision-Language Transformers☆16May 6, 2023Updated 3 years ago
- Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"☆35Dec 5, 2022Updated 3 years ago
- ☆10Nov 29, 2024Updated last year
- [ICML 2026] The official code of Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis☆84Jun 2, 2026Updated 2 weeks ago
- SOTA work about out-of-distribution detection☆14Mar 5, 2021Updated 5 years ago
- Python3 version of UC Berkeley's CS 188 Pacman Capture the Flag project☆10Mar 14, 2024Updated 2 years ago
- The good practice in the VQA system such as pos-tag attention, structured triplet learning and triplet attention is very general and can be…☆19Jan 23, 2018Updated 8 years ago
- MLPs for Vision and Language Modeling (Coming Soon)☆27Dec 9, 2021Updated 4 years ago
- ☆12Aug 14, 2019Updated 6 years ago
- Official PyTorch implementation of the paper "DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training".☆59Aug 2, 2023Updated 2 years ago
- ☆10Nov 23, 2023Updated 2 years ago
- ☆10Oct 14, 2020Updated 5 years ago
- [CVPR 2019] Official Matlab implementation of OSD: Unsupervised image matching and object discovery as optimization.☆12Nov 4, 2021Updated 4 years ago
- Menghu automotive fault cloud diagnosis system☆13Dec 12, 2014Updated 11 years ago
- Binary Sentiment Analysis on Amazon Reviews by fine tuning pre trained XLNet☆14May 4, 2020Updated 6 years ago
- A human-annotated, fine-grained dataset for Vision-and-Language Navigation☆17Jan 20, 2022Updated 4 years ago
- DeVLBert: Learning Deconfounded Visio-Linguistic Representations☆27Nov 27, 2022Updated 3 years ago