[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering
☆22Sep 21, 2024Updated last year
Alternatives and similar repositories for Multi-Agent-VQA
Users that are interested in Multi-Agent-VQA are comparing it to the libraries listed below.
- [WACV 2025] Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge☆40Oct 29, 2024Updated last year
- Export VMamba to ONNX. VMamba: Visual State Space Models; code is based on VMamba: https://github.com/MzeroMiko/VMamba ☆22May 13, 2025Updated 11 months ago
- Code and scripts for "Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning"☆20Mar 23, 2022Updated 4 years ago
- ☆14May 6, 2024Updated 2 years ago
- ☆17Dec 13, 2023Updated 2 years ago
- ☆13Mar 14, 2025Updated last year
- [CVPR'2022 Oral] The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation☆33Oct 19, 2023Updated 2 years ago
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23)☆45Mar 28, 2024Updated 2 years ago
- This repository contains code for training an RNN that can perform overtaking in the F1TENTH simulator, as well as a dataset I have created☆15Nov 7, 2023Updated 2 years ago
- Weakly-Supervised Cell Tracking via Backward-and-Forward Propagation, in ECCV 2020☆11Aug 4, 2020Updated 5 years ago
- [NeurIPS 2023] LMC: Large Model Collaboration with Cross-assessment for Training-Free Open-Set Object Recognition☆19May 26, 2024Updated last year
- A real-time video understanding foundation model built on Llama-3.2-Vision, featuring comprehensively extended video processing and multi…☆138Apr 13, 2026Updated 3 weeks ago
- Code for Greedy Gradient Ensemble for Visual Question Answering (ICCV 2021, Oral)☆27Mar 28, 2022Updated 4 years ago
- CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation☆36Jan 29, 2025Updated last year
- Estimate dataset difficulty and detect label mistakes using reconstruction error ratios!☆28Jan 10, 2025Updated last year
- Counterfactual Reasoning VQA Dataset☆28Nov 23, 2023Updated 2 years ago
- Good practices in VQA systems, such as pos-tag attention, structured triplet learning and triplet attention, are very general and can be…☆19Jan 23, 2018Updated 8 years ago
- This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World"…☆49Mar 12, 2024Updated 2 years ago
- Codebase for AAAI 2024 conference paper Visual Chain-of-Thought Prompting for Knowledge-based Visual Reasoning☆39Mar 12, 2025Updated last year
- End-to-End Learning of Behavioural Inputs for Autonomous Driving in Dense Traffic☆22Oct 26, 2023Updated 2 years ago
- Implementation of ResiDualGAN and DRDG☆14Apr 15, 2024Updated 2 years ago
- Official implementation of TagAlign☆37Dec 11, 2024Updated last year
- Python scripts for tracking cells in fluorescent microscopy.☆11Dec 10, 2017Updated 8 years ago
- Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22)☆49Apr 22, 2026Updated 2 weeks ago
- Official implementation of "MedITok: A Unified Tokenizer for Medical Image Synthesis and Interpretation"☆28Apr 3, 2026Updated last month
- ☆11Dec 20, 2024Updated last year
- A helper that allows you to manage your deep learning model's parameters in a convenient way.☆11Nov 25, 2020Updated 5 years ago
- PyTorch implementation of the paper "SuperLoss: A Generic Loss for Robust Curriculum Learning" in NIPS 2020.☆29Jan 26, 2021Updated 5 years ago
- [CVPR 2025] COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training☆41Mar 27, 2025Updated last year
- ☆10Aug 22, 2020Updated 5 years ago
- Explaining Autonomous Driving Actions with Visual Question Answering (IEEE ITSC-2023)☆19Feb 15, 2024Updated 2 years ago
- ☆20Oct 22, 2024Updated last year
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering☆18Oct 31, 2024Updated last year
- This code was submitted to Cell Tracking Challenge, ISBI 2020.☆14May 19, 2021Updated 4 years ago
- Code for paper "Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation"☆40Apr 8, 2026Updated last month
- Code for ACL22 short Paper "Hierarchical Curriculum Learning for AMR Parsing"☆13Jun 1, 2022Updated 3 years ago
- ☆12Mar 8, 2021Updated 5 years ago
- A Deep Learning-Based Smartphone App for Real-Time Detection of Retinal Abnormalities in Fundus Images☆11Mar 11, 2020Updated 6 years ago
- Visualization of the PCA as shown in Figure 1.☆45Jan 14, 2024Updated 2 years ago