☆32Mar 7, 2022Updated 4 years ago
Alternatives and similar repositories for gpv2
Users that are interested in gpv2 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Download Web-10K data by querying Bing Image Search☆10Feb 1, 2022Updated 4 years ago
- Repository for ACL2020 paper "Refer360° A Referring Expression Recognition Dataset in 360°Images"☆14Jun 26, 2021Updated 4 years ago
- A task-agnostic vision-language architecture as a step towards General Purpose Vision☆92Jul 14, 2021Updated 4 years ago
- Knowledge Infused Decoding☆70Dec 31, 2023Updated 2 years ago
- NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790☆27Nov 21, 2022Updated 3 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- PyTorch code for Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles (DANCE)☆23Nov 29, 2022Updated 3 years ago
- ☆14May 3, 2022Updated 4 years ago
- [AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding☆59Nov 28, 2022Updated 3 years ago
- ☆35Oct 21, 2023Updated 2 years ago
- [NeurIPS 2023] Official Pytorch code for LOVM: Language-Only Vision Model Selection☆21Feb 3, 2024Updated 2 years ago
- VQA baseline with Conditional Batch Normalization☆15Apr 9, 2018Updated 8 years ago
- MMBERT: Multimodal BERT Pretraining for Improved Medical VQA☆39Mar 22, 2021Updated 5 years ago
- ☆18Jun 10, 2022Updated 3 years ago
- FaVIQ: Fact Verification from Information-seeking Questions☆43Nov 23, 2022Updated 3 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- C2AE architecture for multi label classification in pytorch.☆13Dec 7, 2022Updated 3 years ago
- [ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383☆418Oct 28, 2022Updated 3 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆117Sep 15, 2022Updated 3 years ago
- JSON Schema format for storing datasets details, documents processed contents, and documents annotations in the document understanding do…☆14Nov 5, 2024Updated last year
- ☆133Jul 5, 2023Updated 2 years ago
- ICLR 2022☆18Apr 15, 2022Updated 4 years ago
- ☆14May 31, 2022Updated 3 years ago
- Multi-span Style Extraction for Generative Reading Comprehension☆10Apr 2, 2021Updated 5 years ago
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, …☆29Sep 22, 2022Updated 3 years ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- Official repository of the paper MPMQA: Multimodal Question Answering on Product Manuals (AAAI 2023)☆19Nov 28, 2022Updated 3 years ago
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone☆131Oct 10, 2023Updated 2 years ago
- ☆45Jun 16, 2025Updated 11 months ago
- ☆19Jul 6, 2023Updated 2 years ago
- Code for the Ask4Help project☆22Nov 24, 2022Updated 3 years ago
- ☆15May 26, 2021Updated 4 years ago
- https://www.nlp.ecei.tohoku.ac.jp/projects/aio/☆16Aug 4, 2022Updated 3 years ago
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 3 years ago
- ☆65Jun 16, 2025Updated 11 months ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- A Datasette instance for searching WebVid-10M☆15Sep 30, 2022Updated 3 years ago
- A large-scale infographics dataset from Visual.ly with metadata and additional crowdsourced annotations☆16Oct 8, 2018Updated 7 years ago
- [ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models"☆13Aug 6, 2024Updated last year
- ☆11Feb 18, 2022Updated 4 years ago
- 🎹 Instruct.KR 2025 Summer Meetup: 오픈소스 LLM, vLLM으로 Production까지 🎹☆23Aug 2, 2025Updated 9 months ago
- ECCV2022 Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection☆27May 26, 2023Updated 2 years ago
- Code to reproduce experiments from the paper "Continual Pre-Training Mitigates Forgetting in Language and Vision" https://arxiv.org/abs/2…☆22Oct 23, 2023Updated 2 years ago