A task-agnostic vision-language architecture as a step towards General Purpose Vision
☆92Jul 14, 2021Updated 4 years ago
Alternatives and similar repositories for gpv-1
Users that are interested in gpv-1 are comparing it to the libraries listed below.
- ☆32Mar 7, 2022Updated 4 years ago
- ☆14Dec 25, 2020Updated 5 years ago
- ☆1,047Oct 3, 2022Updated 3 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆117Sep 15, 2022Updated 3 years ago
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)☆374Jul 29, 2023Updated 2 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- Official code of *Towards Event-oriented Long Video Understanding*☆12Jul 26, 2024Updated last year
- ☆11Sep 7, 2020Updated 5 years ago
- ☆38Aug 25, 2022Updated 3 years ago
- project page for VinVL☆359Jul 26, 2023Updated 2 years ago
- Omnivore: A Single Model for Many Visual Modalities☆572Nov 12, 2022Updated 3 years ago
- Extract features and bounding boxes using the original Bottom-up Attention Faster-RCNN in a few lines of Python code☆11Sep 18, 2022Updated 3 years ago
- UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation)☆90Jun 12, 2023Updated 2 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)☆17Jan 12, 2023Updated 3 years ago
- A new framework for open-vocabulary object detection, based on maskrcnn-benchmark☆248Feb 11, 2023Updated 3 years ago
- General-purpose Visual Understanding Evaluation☆20Dec 21, 2023Updated 2 years ago
- A simple wrapper for lmdb. Support dict-like operations.☆23Apr 20, 2023Updated 2 years ago
- [ECCV2022] Rethinking Data Augmentation for Robust Visual Question Answering☆13Nov 23, 2022Updated 3 years ago
- Official code release for paper "Improving Confidence Estimates for Unfamiliar Examples" https://arxiv.org/abs/1804.03166☆12Aug 16, 2020Updated 5 years ago
- ☆35May 2, 2022Updated 3 years ago
- ☆10Jul 23, 2021Updated 4 years ago
- [ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision"☆100Apr 4, 2023Updated 2 years ago
- [ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations☆15Oct 28, 2023Updated 2 years ago
- Official repository for the General Robust Image Task (GRIT) Benchmark☆54Mar 29, 2023Updated 2 years ago
- ☆231Dec 18, 2023Updated 2 years ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023.☆56Mar 6, 2023Updated 3 years ago
- ☆43Aug 9, 2022Updated 3 years ago
- Use CLIP to represent video for Retrieval Task☆70Mar 1, 2021Updated 5 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval☆27Aug 20, 2022Updated 3 years ago
- Caffe re-implementation of dynamic network surgery.☆18Jun 15, 2018Updated 7 years ago
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm☆677Sep 19, 2022Updated 3 years ago
- The Intermediate Goal of the project is to train a GPT like architecture to learn to summarise reddit posts from human preferences, as th…☆12Jul 14, 2021Updated 4 years ago
- Shows visual grounding methods can be right for the wrong reasons! (ACL 2020)☆23Jun 26, 2020Updated 5 years ago
- ☆289Aug 14, 2025Updated 7 months ago
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph"☆54Oct 19, 2022Updated 3 years ago
- ☆87Mar 4, 2024Updated 2 years ago
- Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020☆43Apr 26, 2020Updated 5 years ago
- Pytorch implementation for our NeurIPS 2019 paper "TAB-VCR: Tags and Attributes based VCR Baselines" https://arxiv.org/abs/1910.14671☆19May 6, 2021Updated 4 years ago
- Pytorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)☆56Feb 6, 2023Updated 3 years ago