allenai/gpv-1


allenai / gpv-1

A task-agnostic vision-language architecture as a step towards General Purpose Vision

☆92

Alternatives and similar repositories for gpv-1

Users who are interested in gpv-1 are comparing it to the repositories listed below.


allenai / gpv2
☆32 · Mar 7, 2022 · Updated 4 years ago
AlenUbuntu / Awesome-Vision-and-Language-PreTrain-Papers
☆14 · Dec 25, 2020 · Updated 5 years ago
ashkamath / mdetr
☆1,051 · Oct 3, 2022 · Updated 3 years ago
MikeWangWZHL / VidIL
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
☆117 · Sep 15, 2022 · Updated 3 years ago
j-min / VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
☆372 · Jul 29, 2023 · Updated 2 years ago
ylsung / VL_adapter
PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022)
☆212 · Dec 18, 2022 · Updated 3 years ago
ruotianluo / refexp-comprehension
Referring expression comprehension on ReferIt (RefClef)
☆10 · Nov 28, 2016 · Updated 9 years ago
RUCAIBox / Event-Bench
Official code of *Towards Event-oriented Long Video Understanding*
☆12 · Jul 26, 2024 · Updated 2 years ago
princetonvisualai / SPICE-U
☆11 · Sep 7, 2020 · Updated 5 years ago
mengqiDyangge / HierKD
☆39 · Aug 25, 2022 · Updated 3 years ago
pzzhang / VinVL
Project page for VinVL
☆360 · Jul 26, 2023 · Updated 3 years ago
facebookresearch / omnivore
Omnivore: A Single Model for Many Visual Modalities
☆573 · Nov 12, 2022 · Updated 3 years ago
michelecafagna26 / faster-rcnn-bottom-up-py
Extract features and bounding boxes using the original Bottom-up Attention Faster-RCNN in a few lines of Python code
☆11 · Sep 18, 2022 · Updated 3 years ago
microsoft / UniTAB
UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation)
☆90 · Jun 12, 2023 · Updated 3 years ago
zinengtang / DeCEMBERT
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 · Jan 12, 2023 · Updated 3 years ago
alirezazareian / ovr-cnn
A new framework for open-vocabulary object detection, based on maskrcnn-benchmark
☆249 · Feb 11, 2023 · Updated 3 years ago
wllmzhu / G-VUE
General-purpose Visual Understanding Evaluation
☆20 · Dec 21, 2023 · Updated 2 years ago
ruotianluo / lmdbdict
A simple wrapper for lmdb. Supports dict-like operations.
☆23 · Apr 20, 2023 · Updated 3 years ago
ItemZheng / KDDAug
[ECCV 2022] Rethinking Data Augmentation for Robust Visual Question Answering
☆13 · Nov 23, 2022 · Updated 3 years ago
achalddave / large-vocab-devil
☆35 · May 2, 2022 · Updated 4 years ago
YiwuZhong / SGG_from_NLS
[ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision"
☆100 · Apr 4, 2023 · Updated 3 years ago
maryamziaa / ConceptBERT
☆10 · Jul 23, 2021 · Updated 5 years ago
ys-zong / FoolyourVLLMs
[ICML 2024] Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
☆15 · Oct 28, 2023 · Updated 2 years ago
allenai / unified-io-inference
☆231 · Dec 18, 2023 · Updated 2 years ago
Victorwz / VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
☆56 · Mar 6, 2023 · Updated 3 years ago
allenai / grit_official
Official repository for the General Robust Image Task (GRIT) Benchmark
☆56 · Mar 29, 2023 · Updated 3 years ago
Deferf / CLIP_Video_Representation
Use CLIP to represent video for Retrieval Task
☆71 · Mar 1, 2021 · Updated 5 years ago
princeton-vl / SOLID
☆43 · Aug 9, 2022 · Updated 3 years ago
jayleicn / mTVRetrieval
[ACL 2021] mTVR: Multilingual Video Moment Retrieval
☆27 · Aug 20, 2022 · Updated 3 years ago
HolmesShuan / Dynamic-Network-Surgery-Caffe-Reimplementation
Caffe re-implementation of dynamic network surgery.
☆18 · Jun 15, 2018 · Updated 8 years ago
Sense-GVT / DeCLIP
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
☆677 · Sep 19, 2022 · Updated 3 years ago
EleutherAGI / summarisation
The Intermediate Goal of the project is to train a GPT like architecture to learn to summarise reddit posts from human preferences, as th…
☆12 · Jul 14, 2021 · Updated 5 years ago
erobic / negative_analysis_of_grounding
Shows visual grounding methods can be right for the wrong reasons! (ACL 2020)
☆23 · Jun 26, 2020 · Updated 6 years ago
fundamentalvision / Uni-Perceiver
☆291 · Aug 14, 2025 · Updated 11 months ago
xrenaa / Retriever
[ICLR 2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph"
☆54 · Oct 19, 2022 · Updated 3 years ago
chaoyuaw / lvu
☆87 · Mar 4, 2024 · Updated 2 years ago
gsig / visual-grounding
Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020
☆43 · Apr 26, 2020 · Updated 6 years ago
e-bug / iglue
[ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages"
☆49 · Dec 7, 2022 · Updated 3 years ago
MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆57 · Jun 13, 2023 · Updated 3 years ago