allenai / gpv-1
A task-agnostic vision-language architecture as a step towards General Purpose Vision
☆92 · Jul 14, 2021 · Updated 4 years ago
Alternatives and similar repositories for gpv-1
Users who are interested in gpv-1 are comparing it to the libraries listed below.
- ☆32 · Mar 7, 2022 · Updated 3 years ago
- ☆14 · Dec 25, 2020 · Updated 5 years ago
- ☆11 · Sep 7, 2020 · Updated 5 years ago
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021) ☆374 · Jul 29, 2023 · Updated 2 years ago
- project page for VinVL ☆359 · Jul 26, 2023 · Updated 2 years ago
- ☆1,047 · Oct 3, 2022 · Updated 3 years ago
- Referring expression comprehension on ReferIt(RefClef) ☆10 · Nov 28, 2016 · Updated 9 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022) ☆209 · Dec 18, 2022 · Updated 3 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners ☆116 · Sep 15, 2022 · Updated 3 years ago
- Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020 ☆43 · Apr 26, 2020 · Updated 5 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021) ☆17 · Jan 12, 2023 · Updated 3 years ago
- Omnivore: A Single Model for Many Visual Modalities ☆571 · Nov 12, 2022 · Updated 3 years ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆56 · Mar 6, 2023 · Updated 2 years ago
- ☆34 · May 2, 2022 · Updated 3 years ago
- CLIP-It! Language-Guided Video Summarization ☆75 · Jun 21, 2021 · Updated 4 years ago
- Caffe re-implementation of dynamic network surgery. ☆18 · Jun 15, 2018 · Updated 7 years ago
- UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation) ☆89 · Jun 12, 2023 · Updated 2 years ago
- ☆38 · Aug 25, 2022 · Updated 3 years ago
- A new framework for open-vocabulary object detection, based on maskrcnn-benchmark ☆248 · Feb 11, 2023 · Updated 3 years ago
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm ☆674 · Sep 19, 2022 · Updated 3 years ago
- Official code release for paper "Improving Confidence Estimates for Unfamiliar Examples" https://arxiv.org/abs/1804.03166 ☆12 · Aug 16, 2020 · Updated 5 years ago
- Measure the diversity of image descriptions, repository for our COLING 2018 paper. ☆13 · Dec 29, 2019 · Updated 6 years ago
- Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch ☆12 · Jan 16, 2022 · Updated 4 years ago
- Use CLIP to represent video for Retrieval Task ☆70 · Mar 1, 2021 · Updated 4 years ago
- ☆43 · Aug 9, 2022 · Updated 3 years ago
- [ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision" ☆101 · Apr 4, 2023 · Updated 2 years ago
- ☆231 · Dec 18, 2023 · Updated 2 years ago
- Official code of *Towards Event-oriented Long Video Understanding* ☆12 · Jul 26, 2024 · Updated last year
- Official Pytorch Implementation for the paper 'SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients' ☆17 · Jan 12, 2022 · Updated 4 years ago
- [ECCV2022] Rethinking Data Augmentation for Robust Visual Question Answering ☆13 · Nov 23, 2022 · Updated 3 years ago
- A Datasette instance for searching WebVid-10M ☆15 · Sep 30, 2022 · Updated 3 years ago
- In-N-Out: Towards Good Initialization for Inpainting and Outpainting (BMVC 2021) ☆12 · Dec 15, 2021 · Updated 4 years ago
- Pytorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021) ☆56 · Feb 6, 2023 · Updated 3 years ago
- ☆47 · Apr 12, 2019 · Updated 6 years ago
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph" ☆54 · Oct 19, 2022 · Updated 3 years ago
- Code release for ICCV 2021 paper "Anticipative Video Transformer" ☆154 · Feb 11, 2022 · Updated 4 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval ☆27 · Aug 20, 2022 · Updated 3 years ago
- Pre-trained V+L Data Preparation ☆46 · Jun 2, 2020 · Updated 5 years ago
- Code release for SLIP Self-supervision meets Language-Image Pre-training ☆787 · Feb 9, 2023 · Updated 3 years ago