gaopengcuhk / Stable-Pix2Seq
A full-fledged version of Pix2Seq
☆235 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for Stable-Pix2Seq
- Replication of Pix2Seq with Pretrained Model ☆60 · Updated 3 years ago
- Unofficial implementation of Pix2SEQ ☆165 · Updated 3 years ago
- [Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022) ☆208 · Updated 2 years ago
- [NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning ☆173 · Updated 2 years ago
- A new framework for open-vocabulary object detection, based on maskrcnn-benchmark ☆226 · Updated last year
- ☆168 · Updated 3 years ago
- PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022 ☆160 · Updated 2 years ago
- ☆123 · Updated 2 years ago
- [ICCV 2023] You Only Look at One Partial Sequence ☆336 · Updated last year
- ☆173 · Updated 2 years ago
- SeqTR: A Simple yet Universal Network for Visual Grounding ☆130 · Updated last week
- [CVPR 2021] Instance Localization for Self-supervised Detection Pretraining ☆144 · Updated 3 years ago
- [CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning ☆284 · Updated 2 years ago
- [ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843) ☆182 · Updated 7 months ago
- [CVPR 2023] implementation of Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information. ☆91 · Updated last year
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆84 · Updated 2 years ago
- ☆77 · Updated 2 years ago
- PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529 ☆158 · Updated 2 years ago
- [ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design ☆192 · Updated 11 months ago
- Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality" ☆238 · Updated last year
- [CVPR 2022 Oral] AdaMixer: A Fast-Converging Query-Based Object Detector ☆236 · Updated 2 years ago
- ☆267 · Updated last year
- This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org… ☆367 · Updated last year
- [CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection ☆174 · Updated last year
- MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning ☆128 · Updated last year
- Open-vocabulary Semantic Segmentation ☆166 · Updated last year
- PyTorch implementation of the paper "MILAN: Masked Image Pretraining on Language Assisted Representation" https://arxiv.org/pdf/2208.0604… ☆79 · Updated 2 years ago
- ☆183 · Updated last year
- Dense Distinct Query for End-to-End Object Detection (CVPR2023) ☆244 · Updated last year
- Exploiting unlabeled data with vision and language models for object detection, ECCV 2022 ☆86 · Updated 9 months ago