gaopengcuhk / Stable-Pix2Seq
A full-fledged version of Pix2Seq
☆238 · Updated 3 years ago
Alternatives and similar repositories for Stable-Pix2Seq:
Users who are interested in Stable-Pix2Seq are comparing it to the repositories listed below.
- Replication of Pix2Seq with Pretrained Model ☆60 · Updated 3 years ago
- Unofficial implementation of Pix2SEQ ☆165 · Updated 3 years ago
- [Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022) ☆221 · Updated 2 years ago
- [NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning ☆176 · Updated 3 years ago
- [ICCV 2023] You Only Look at One Partial Sequence ☆340 · Updated last year
- ☆176 · Updated 2 years ago
- PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022 ☆164 · Updated 2 years ago
- A new framework for open-vocabulary object detection, based on maskrcnn-benchmark ☆237 · Updated 2 years ago
- ☆170 · Updated 3 years ago
- ☆272 · Updated 2 years ago
- SeqTR: A Simple yet Universal Network for Visual Grounding ☆132 · Updated 4 months ago
- Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality" ☆243 · Updated 2 years ago
- This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org… ☆370 · Updated last year
- ☆80 · Updated 2 years ago
- [CVPR 2023] implementation of Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information. ☆90 · Updated last year
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆84 · Updated 2 years ago
- This is an implementation of Deformable-DETR ☆47 · Updated 4 years ago
- [ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843) ☆184 · Updated 11 months ago
- Open-vocabulary Semantic Segmentation ☆169 · Updated last year
- PyTorch implementation of the paper "MILAN: Masked Image Pretraining on Language Assisted Representation" https://arxiv.org/pdf/2208.0604… ☆82 · Updated 2 years ago
- [CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning ☆286 · Updated 2 years ago
- [CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning ☆208 · Updated 2 years ago
- ☆249 · Updated 2 years ago
- [CVPR 2021] Instance Localization for Self-supervised Detection Pretraining ☆144 · Updated 3 years ago
- ☆191 · Updated 2 years ago
- [CVPR 2022 Oral] AdaMixer: A Fast-Converging Query-Based Object Detector ☆234 · Updated 2 years ago
- "SOLQ: Segmenting Objects by Learning Queries", SOLQ is an end-to-end instance segmentation framework with Transformer. ☆199 · Updated 2 years ago
- Code for the paper "Visual Recognition by Request". ☆44 · Updated 2 years ago
- Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021. ☆151 · Updated 3 years ago
- An official implementation of the Anchor DETR. ☆347 · Updated 2 years ago