gaopengcuhk / Stable-Pix2Seq
A full-fledged version of Pix2Seq
☆238 Updated 3 years ago
Alternatives and similar repositories for Stable-Pix2Seq:
Users who are interested in Stable-Pix2Seq compare it to the repositories listed below.
- Unofficial implementation of Pix2SEQ ☆165 Updated 3 years ago
- Replication of Pix2Seq with Pretrained Model ☆60 Updated 3 years ago
- [NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning ☆176 Updated 3 years ago
- [ICCV 2023] You Only Look at One Partial Sequence ☆340 Updated last year
- PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022 ☆166 Updated 2 years ago
- ☆275 Updated 2 years ago
- ☆180 Updated 2 years ago
- [Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022) ☆226 Updated 2 years ago
- Code of ICCV paper: https://arxiv.org/abs/2011.10881 ☆76 Updated 2 years ago
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆84 Updated 2 years ago
- ☆171 Updated 3 years ago
- [CVPR 2023] Implementation of Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information. ☆91 Updated last year
- [ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843) ☆186 Updated last year
- "SOLQ: Segmenting Objects by Learning Queries", SOLQ is an end-to-end instance segmentation framework with Transformer. ☆198 Updated 3 years ago
- [CVPR 2022 Oral] AdaMixer: A Fast-Converging Query-Based Object Detector ☆233 Updated 2 years ago
- SeqTR: A Simple yet Universal Network for Visual Grounding ☆135 Updated 6 months ago
- [CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning ☆208 Updated 2 years ago
- This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org… ☆373 Updated last year
- PyTorch implementation of the paper "MILAN: Masked Image Pretraining on Language Assisted Representation" https://arxiv.org/pdf/2208.0604… ☆82 Updated 2 years ago
- ☆257 Updated 2 years ago
- A new framework for open-vocabulary object detection, based on maskrcnn-benchmark ☆238 Updated 2 years ago
- [CVPR 2021] Instance Localization for Self-supervised Detection Pretraining ☆145 Updated 3 years ago
- This is an implementation of Deformable-DETR ☆49 Updated 4 years ago
- [ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design ☆197 Updated last year
- Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality" ☆242 Updated 2 years ago
- Open-vocabulary Semantic Segmentation ☆173 Updated 2 years ago
- Dense Distinct Query for End-to-End Object Detection (CVPR2023) ☆253 Updated last year
- [CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning ☆286 Updated 2 years ago
- ☆81 Updated 2 years ago
- [AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding ☆56 Updated 2 years ago