Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022)
☆85Nov 2, 2022Updated 3 years ago
Alternatives and similar repositories for Obj2Seq
Users interested in Obj2Seq are comparing it to the repositories listed below
- [CVPR2023] This is an official implementation of paper "DETRs with Hybrid Matching".☆14Sep 1, 2022Updated 3 years ago
- Unofficial implementation of "Pix2seq: A Language Modeling Framework for Object Detection" on mmdetection☆33Apr 18, 2022Updated 3 years ago
- Official Implementation of DE-CondDETR and DELA-CondDETR in "Towards Data-Efficient Detection Transformers"☆45Aug 25, 2022Updated 3 years ago
- Paper List for In-context Learning 🌷☆20Jan 3, 2023Updated 3 years ago
- Replication of Pix2Seq with Pretrained Model☆59Nov 6, 2021Updated 4 years ago
- [AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding☆57Nov 28, 2022Updated 3 years ago
- [NeurIPS 2022] The official implementation of "Learning to Discover and Detect Objects".☆111Jun 13, 2023Updated 2 years ago
- Official Implementation of "FP-DETR: Detection Transformer Advanced by Fully Pre-training"☆63Mar 30, 2022Updated 3 years ago
- ☆41Sep 21, 2023Updated 2 years ago
- [Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022)☆237Aug 3, 2022Updated 3 years ago
- (CVPR2023) Dense Distinct Query for End-to-End Object Detection☆264May 24, 2023Updated 2 years ago
- Unifying Visual Perception by Dispersible Points Learning (ECCV 2022)☆52Aug 19, 2022Updated 3 years ago
- Detection Transformers with Assignment☆265Sep 16, 2023Updated 2 years ago
- Open-source code for Generic Grouping Network (GGN, CVPR 2022)☆114Jan 28, 2026Updated last month
- [CVPR2023] This is an official implementation of paper "DETRs with Hybrid Matching".☆276Apr 14, 2023Updated 2 years ago
- code release of research paper "Exploring Long-Sequence Masked Autoencoders"☆100Oct 14, 2022Updated 3 years ago
- ☆19Dec 6, 2023Updated 2 years ago
- UniTAB: Unifying Text and Box Outputs for Grounded VL Modeling, ECCV 2022 (Oral Presentation)☆89Jun 12, 2023Updated 2 years ago
- This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org…☆400May 22, 2023Updated 2 years ago
- SeqTR: A Simple yet Universal Network for Visual Grounding☆144Oct 30, 2024Updated last year
- ☆278Dec 4, 2024Updated last year
- [ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design☆226Nov 14, 2023Updated 2 years ago
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆16Jun 20, 2023Updated 2 years ago
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation☆37Sep 12, 2023Updated 2 years ago
- [ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation☆103May 26, 2023Updated 2 years ago
- [Preprint 2022] “Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?” by Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zh…☆63Jan 18, 2023Updated 3 years ago
- PyTorch Implementation of Sparse DETR☆176Jan 3, 2024Updated 2 years ago
- A self-supervised learning approach based on extremely large masking☆31Dec 19, 2022Updated 3 years ago
- Official Implementation of DE-DETR and DELA-DETR in "Towards Data-Efficient Detection Transformers"☆77Mar 10, 2024Updated last year
- This repo contains documentation and code needed to use PACO dataset: data loaders and training and evaluation scripts for objects, parts…☆291Feb 12, 2024Updated 2 years ago
- ☆318Oct 26, 2022Updated 3 years ago
- Next-generation Video instance recognition framework on top of Detectron2 which supports InstMove (CVPR 2023), SeqFormer(ECCV Oral), and…☆618Feb 21, 2024Updated 2 years ago
- [ICCV 2023] You Only Look at One Partial Sequence☆343Oct 21, 2023Updated 2 years ago
- ISTR: End-to-End Instance Segmentation with Transformers (https://arxiv.org/abs/2105.00637)☆209Apr 18, 2024Updated last year
- [CVPR-2022 (oral)]-Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation☆156Aug 19, 2023Updated 2 years ago
- Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"☆245Dec 3, 2022Updated 3 years ago
- BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training☆400Oct 23, 2024Updated last year
- [CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958)☆44Sep 5, 2023Updated 2 years ago
- [NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"☆379Sep 16, 2022Updated 3 years ago