JJJYmmm / Pix2SeqV2-Pytorch
Simple implementation of Pix2SeqV2 (multi-task)
☆23Updated 7 months ago
Alternatives and similar repositories for Pix2SeqV2-Pytorch
Users that are interested in Pix2SeqV2-Pytorch are comparing it to the libraries listed below
- DVIS: Decoupled Video Instance Segmentation Framework☆152Updated last year
- ☆120Updated last year
- Recognize Any Regions☆122Updated 7 months ago
- [ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting☆110Updated last year
- (NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection☆117Updated last year
- [CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection☆177Updated 3 months ago
- ☆106Updated 2 years ago
- [AAAI 2025] AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video…☆85Updated 6 months ago
- ☆143Updated last year
- ICCV'2023 | CTVIS: Consistent Training for Online Video Instance Segmentation☆78Updated last year
- ☆70Updated 2 months ago
- [AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding☆56Updated 2 years ago
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution☆51Updated 4 months ago
- Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".☆158Updated 7 months ago
- 「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation☆81Updated last month
- [ICCV 2023] Official implementation of the paper "Detection Transformer with Stable Matching"☆229Updated last year
- [ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction☆190Updated last year
- [ECCV 2024] Official implementation of "LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction"☆80Updated 3 months ago
- OvarNet: official implementation of the paper "OvarNet: Towards Open-vocabulary Object Attribute Recognition"☆105Updated 2 years ago
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024☆81Updated last year
- [TCSVT] state-of-the-art open vocabulary detector on COCO/LVIS/V3Det☆30Updated last month
- Official Implementation for CVPR 2024 paper: CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor☆108Updated last year
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs☆91Updated 6 months ago
- ☆86Updated last year
- [ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design☆198Updated last year
- [CVPR 2023] Official implementation of "SAP-DETR: Bridging the Gap between Salient Points and Queries-Based Transformer Detector for Fast…☆30Updated 2 years ago
- Code release for "Weakly Supervised Open-Vocabulary Object Detection", AAAI2024☆35Updated 10 months ago
- [ICCV 2023] The official PyTorch code for Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation☆89Updated last year
- A DETR-style framework for open-vocabulary detection (OVD). CVPR 2023☆197Updated 2 years ago
- (ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation☆47Updated last year