Yuqifan1117 / Labal-Anything-Pipeline
Baby-DALL3: Annotate anything in visual tasks and generate anything, all in one pipeline with GPT-4 (a small baby of DALL·E 3).
☆82 Updated 11 months ago
Related projects:
- [NeurIPS2022] This is the official implementation of the paper "Expediting Large-Scale Vision Transformer for Dense Prediction without Fi… ☆81 Updated 10 months ago
- Official Implementation of ICCV 2023 Paper - SegPrompt: Boosting Open-World Segmentation via Category-level Prompt Learning ☆110 Updated 3 weeks ago
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model ☆88 Updated 2 months ago
- ☆60 Updated last year
- ☆63 Updated 9 months ago
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024 ☆68 Updated 5 months ago
- Recognize Any Regions ☆115 Updated 9 months ago
- ☆93 Updated 3 months ago
- [CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloadin… ☆178 Updated 3 weeks ago
- ☆158 Updated last year
- [CVPR 2023] implementation of Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information. ☆91 Updated last year
- [ICCV'23] Cascade-DETR: Delving into High-Quality Universal Object Detection ☆93 Updated last year
- [NeurIPS 2023] FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models ☆128 Updated 9 months ago
- ☆75 Updated last year
- Connecting segment-anything's output masks with the CLIP model; Awesome-Segment-Anything-Works ☆169 Updated 9 months ago
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆84 Updated last year
- ☆143 Updated this week
- ☆123 Updated 8 months ago
- Official Implementation for CVPR 2024 paper: CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor ☆92 Updated 2 months ago
- [CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era" ☆163 Updated 3 months ago
- PyTorch code for the paper From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models ☆181 Updated 8 months ago
- (NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection ☆104 Updated 4 months ago
- PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022 ☆159 Updated 2 years ago
- [CVPR 2023 Highlight] Freestyle Layout-to-Image Synthesis ☆143 Updated last year
- Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper. ☆168 Updated 2 months ago
- Simple Implementation of Pix2Seq model for object detection in PyTorch ☆115 Updated last year
- [IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation ☆107 Updated last month
- ☆87 Updated 2 months ago
- OvarNet: official implementation of the paper "OvarNet: Towards Open-vocabulary Object Attribute Recognition" ☆98 Updated last year
- Zero-label image classification via OpenCLIP knowledge distillation ☆104 Updated last year