Baby-DALL3: Annotate anything in visual tasks and generate anything, all in one pipeline with GPT-4 (a small baby of DALL·E 3).
☆85 · Sep 21, 2023 · Updated 2 years ago
Alternatives and similar repositories for Labal-Anything-Pipeline
Users that are interested in Labal-Anything-Pipeline are comparing it to the libraries listed below.
- This repo contains the code for our paper Compositor: Bottom-Up Clustering and Compositing for Robust Part and Object Segmentation ☆18 · Mar 20, 2025 · Updated last year
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024) ☆52 · Jul 16, 2024 · Updated last year
- ☆10 · Jul 5, 2024 · Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon! ☆11 · May 24, 2023 · Updated 3 years ago
- ☆87 · May 8, 2023 · Updated 3 years ago
- ICCV DeeperAction Challenge - Kinetics-TPS Challenge on Part-level Action Parsing and Action Recognition. ☆14 · Jun 4, 2021 · Updated 5 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021) ☆17 · Jan 12, 2023 · Updated 3 years ago
- Official code of *Towards Event-oriented Long Video Understanding* ☆12 · Jul 26, 2024 · Updated last year
- A collection of resources and papers on diffusion models of video generation. ☆10 · Feb 11, 2023 · Updated 3 years ago
- Entry to the 2023 Scroll Prize ☆42 · Apr 15, 2023 · Updated 3 years ago
- Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection" ☆27 · Jul 10, 2023 · Updated 2 years ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones". ☆29 · Jan 23, 2024 · Updated 2 years ago
- This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World"…