microsoft / NUWA
A unified 3D Transformer Pipeline for visual synthesis
☆2,806 · Updated last year
Alternatives and similar repositories for NUWA:
Users who are interested in NUWA are comparing it to the repositories listed below:
- Text-to-Image generation. The repo for NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers". ☆1,771 · Updated last year
- Taming Transformers for High-Resolution Image Synthesis ☆6,068 · Updated 7 months ago
- official code repo for paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers" ☆948 · Updated 2 years ago
- Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch ☆546 · Updated 2 years ago
- GLIDE: a diffusion-based text-conditional image synthesis model ☆3,595 · Updated last year
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch ☆5,605 · Updated last year
- Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral) ☆4,063 · Updated last year
- Search photos on Unsplash using natural language ☆1,004 · Updated 2 years ago
- text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023) ☆11,019 · Updated this week
- 🪩 Create Disco Diffusion artworks in one line ☆3,844 · Updated last year
- 🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022 ☆8,550 · Updated last month
- Real-Time High-Resolution Background Matting ☆6,965 · Updated 9 months ago
- Code for Text2Human (SIGGRAPH 2022). Paper: Text2Human: Text-Driven Controllable Human Image Generation ☆842 · Updated 8 months ago
- ☆7,467 · Updated last year
- ☆1,560 · Updated 2 years ago
- Pretrained Dalle2 from laion ☆501 · Updated last year
- Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch ☆1,958 · Updated 10 months ago
- PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation ☆5,114 · Updated 7 months ago
- An open source implementation of CLIP. ☆11,335 · Updated last week
- ☆1,024 · Updated 6 months ago
- [ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation ☆4,313 · Updated last year
- [TOG 2022] SofGAN: A Portrait Image Generator with Dynamic Styling ☆772 · Updated last year
- Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton based a… ☆1,592 · Updated last month
- [NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once" ☆4,533 · Updated 7 months ago
- VOLO: Vision Outlooker for Visual Recognition ☆941 · Updated 2 years ago
- This is the open source implementation of the ICLR2022 paper "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synth… ☆954 · Updated 2 years ago
- Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence L… ☆2,487 · Updated 11 months ago
- StyleGAN-Human: A Data-Centric Odyssey of Human Generation ☆1,172 · Updated last month
- Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion" ☆1,405 · Updated last year
- Official repo for consistency models. ☆6,289 · Updated last year