THUDM / CogView
Text-to-Image generation. The repo for the NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers".
☆1,766 Updated last year
Alternatives and similar repositories for CogView:
Users who are interested in CogView are comparing it to the repositories listed below:
- official code repo for paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers" ☆948 Updated 2 years ago
- Taming Transformers for High-Resolution Image Synthesis ☆6,049 Updated 7 months ago
- Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion" ☆1,401 Updated last year
- ☆1,022 Updated 5 months ago
- Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch ☆547 Updated 2 years ago
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch ☆5,605 Updated last year
- PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation ☆5,086 Updated 7 months ago
- Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence L… ☆2,474 Updated 10 months ago
- ☆3,241 Updated 9 months ago
- GLM (General Language Model) ☆3,228 Updated last year
- ☆1,463 Updated last year
- Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine. ☆3,938 Updated 7 months ago
- Official implementation of VQ-Diffusion ☆916 Updated 10 months ago
- Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch ☆765 Updated 7 months ago
- CLIP+MLP Aesthetic Score Predictor ☆1,006 Updated 8 months ago
- Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023 ☆1,328 Updated last year
- ☆1,171 Updated 2 years ago
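- Fengshenbang-LM (封神榜大模型) is an open-source system of large models led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, serving as infrastructure for Chinese AIGC and cognitive intelligence. ☆4,086 Updated 6 months ago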
- Official Implementation for "Pivotal Tuning for Latent-based editing of Real Images" (ACM TOG 2022) https://arxiv.org/abs/2106.05744 ☆918 Updated 7 months ago
- Open-Set Grounded Text-to-Image Generation ☆2,090 Updated last year
- Easily compute clip embeddings and build a clip retrieval system with them ☆2,504 Updated 10 months ago
- Consistency Distilled Diff VAE ☆2,159 Updated last year
- Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral) ☆4,061 Updated last year
- Using Low-rank adaptation to quickly fine-tune diffusion models. ☆7,236 Updated 11 months ago
- OpenAI CLIP text encoders for multiple languages! ☆785 Updated last year
- Simple image captioning model ☆1,348 Updated 9 months ago
- A concise but complete implementation of CLIP with various experimental improvements from recent papers ☆708 Updated last year
- ☆1,558 Updated 2 years ago
- Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch ☆1,234 Updated 2 years ago
- An open source implementation of CLIP. ☆11,177 Updated last week