An in-context conditioning version of MUSE with pre-trained checkpoints.
☆116 · Jun 4, 2023 · Updated 2 years ago
Alternatives and similar repositories for MUSE-Pytorch
Users who are interested in MUSE-Pytorch compare it to the libraries listed below.
- Open reproduction of MUSE for fast text2image generation. ☆359 · Jun 1, 2024 · Updated last year
- Unofficial implementation of [StyleDrop](https://arxiv.org/abs/2306.00983) ☆585 · Aug 23, 2023 · Updated 2 years ago
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale ☆213 · Feb 27, 2024 · Updated 2 years ago
- Paper List for In-context Learning 🌷 ☆20 · Jan 3, 2023 · Updated 3 years ago
- This is an unofficial PyTorch implementation of StyleDrop: Text-to-Image Generation in Any Style. ☆226 · Jul 11, 2023 · Updated 2 years ago
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆85 · Nov 2, 2022 · Updated 3 years ago
- ☆88 · Jan 4, 2024 · Updated 2 years ago
- A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models". ☆1,096 · Mar 25, 2023 · Updated 2 years ago
- Phonemes and durations labeling based on whisper small ☆11 · Jul 7, 2024 · Updated last year
- [NeurIPS 2022] code for "Visual Concepts Tokenization" ☆23 · Oct 10, 2022 · Updated 3 years ago
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆92 · Mar 16, 2023 · Updated 2 years ago
- Recipe for training fully-featured self-supervised image JEPA models ☆12 · Jun 4, 2025 · Updated 8 months ago
- Unofficial PyTorch implementation of HiFi-GAN with fast MISR. ☆15 · Mar 21, 2023 · Updated 2 years ago
- [CVPR'23] Video Probabilistic Diffusion Models in Projected Latent Space ☆324 · May 14, 2024 · Updated last year
- Code release of research paper "Exploring Long-Sequence Masked Autoencoders" ☆100 · Oct 14, 2022 · Updated 3 years ago
- Exploiting unlabeled data with vision and language models for object detection, ECCV 2022 ☆94 · Jan 16, 2024 · Updated 2 years ago
- Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models ☆357 · Jul 4, 2023 · Updated 2 years ago
- Use miniGPT-4 batch to generate captions for a lot of images! You should be able to create the best captions you always wanted! ☆18 · Jul 20, 2023 · Updated 2 years ago
- The Official Implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [NIPS 2021](https://ar… ☆20 · Dec 7, 2021 · Updated 4 years ago
- [CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958) ☆44 · Sep 5, 2023 · Updated 2 years ago
- Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers" ☆1,096 · Dec 22, 2025 · Updated 2 months ago
- Test-Time Training on Video Streams ☆67 · Jul 24, 2023 · Updated 2 years ago
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites" ☆289 · Jan 14, 2024 · Updated 2 years ago
- Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion" ☆1,473 · May 31, 2023 · Updated 2 years ago
- MoVQGAN - model for the image encoding and reconstruction ☆260 · Oct 31, 2023 · Updated 2 years ago
- ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (ICCV 2023, Oral) ☆543 · Jan 8, 2024 · Updated 2 years ago
- Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models" ☆413 · Mar 25, 2024 · Updated last year
- [IJCV'24] AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort ☆149 · Nov 23, 2024 · Updated last year
- Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models ☆313 · Dec 28, 2023 · Updated 2 years ago
- ☆64 · Jul 1, 2023 · Updated 2 years ago
- ☆180 · Nov 14, 2025 · Updated 3 months ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform ☆18 · May 17, 2024 · Updated last year
- Official Jax Implementation of MaskGIT ☆554 · Nov 18, 2022 · Updated 3 years ago
- Scaling Diffusion Transformers with Mixture of Experts ☆417 · Sep 9, 2024 · Updated last year
- Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023 ☆1,336 · Aug 10, 2023 · Updated 2 years ago
- Replication of Pix2Seq with Pretrained Model ☆59 · Nov 6, 2021 · Updated 4 years ago
- This repository contains source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 · Mar 7, 2023 · Updated 2 years ago
- Some thoughts about writing scientific papers ☆21 · Nov 8, 2024 · Updated last year
- [ECCV2022] This is an official implementation of paper "RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentati… ☆78 · Feb 12, 2023 · Updated 3 years ago