sayakpaul/caption-upsampling

This repository implements the idea of "caption upsampling" from DALL-E 3 with Zephyr-7B and gathers results with SDXL.

☆158

Alternatives and similar repositories for caption-upsampling

Users who are interested in caption-upsampling compare it to the repositories listed below.

tfernd / hyper-merge
☆54 · Sep 11, 2023 · Updated 2 years ago
mkshing / ziplora-pytorch
Implementation of "ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs"
☆564 · Dec 27, 2023 · Updated 2 years ago
SkalskiP / SoM
Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️
☆87 · Oct 20, 2023 · Updated 2 years ago
mkshing / svdiff-pytorch
Implementation of "SVDiff: Compact Parameter Space for Diffusion Fine-Tuning"
☆386 · Jan 24, 2024 · Updated 2 years ago
nateraw / spaces-docker-templates
🚀🤗 A collection of templates for Hugging Face Spaces
☆35 · Oct 9, 2023 · Updated 2 years ago
CarperAI / DRLX
Diffusion Reinforcement Learning Library
☆195 · Feb 13, 2024 · Updated 2 years ago
ZCode-opensource / image-artisan-xl
Image Artisan XL is the ultimate desktop application for creating amazing images with the power of artificial intelligence.
☆18 · Apr 25, 2024 · Updated 2 years ago
dome272 / Wuerstchen
Official implementation of Würstchen: Efficient Pretraining of Text-to-Image Models
☆555 · Apr 6, 2024 · Updated 2 years ago
Jordach / CascadeTuner
Implements a minimalistic version of Stable Cascade training
☆13 · Oct 24, 2024 · Updated last year
data2ml / all-clip
Load any clip model with a standardized interface
☆22 · Oct 20, 2025 · Updated 9 months ago
Birch-san / imagebind-guided-diffusion
Guide diffusion on ImageBind embedding similarity
☆29 · May 27, 2023 · Updated 3 years ago
chengzeyi / stable-fast
https://wavespeed.ai/ Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
☆1,304 · Mar 27, 2025 · Updated last year
jfischoff / svd-inpainting
An attempt at a SVD inpainting pipeline
☆50 · Dec 24, 2023 · Updated 2 years ago
crowsonkb / latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
☆60 · Apr 17, 2022 · Updated 4 years ago
rohitgandikota / sliders
Concept Sliders for Precise Control of Diffusion Models
☆1,136 · Apr 13, 2026 · Updated 3 months ago
sayakpaul / diffusers-torchao
End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).
☆399 · Jan 8, 2026 · Updated 6 months ago
sayakpaul / GCP-ML-API-Demos
Contains Colab Notebooks show cool use-cases of different GCP ML APIs.
☆10 · Nov 5, 2020 · Updated 5 years ago
cloneofsimo / minRF
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
☆640 · Jul 1, 2024 · Updated 2 years ago
baaivision / MUSE-Pytorch
An in-context conditioning version of MUSE with pre-trained checkpoints.
☆115 · Jun 4, 2023 · Updated 3 years ago
huggingface / controlnet_aux
☆494 · May 8, 2025 · Updated last year
sayakpaul / single-video-curation-svd
Educational repository for applying the main video data curation techniques presented in the Stable Video Diffusion paper.
☆81 · Dec 30, 2023 · Updated 2 years ago
Birch-san / sdxl-diffusion-decoder
Let's try and finetune the OpenAI consistency decoder to work for SDXL
☆25 · Dec 3, 2023 · Updated 2 years ago
tfernd / HyperTile
☆119 · Nov 12, 2023 · Updated 2 years ago
openai / consistencydecoder
Consistency Distilled Diff VAE
☆2,213 · Nov 7, 2023 · Updated 2 years ago
FriedRonaldo / Primitives-PS
Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data - Official PyTorch Implementati…
☆34 · Nov 14, 2022 · Updated 3 years ago
huggingface / diffusion-fast
Faster generation with text-to-image diffusion models.
☆234 · Jun 28, 2025 · Updated last year
Picsart-AI-Research / PAIR-Diffusion
[CVPR 2024] PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
☆521 · Apr 2, 2024 · Updated 2 years ago
cloneofsimo / minSDXL
Huggingface-compatible SDXL Unet implementation that is readily hackable
☆439 · Aug 9, 2023 · Updated 2 years ago
sayakpaul / big_vision_experiments
Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k.
☆22 · Jan 16, 2023 · Updated 3 years ago
abyildirim / inst-inpaint
A novel inpainting framework that can remove objects from images based on the instructions given as text prompts.
☆386 · Dec 9, 2025 · Updated 7 months ago
kfirgoldberg / ConceptLab
Official Implementation for "ConceptLab: Creative Generation using Diffusion Prior Constraints"
☆256 · Dec 19, 2023 · Updated 2 years ago
bentoml / IF-multi-GPUs-demo
☆12 · Jul 5, 2023 · Updated 3 years ago
bghira / SimpleTuner
A general fine-tuning kit geared toward image/video/audio diffusion models.
☆2,880 · Updated this week
google / style-aligned
Official code for "Style Aligned Image Generation via Shared Attention"
☆1,315 · Dec 29, 2023 · Updated 2 years ago
Quasimondo / ComfyUI-QuasimondoNodes
A collection of various custom nodes for ComfyUI (Work in progress)
☆14 · Jun 9, 2025 · Updated last year
microsoft / Interactive-Summarization
The official repo of our research work "Interactive Editing for Text Summarization".
☆23 · Jun 3, 2023 · Updated 3 years ago
segmind / distill-sd
Segmind Distilled diffusion
☆618 · Oct 18, 2023 · Updated 2 years ago
cloneofsimo / auto_llm_codebase_analysis
☆27 · Mar 14, 2024 · Updated 2 years ago
google / RB-Modulation
Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"
☆404 · Mar 19, 2025 · Updated last year