google / diffsegLinks

DiffSeg is an unsupervised zero-shot segmentation method using attention information from a stable-diffusion model. This repo implements the main DiffSeg algorithm and additionally includes an experimental feature to add semantic labels to the masks based on a generated caption.

☆328

Alternatives and similar repositories for diffseg

Users that are interested in diffseg are comparing it to the libraries listed below

Sorting:

u2seg / U2Seg
[CVPR 2024] Code release for "Unsupervised Universal Image Segmentation"
☆226Updated last year
wysoczanska / clip_dinoiser
Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.
☆263Updated last year
frank-xwang / UnSAM
[NeurIPS 2024] Code release for "Segment Anything without Supervision"
☆492Updated 2 weeks ago
Jiawei-Yang / Denoising-ViT
This is the official code release for our work, Denoising Vision Transformers.
☆388Updated last year
aliasgharkhani / SLiMe
1-shot image segmentation using Stable Diffusion
☆142Updated last year
google-research / semivl
[ECCV'24] Official Implementation of SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
☆141Updated 6 months ago
TonyLianLong / CrossMAE
Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders
☆125Updated 7 months ago
robustsam / RobustSAM
RobustSAM: Segment Anything Robustly on Degraded Images (CVPR 2024 Highlight)
☆361Updated last year
kyegomez / Vit-RGTS
Open source implementation of "Vision Transformers Need Registers"
☆200Updated last month
ByungKwanLee / Full-Segment-Anything
This is Pytorch Implementation Code for adding new features in code of Segment-Anything. Here, the features support batch-input on the fu…
☆166Updated last year
all-things-vits / code-samples
Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and Interpreting Attention in Vision.
☆196Updated 2 years ago
berkeley-hipie / HIPIE
[NeurIPS2023] Code release for "Hierarchical Open-vocabulary Universal Image Segmentation"
☆292Updated 5 months ago
uncbiag / SimpleClick
SimpleClick: Interactive Image Segmentation with Simple Vision Transformers (ICCV 2023)
☆250Updated 3 months ago
MengyuWang826 / SegRefiner
SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete Diffusion Process
☆201Updated last year
fudan-zvg / GSS
[CVPR 2023] Official repository of Generative Semantic Segmentation
☆221Updated 2 years ago
segments-ai / latent-diffusion-segmentation
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting [ECCV 2024]
☆101Updated last year
VinAIResearch / Dataset-Diffusion
Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation (NeurIPS2023)
☆127Updated last year
CompVis / zigma
A PyTorch implementation of the paper "ZigMa: A DiT-Style Mamba-based Diffusion Model" (ECCV 2024)
☆339Updated 8 months ago
google-research / syn-rep-learn
Learning from synthetic data - code and models
☆325Updated last year
bwconrad / decoder-denoising
Pytorch reimplementation of Decoder Denoising Pretraining for Semantic Segmentation
☆51Updated 2 years ago
czg1225 / SlimSAM
[NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim
☆346Updated 2 months ago
RockeyCoss / Prompt-Segment-Anything
This is an implementation of zero-shot instance segmentation using Segment Anything.
☆315Updated 2 years ago
sail-sg / MDT
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
☆586Updated last year
linyq2117 / SAMRefiner
[ICLR 2025] SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
☆74Updated 7 months ago
ChenDelong1999 / subobjects
Official repository of paper "Subobject-level Image Tokenization" (ICML-25)
☆90Updated 5 months ago
Jiahao000 / MosaicFusion
[IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
☆128Updated last year
merveenoyan / siglip
Projects based on SigLIP (Zhai et. al, 2023) and Hugging Face transformers integration 🤗
☆291Updated 9 months ago
NVlabs / ODISE
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
☆929Updated last year
xmed-lab / CLIP_Surgery
[Pattern Recognition 25] CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks
☆452Updated 9 months ago
facebookresearch / RCDM
Visualizing representations with diffusion based conditional generative model.
☆103Updated 2 years ago