INFINIQ-AI1 / CLIPVQDiffusion
Official implementation of "CLIP-VQDiffusion: Language Free Training of Text To Image generation using CLIP and vector quantized diffusion model"
☆17 Updated 10 months ago
Alternatives and similar repositories for CLIPVQDiffusion
Users who are interested in CLIPVQDiffusion are comparing it to the repositories listed below
- An official PyTorch implementation for CLIPPR ☆29 Updated last year
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows ☆15 Updated last month
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION" ☆15 Updated 8 months ago
- ☆11 Updated 5 months ago
- ORES: Open-vocabulary Responsible Visual Synthesis ☆13 Updated last year
- ☆10 Updated 4 months ago
- Train vector quantized CLIP models using pytorch lightning ☆20 Updated last year
- ☆42 Updated 8 months ago
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation ☆42 Updated 5 months ago
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model ☆16 Updated 5 months ago
- CatMAE ☆14 Updated last year
- ☆11 Updated 9 months ago
- ☆10 Updated last year
- Official PyTorch implementation of "Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis… ☆44 Updated last year
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing ☆22 Updated 7 months ago
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models ☆19 Updated 5 months ago
- ☆11 Updated 2 months ago
- Test-Time Distribution Normalization For Contrastively Learned Vision-language Models ☆28 Updated last year
- [CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding" ☆44 Updated last month
- ☆24 Updated last year
- The official repository of paper "ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection" (N… ☆50 Updated last year
- [CVPR2025] VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding ☆16 Updated 3 months ago
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners" ☆29 Updated last year
- Official Implementation for "Editing Massive Concepts in Text-to-Image Diffusion Models" ☆19 Updated last year
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆21 Updated last year
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" ☆41 Updated 7 months ago
- ☆11 Updated 8 months ago
- [ECCV 2024] Official repository for "DataDream: Few-shot Guided Dataset Generation" ☆41 Updated 11 months ago
- DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection ☆20 Updated last year
- Video Diffusion State Space Models ☆19 Updated last year