Fr0zenCrane / Cockatiel
The official implementation of our paper "Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption"
☆37 · Updated 4 months ago
Alternatives and similar repositories for Cockatiel
Users who are interested in Cockatiel are comparing it to the repositories listed below.
- Video dataset dedicated to portrait-mode video recognition. ☆52 · Updated 10 months ago
- Official model implementation and benchmark evaluation repository of <AnyEdit: Unified High-Quality Image Edit with Any Idea> ☆28 · Updated 2 months ago
- ☆129 · Updated 3 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆80 · Updated last year
- Nano-consistent-150k ☆194 · Updated 3 weeks ago
- ☆129 · Updated 3 months ago
- ☆78 · Updated 7 months ago
- A light-weight and high-efficient training framework for accelerating diffusion tasks. ☆49 · Updated last year
- [CVPR 2025] Official implementation of High Fidelity Scene Text Synthesis. ☆70 · Updated 6 months ago
- [CVPR 2025 AI4CC Workshop] Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editin… ☆33 · Updated 5 months ago
- ☆51 · Updated 9 months ago
- [WWW 2025] Official PyTorch Code for "CTR-Driven Advertising Image Generation with Multimodal Large Language Models" ☆56 · Updated 2 months ago
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption 🔍 ☆47 · Updated 3 months ago
- An Efficient Text-to-Image Generation Pretrain Pipeline ☆116 · Updated 5 months ago
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment. ☆83 · Updated 5 months ago
- [ICML 2025] EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM ☆68 · Updated 2 months ago
- ☆35 · Updated 3 months ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation ☆79 · Updated last year
- Official implementation of "HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment" ☆75 · Updated 5 months ago
- ☆46 · Updated 5 months ago
- DiT for VAE (and Video Generation) ☆35 · Updated last year
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆86 · Updated last year
- TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes ☆80 · Updated last month
- Blending Custom Photos with Video Diffusion Transformers ☆48 · Updated 8 months ago
- [CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation ☆72 · Updated 6 months ago
- [CVPR 2025] PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation ☆41 · Updated 3 months ago
- [NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models ☆164 · Updated last year
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆80 · Updated 5 months ago
- Concat-ID: Towards Universal Identity-Preserving Video Synthesis ☆62 · Updated 5 months ago
- Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing ☆69 · Updated 2 months ago