Fr0zenCrane / Cockatiel
The official implementation of our paper "Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption"
☆38Updated 6 months ago
Alternatives and similar repositories for Cockatiel
Users who are interested in Cockatiel are comparing it to the repositories listed below
- Official model implementation and benchmark evaluation repository of <AnyEdit: Unified High-Quality Image Edit with Any Idea>☆30Updated 4 months ago
- Video dataset dedicated to portrait-mode video recognition.☆55Updated last month
- ☆135Updated last month
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.☆84Updated 7 months ago
- ☆130Updated 5 months ago
- [CVPR 2025 AI4CC Workshop] Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editin…☆35Updated 7 months ago
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption 🔍☆46Updated 5 months ago
- [CVPR2025] Official implementation of High Fidelity Scene Text Synthesis.☆78Updated 8 months ago
- ☆49Updated 7 months ago
- ☆52Updated 11 months ago
- Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Rou…☆27Updated 2 months ago
- Official implementation of "HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment"☆91Updated 7 months ago
- An Efficient Text-to-Image Generation Pretrain Pipeline☆122Updated 7 months ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis☆86Updated last year
- Blending Custom Photos with Video Diffusion Transformers☆48Updated 10 months ago
- [ICML 2025] EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM☆69Updated 4 months ago
- ☆77Updated last month
- This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA benchmark perform…☆79Updated 2 months ago
- ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models☆86Updated 2 months ago
- Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning☆189Updated last week
- [ECCV2024] Towards Reliable Advertising Image Generation Using Human Feedback☆59Updated last year
- A lightweight, highly efficient training framework for accelerating diffusion tasks.☆50Updated last year
- ☆47Updated 5 months ago
- [ICCV 2025] Code & Data for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing☆163Updated 5 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation☆84Updated last year
- Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?☆37Updated this week
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifying image understanding and generation.☆39Updated last year
- [CVPR 2025] PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation☆42Updated 5 months ago
- Concat-ID: Towards Universal Identity-Preserving Video Synthesis☆64Updated 7 months ago
- (ICCV2025) EEdit⚡: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing☆58Updated 2 months ago