Fr0zenCrane / Cockatiel
The official implementation of our paper "Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption"
☆38 · Updated 8 months ago
Alternatives and similar repositories for Cockatiel
Users who are interested in Cockatiel are comparing it to the repositories listed below.
- Official model implementation and benchmark evaluation repository of <AnyEdit: Unified High-Quality Image Edit with Any Idea> ☆30 · Updated 6 months ago
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption ☆46 · Updated 6 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆85 · Updated last year
- Video dataset dedicated to portrait-mode video recognition. ☆55 · Updated 3 months ago
- This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA benchmark perform… ☆84 · Updated 4 months ago
- ☆141 · Updated 3 months ago
- [CVPR 2025 AI4CC Workshop] Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editin… ☆35 · Updated 8 months ago
- ☆132 · Updated 6 months ago
- ☆53 · Updated last year
- (ICCV2025) EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing ☆60 · Updated 4 months ago
- ☆51 · Updated 8 months ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆86 · Updated last year
- Glance: Accelerating Diffusion Models with 1 Sample ☆150 · Updated 3 weeks ago
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment. ☆84 · Updated 8 months ago
- EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling ☆201 · Updated 2 months ago
- An Efficient Text-to-Image Generation Pretrain Pipeline ☆129 · Updated 9 months ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation ☆80 · Updated last year
- ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models ☆89 · Updated 4 months ago
- [ECCV2024] Towards Reliable Advertising Image Generation Using Human Feedback ☆59 · Updated last year
- [CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation ☆80 · Updated 10 months ago
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025 ☆13 · Updated last year
- Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing ☆71 · Updated 6 months ago
- A light-weight and high-efficient training framework for accelerating diffusion tasks. ☆51 · Updated last year
- [CVPR2025] Official implementation of High Fidelity Scene Text Synthesis. ☆79 · Updated 9 months ago
- [CVPR 2025] PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation ☆44 · Updated 6 months ago
- [ICCV 2025] The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆58 · Updated 9 months ago
- ☆55 · Updated 6 months ago
- Jodi: Unification of Visual Generation and Understanding via Joint Modeling ☆90 · Updated 7 months ago
- Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Rou… ☆32 · Updated 3 months ago
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆80 · Updated 8 months ago