Fr0zenCrane / Cockatiel
The official implementation of our paper "Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption"
☆32 Updated 2 weeks ago
Alternatives and similar repositories for Cockatiel
Users who are interested in Cockatiel compare it to the libraries listed below.
- ☆50 Updated 5 months ago
- ☆63 Updated 9 months ago
- ☆87 Updated this week
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment. ☆75 Updated last month
- A light-weight and high-efficient training framework for accelerating diffusion tasks. ☆47 Updated 8 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆69 Updated 10 months ago
- Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing ☆62 Updated 3 months ago
- ☆33 Updated 3 weeks ago
- Official Repo for Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation ☆28 Updated last year
- [CVPR2025] Official implementation of High Fidelity Scene Text Synthesis. ☆63 Updated 2 months ago
- ☆97 Updated 2 months ago
- EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM ☆58 Updated 2 months ago
- [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark ☆107 Updated last month
- Concat-ID: Towards Universal Identity-Preserving Video Synthesis ☆47 Updated 3 weeks ago
- Blending Custom Photos with Video Diffusion Transformers ☆47 Updated 4 months ago
- TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation ☆30 Updated 6 months ago
- ☆102 Updated 11 months ago
- ☆23 Updated last month
- [CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation ☆58 Updated 2 months ago
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions" ☆59 Updated 7 months ago
- Finetuning and inference tools for the CogView4 and CogVideoX model series. ☆70 Updated 3 weeks ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation ☆79 Updated last year
- [WWW 2025] Official PyTorch Code for "CTR-Driven Advertising Image Generation with Multimodal Large Language Models" ☆33 Updated 2 months ago
- Official model implementation and benchmark evaluation repository of <AnyEdit: Unified High-Quality Image Edit with Any Idea> ☆21 Updated 2 months ago
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption 🔍 ☆42 Updated last month
- PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation ☆33 Updated 7 months ago
- Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆117 Updated 2 weeks ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆85 Updated 10 months ago
- [CVPR2024] The official implementation of paper Relation Rectification in Diffusion Model ☆47 Updated 8 months ago
- ☆52 Updated last month