Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement"
☆52Dec 5, 2024Updated last year
Alternatives and similar repositories for VideoRepair
Users that are interested in VideoRepair are comparing it to the libraries listed below
Sorting:
- (EMNLP 2025 Main) RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives☆37Dec 20, 2025Updated 2 months ago
- Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Finding]"☆15Aug 27, 2025Updated 6 months ago
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion☆56Jul 1, 2025Updated 8 months ago
- [AAAI 2026] Official implementation of DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation☆78Jun 11, 2025Updated 8 months ago
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]☆24Aug 13, 2024Updated last year
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- Official implementation of "Real-SRGD: Enhancing Real-World Image Super-Resolution with Classifier-Free Guided Diffusion" [ACCV2024]☆18Dec 9, 2024Updated last year
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation☆53Jan 22, 2025Updated last year
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.☆19Jun 27, 2024Updated last year
- ☆18Oct 28, 2025Updated 4 months ago
- Official Code for the NeurIPS'23 paper "3D-Aware Visual Question Answering about Parts, Poses and Occlusions"☆19Oct 17, 2024Updated last year
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization☆25Apr 14, 2025Updated 10 months ago
- (NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models☆22Nov 20, 2024Updated last year
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.☆85May 4, 2025Updated 9 months ago
- Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models (ICLR 2026)☆42Feb 18, 2026Updated last week
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆43Mar 11, 2025Updated 11 months ago
- ☆11Jan 18, 2025Updated last year
- Official implementation for "LOVECon: Text-driven Training-free Long Video Editing with ControlNet"☆43Oct 26, 2023Updated 2 years ago
- ☆27Jun 4, 2024Updated last year
- [NeurIPS'2024] Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps☆101Jul 4, 2024Updated last year
- ☆28Apr 8, 2025Updated 10 months ago
- code for "TVG: A Training-free Transition Video Generation Method with Diffusion Models"☆48Aug 19, 2024Updated last year
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos☆48Jul 3, 2025Updated 7 months ago
- Code for "BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation [ICML2024]".☆46Jun 16, 2024Updated last year
- [NeurIPS 2024] HiCoM: Hierarchical Coherent Motion for Dynamic Streamable Scenes with 3D Gaussian Splatting☆45Dec 24, 2024Updated last year
- Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2…☆24Jun 13, 2024Updated last year
- ☆11Nov 30, 2025Updated 3 months ago
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research☆22Sep 23, 2025Updated 5 months ago
- ☆11Aug 7, 2025Updated 6 months ago
- Official implementation of Nemesis: Normalizing the Soft-prompt Vectors of Vision-Language Models (ICLR 2024 Spotlight)☆15Dec 27, 2024Updated last year
- ☆10Jul 5, 2024Updated last year
- [arXiv 2024] I4VGen: Image as Free Stepping Stone for Text-to-Video Generation☆24Oct 6, 2024Updated last year
- ☆30Nov 7, 2023Updated 2 years ago
- Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature.☆34Feb 13, 2025Updated last year
- ☆11Sep 28, 2024Updated last year
- This is a repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"☆13Aug 22, 2025Updated 6 months ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025).☆12Sep 22, 2025Updated 5 months ago
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates☆22Jul 1, 2025Updated 8 months ago