thu-ml / RIFLEx
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025)
☆679 · Updated last month
Alternatives and similar repositories for RIFLEx
Users that are interested in RIFLEx are comparing it to the libraries listed below
- Video generation from text & image, 1st-gen ☆925 · Updated last month
- Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥 ☆573 · Updated 5 months ago
- [CVPR'25] Tora: Trajectory-oriented Diffusion Transformer for Video Generation ☆1,161 · Updated 2 weeks ago
- Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models ☆913 · Updated 3 months ago
- The official implementation of RealisDance ☆548 · Updated 3 weeks ago
- Matrix-Game: Interactive World Foundation Model ☆730 · Updated last month
- UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer ☆679 · Updated last month
- [ICLR'25] 3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation ☆345 · Updated last month
- [CVPR 2025 Highlight 🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition ☆715 · Updated last week
- ☆896 · Updated 6 months ago
- 🔥 Official ComfyUI native node for InfiniteYou with FLUX ☆159 · Updated 3 weeks ago
- A Native Multimodal LLM for 3D Generation and Understanding ☆417 · Updated last week
- Official implementation for "Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model" (NeurIPS 2024) ☆253 · Updated last month
- [CVPR 2025] AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction ☆433 · Updated 3 months ago
- Code for SCIS-2025 Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation". ☆1,138 · Updated 2 months ago
- [CVPR 2025] A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation ☆249 · Updated 3 months ago
- Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple te… ☆1,082 · Updated 4 months ago
- Pytorch Implementation of FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing (ICLR 2024) ☆205 · Updated last year
- Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, C… ☆295 · Updated 8 months ago
- Efficient DiT architecture for text2any tasks, ICLR2025 ☆449 · Updated last month
- [IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation ☆1,130 · Updated 7 months ago
- Official repo of our paper "SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions" ☆622 · Updated last year
- [ECCV 2024 Oral] MotionDirector: Motion Customization of Text-to-Video Diffusion Models. ☆1,011 · Updated 10 months ago
- Code for paper "Towards Understanding Camera Motions in Any Video" ☆193 · Updated 3 weeks ago
- [ICML 2023 Oral, NeurIPS 2023] Official implementations for paper: Customizable Image Synthesis with Multiple Subjects ☆440 · Updated last year
- [CVPR 2025 Highlight] 3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion ☆975 · Updated last month
- [NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions ☆1,063 · Updated 8 months ago
- [CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer ☆1,252 · Updated 3 months ago
- ☆94 · Updated last year
- Liquid: Language Models are Scalable and Unified Multi-modal Generators ☆592 · Updated 2 months ago