johncaged / VRoPE
Official implementation of VRoPE: Rotary Position Embedding for Video Large Language Models.
☆21 Updated last month
Alternatives and similar repositories for VRoPE
Users who are interested in VRoPE are comparing it to the repositories listed below.
- Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations ☆48 Updated this week
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆70 Updated 4 months ago
- Official implementation of LaVin-DiT ☆34 Updated 5 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆70 Updated last week
- [NeurIPS 2024] The official implementation of the research paper "FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆45 Updated 4 months ago
- FQGAN: Factorized Visual Tokenization and Generation ☆49 Updated 2 months ago
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆21 Updated 2 months ago
- Official code repo of the CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation ☆35 Updated 3 months ago
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆47 Updated 6 months ago
- ☆32 Updated last week
- [ICCV 2025] Code release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆126 Updated last month
- SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding ☆48 Updated 3 weeks ago
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆41 Updated 2 months ago
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆22 Updated last year
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆35 Updated 4 months ago
- [CVPR 2025] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning. ☆30 Updated 2 months ago
- Official code for MotionBench (CVPR 2025) ☆45 Updated 3 months ago
- VideoAuteur: Towards Long Narrative Video Generation ☆42 Updated 5 months ago
- ☆17 Updated 2 weeks ago
- The official repository of "Spectral Motion Alignment for Video Motion Transfer using Diffusion Models". ☆27 Updated 6 months ago
- [CVPR 2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories ☆54 Updated 3 months ago
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach ☆12 Updated 2 months ago
- VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation ☆20 Updated 3 months ago
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" (ICLR 2025) ☆55 Updated 3 months ago
- Autoregressive Image Generation with Randomized Parallel Decoding ☆67 Updated 2 months ago
- Curated list of recent visual autoregressive (VAR) modeling works ☆29 Updated 3 months ago
- DreamGaussian with 2D-GS ☆12 Updated 8 months ago
- VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models ☆51 Updated 3 weeks ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration ☆24 Updated 8 months ago
- Official Implementation of VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention ☆39 Updated 2 months ago