johncaged / VRoPE

[EMNLP 2025 Main] Official implementation of VRoPE: Rotary Position Embedding for Video Large Language Models.

☆27

Alternatives and similar repositories for VRoPE

Users who are interested in VRoPE are comparing it to the repositories listed below.

TencentARC / Divot
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
☆85 Updated 9 months ago
DerrickWang005 / LaVin-DiT
Official implementation of LaVin-DiT
☆49 Updated 10 months ago
Eyeline-Labs / VChain
The official implementation of paper “VChain: Chain-of-Visual-Thought for Reasoning in Video Generation”
☆109 Updated 2 months ago
lambert-x / VideoAuteur
VideoAuteur: Towards Long Narrative Video Generation
☆43 Updated last month
Yikai-Wang / nvg
Code for our paper "Next Visual Granularity Generation".
☆48 Updated 2 months ago
KaiyueSun98 / T2I-ReasonBench
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
☆34 Updated 3 months ago
riccizz / HRF
☆14 Updated 7 months ago
SAIS-FUXI / Omni-Video
☆67 Updated 4 months ago
zelaki / ReDi
[NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis
☆105 Updated last month
Jialuo-Li / Science-T2I
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
☆62 Updated 7 months ago
Jiawei-Yang / DeTok
Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"
☆164 Updated last month
tyshiwo1 / Accelerating-T2I-AR-with-SJD
[ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
☆47 Updated 8 months ago
KlingTeam / PhysMaster
Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning
☆53 Updated 2 months ago
tliby / UniFork
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
☆46 Updated 3 months ago
jialuli-luka / Video-MSG
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
☆24 Updated 8 months ago
cfeng16 / GPS2Pix
[CVPR 2025] GPS as a Control Signal for Image Generation
☆24 Updated 9 months ago
yisuanwang / DanceTog
DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation
☆37 Updated 4 months ago
IDEA-Research / TOSS
[ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image"
☆22 Updated last year
Vchitect / Uni-MMMU
☆20 Updated last week
FrankYang-17 / Mavors
☆15 Updated 6 months ago
florinshen / Vista3D
[ECCV2024] Vista3D: Unravel the 3D Darkside of a Single Image
☆55 Updated last year
baaivision / URSA
🐻 Uniform Discrete Diffusion with Metric Path for Video Generation
☆81 Updated last week
Yu-xm / Unicorn
Text-Only Data Synthesis for Vision Language Model Training
☆22 Updated 6 months ago
ant-research / lumos
[CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text
☆53 Updated 9 months ago
NVlabs / QLIP
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
☆94 Updated 9 months ago
aniki-ly / FreeLong
[NeurIPS 2024] The official implement of research paper "FreeLong : Training-Free Long Video Generation with SpectralBlend Temporal Atten…
☆64 Updated 5 months ago
huang-yh / Owl
☆51 Updated last year
huawei-lin / VTBench
This repository provides the official implementation of VTBench, a benchmark designed to evaluate the performance of visual tokenizers (V…
☆34 Updated 4 months ago
showlab / FQGAN
FQGAN: Factorized Visual Tokenization and Generation
☆56 Updated 8 months ago
oooolga / Ctrl-V
👆Pytorch implementation of "Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion"
☆30 Updated 4 months ago