Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".
☆238 · Mar 29, 2025 · Updated 11 months ago
Alternatives and similar repositories for rlt
Users who are interested in rlt are comparing it to the repositories listed below
- [ICLR 2025] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality ☆261 · Dec 27, 2024 · Updated last year
- Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers" ☆167 · Nov 5, 2024 · Updated last year
- ☆47 · Nov 8, 2024 · Updated last year
- [ICCV 2025] VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE ☆395 · Jan 19, 2025 · Updated last year
- [CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression ☆45 · Feb 25, 2026 · Updated last week
- ☆213 · Feb 11, 2025 · Updated last year
- Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation ☆17 · Apr 3, 2024 · Updated last year
- Subjects200K dataset ☆131 · Jan 17, 2025 · Updated last year
- A suite of image and video neural tokenizers ☆1,714 · Feb 11, 2025 · Updated last year
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆10 · Dec 30, 2024 · Updated last year
- Official implementation of ViewFusion: Learning Composable Diffusion Models for Novel View Synthesis ☆36 · May 30, 2025 · Updated 9 months ago
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models ☆38 · Jan 27, 2026 · Updated last month
- Official code for NeurIPS 2024 paper LRM-Zero: Training Large Reconstruction Models with Synthesized Data ☆153 · Oct 7, 2024 · Updated last year
- ☆71 · Nov 18, 2024 · Updated last year
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think ☆1,560 · Mar 16, 2025 · Updated 11 months ago
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆431 · Nov 10, 2024 · Updated last year
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models ☆102 · Nov 22, 2025 · Updated 3 months ago
- A unified inference and post-training framework for accelerated video generation. ☆3,127 · Updated this week
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆649 · Oct 16, 2024 · Updated last year
- MR. Video: MapReduce is the Principle for Long Video Understanding ☆31 · Apr 23, 2025 · Updated 10 months ago
- [ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding ☆1,086 · Jul 6, 2024 · Updated last year
- [ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Langua… ☆558 · Jan 4, 2025 · Updated last year
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆631 · Oct 29, 2025 · Updated 4 months ago
- Bag of MLP ☆20 · May 31, 2021 · Updated 4 years ago
- ☆83 · Oct 31, 2024 · Updated last year
- This repo contains the code for 1D tokenizer and generator ☆1,120 · Mar 20, 2025 · Updated 11 months ago
- [CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project ☆184 · Mar 20, 2025 · Updated 11 months ago
- [NeurIPS 2024] Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models ☆339 · Jan 21, 2025 · Updated last year
- [ICLR2026] VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling ☆511 · Nov 18, 2025 · Updated 3 months ago
- [NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising ☆212 · Sep 27, 2025 · Updated 5 months ago
- A paper list of recent work on token compression for ViT and VLM ☆843 · Mar 3, 2026 · Updated last week
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation ☆71 · Oct 17, 2025 · Updated 4 months ago
- A 3D mesh viewer for vscode ☆73 · Jul 4, 2025 · Updated 8 months ago
- [ECCV2024] Video Foundation Models & Data for Multimodal Understanding ☆2,204 · Dec 15, 2025 · Updated 2 months ago
- Pixel-Space Generative Models ☆303 · May 11, 2025 · Updated 9 months ago
- [ICLR 2025] Official implementation of "DiffSplat: Repurposing Image Diffusion Models for Scalable 3D Gaussian Splat Generation". ☆477 · Feb 25, 2026 · Updated last week
- Next-Token Prediction is All You Need ☆2,367 · Jan 12, 2026 · Updated last month
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,937 · Aug 15, 2024 · Updated last year
- Lightplane implements a highly memory-efficient differentiable radiance field renderer, and a module for unprojecting features from image… ☆285 · Aug 6, 2024 · Updated last year