rccchoudhury / rlt
Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".
☆235, updated Mar 29, 2025
Alternatives and similar repositories for rlt
Users who are interested in rlt are comparing it to the repositories listed below:
- [ICLR 2025] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality (☆259, updated Dec 27, 2024)
- Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers" (☆166, updated Nov 5, 2024)
- ☆46, updated Nov 8, 2024
- [ICCV 2025] VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE (☆388, updated Jan 19, 2025)
- Accelerating Streaming Video Large Language Models via Hierarchical Token Compression (☆42, updated Jan 6, 2026)
- ☆213, updated Feb 11, 2025
- Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation (☆17, updated Apr 3, 2024)
- Subjects200K dataset (☆129, updated Jan 17, 2025)
- A suite of image and video neural tokenizers (☆1,704, updated Feb 11, 2025)
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" (☆10, updated Dec 30, 2024)
- Official implementation of ViewFusion: Learning Composable Diffusion Models for Novel View Synthesis (☆36, updated May 30, 2025)
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models (☆38, updated Jan 27, 2026)
- Official code for NeurIPS 2024 paper LRM-Zero: Training Large Reconstruction Models with Synthesized Data (☆153, updated Oct 7, 2024)
- ☆71, updated Nov 18, 2024
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think (☆1,544, updated Mar 16, 2025)
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model (☆432, updated Nov 10, 2024)
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models (☆99, updated Nov 22, 2025)
- A unified inference and post-training framework for accelerated video generation. (☆3,059, updated Feb 7, 2026)
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer (☆648, updated Oct 16, 2024)
- MR. Video: MapReduce is the Principle for Long Video Understanding (☆29, updated Apr 23, 2025)
- [ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Langua… (☆553, updated Jan 4, 2025)
- [ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding (☆1,076, updated Jul 6, 2024)
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization (☆625, updated Oct 29, 2025)
- Bag of MLP (☆20, updated May 31, 2021)
- ☆83, updated Oct 31, 2024
- This repo contains the code for 1D tokenizer and generator (☆1,113, updated Mar 20, 2025)
- [CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project (☆184, updated Mar 20, 2025)
- [NeurIPS 2024] Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models (☆334, updated Jan 21, 2025)
- [ICLR2026] VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling (☆503, updated Nov 18, 2025)
- [NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising (☆212, updated Sep 27, 2025)
- A paper list of recent works on token compression for ViT and VLM (☆828, updated this week)
- Pixel-Space Generative Models (☆301, updated May 11, 2025)
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation (☆70, updated Oct 17, 2025)
- [ECCV2024] Video Foundation Models & Data for Multimodal Understanding (☆2,196, updated Dec 15, 2025)
- [ICLR 2025] Official implementation of "DiffSplat: Repurposing Image Diffusion Models for Scalable 3D Gaussian Splat Generation". (☆470, updated Aug 27, 2025)
- A 3D mesh viewer for vscode (☆73, updated Jul 4, 2025)
- Next-Token Prediction is All You Need (☆2,339, updated Jan 12, 2026)
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation (☆1,932, updated Aug 15, 2024)
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding (☆507, updated Nov 14, 2025)