rccchoudhury/rlt

Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".

☆238

Alternatives and similar repositories for rlt

Users who are interested in rlt are comparing it to the repositories listed below.

Vchitect / FasterCache
[ICLR 2025] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
☆263 · Dec 27, 2024 · Updated last year
AdaCache-DiT / AdaCache
Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers"
☆172 · Nov 5, 2024 · Updated last year
locuslab / llava-token-compression
☆47 · Nov 8, 2024 · Updated last year
wenhaochai / claude-plugins
Personal Claude Code plugin marketplace
☆16 · Updated this week
zhang9302002 / ThinkingWithVideos
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆102 · Oct 15, 2025 · Updated 9 months ago
NVlabs / AutoGaze
AutoGaze automatically removes redundant patches in a video, reducing #tokens in ViT/MLLM by 4x-100x.
☆297 · May 5, 2026 · Updated 2 months ago
ziqipang / MR-Video
MR. Video: MapReduce is the Principle for Long Video Understanding
☆31 · Jun 18, 2026 · Updated last month
KD-TAO / DyCoke
[CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
☆113 · Nov 22, 2025 · Updated 7 months ago
Yuanshi9815 / Subjects200K
Subjects200K dataset
☆132 · Jan 17, 2025 · Updated last year
mlvlab / vid-TLDR
Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".
☆55 · Oct 21, 2025 · Updated 9 months ago
xuyang-liu16 / GlobalCom2
[AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
☆42 · Jan 27, 2026 · Updated 5 months ago
VideoVerses / VideoVAEPlus
[ICCV 2025] VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE
☆409 · Jan 19, 2025 · Updated last year
microsoft / Reducio-VAE
☆217 · Feb 11, 2025 · Updated last year
contrastive / FreeVideoLLM
☆83 · Oct 31, 2024 · Updated last year
NVIDIA / Cosmos-Tokenizer
A suite of image and video neural tokenizers
☆1,731 · Feb 11, 2025 · Updated last year
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
camenduru / GRM
Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
☆17 · Apr 3, 2024 · Updated 2 years ago
zhaoyue-zephyrus / AVION
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"
☆138 · Aug 23, 2025 · Updated 10 months ago
YuqingWang1029 / PAR
[CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project
☆186 · Mar 20, 2025 · Updated last year
Andy-Cheng / TEMPURA
TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u…
☆27 · Jun 4, 2025 · Updated last year
GuoTianYu2000 / Active-Dormant-Attention
codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"
☆11 · Dec 30, 2024 · Updated last year
Visual-AI / PruneVid
[ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models
☆72 · May 15, 2025 · Updated last year
facebookresearch / ToMe
A method to increase the speed and lower the memory footprint of existing vision transformers.
☆1,208 · Jun 17, 2024 · Updated 2 years ago
microsoft / VidTok
a family of versatile and state-of-the-art video tokenizers.
☆453 · Sep 1, 2025 · Updated 10 months ago
OpenGVLab / VideoChat-Flash
[ICLR2026] VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
☆527 · Updated this week
lcqysl / FrameThinker
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50 · Oct 9, 2025 · Updated 9 months ago
OpenGVLab / VideoMamba
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
☆1,120 · Jul 6, 2024 · Updated 2 years ago
gls0425 / LinVT
LinVT: Empower Your Image-level Large Language Model to Understand Videos
☆84 · Dec 30, 2024 · Updated last year
tulerfeng / Video-R1
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
☆879 · Dec 14, 2025 · Updated 7 months ago
pkunlp-icler / FastV
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Langua…
☆592 · Jan 4, 2025 · Updated last year
czg1225 / AsyncDiff
[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
☆215 · Sep 27, 2025 · Updated 9 months ago
desaixie / zeroverse
Official code for NeurIPS 2024 paper LRM-Zero: Training Large Reconstruction Models with Synthesized Data
☆155 · Oct 7, 2024 · Updated last year
Espere-1119-Song / Video-MMLU
A Massive Multi-Discipline Lecture Understanding Benchmark
☆34 · Apr 20, 2026 · Updated 3 months ago
apple / ml-flextok
FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
☆322 · Jun 2, 2025 · Updated last year
OpenGVLab / InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
☆2,339 · Jul 2, 2026 · Updated 2 weeks ago
whlzy / FiT
[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model
☆434 · Nov 10, 2024 · Updated last year
daixiangzi / Awesome-Token-Compress
A paper list of some recent works about Token Compress for Vit and VLM
☆939 · Updated this week
Wiselnn570 / VideoRoPE
[ICML 2025 Oral] An official implementation of VideoRoPE & VideoRoPE++
☆223 · Apr 15, 2026 · Updated 3 months ago
NVlabs / TokenBench
A Video Tokenizer Evaluation Dataset
☆157 · Jan 13, 2025 · Updated last year