NVlabs / TokenBench
A Video Tokenizer Evaluation Dataset
☆101 · Updated last month
Alternatives and similar repositories for TokenBench:
Users who are interested in TokenBench are comparing it to the repositories listed below.
- ElasticTok: Adaptive Tokenization for Image and Video ☆54 · Updated 3 months ago
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth? ☆103 · Updated last week
- ☆112 · Updated last month
- 🦾 EvalGIM (pronounced as "EvalGym") is an evaluation library for generative image models. It enables easy-to-use, reproducible automatic… ☆66 · Updated 2 months ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆84 · Updated 4 months ago
- Official PyTorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆50 · Updated last week
- ☆139 · Updated 2 months ago
- This is a PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Framework for Cross-Modality Evolu… ☆131 · Updated 2 weeks ago
- [ICLR 2025] Official PyTorch implementation of paper "T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stit… ☆99 · Updated 11 months ago
- An in-context conditioning version of MUSE with pre-trained checkpoints. ☆111 · Updated last year
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆78 · Updated 2 weeks ago
- ☆116 · Updated last year
- The official implementation of PAR: Parallelized Autoregressive Visual Generation. https://epiphqny.github.io/PAR-project/ ☆110 · Updated last month
- Code accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment" ☆25 · Updated last week
- ☆23 · Updated 2 weeks ago
- Benchmarking physical understanding in generative video models ☆116 · Updated this week
- [ICLR 2025][arXiv:2406.07548] Image and Video Tokenization with Binary Spherical Quantization ☆127 · Updated 8 months ago
- ☆189 · Updated last week
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆229 · Updated 3 weeks ago
- [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark ☆75 · Updated 3 weeks ago
- ☆71 · Updated 4 months ago
- [CVPR 2024] On the Content Bias in Fréchet Video Distance ☆103 · Updated 4 months ago
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆47 · Updated 7 months ago
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training ☆174 · Updated 3 weeks ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation ☆79 · Updated 10 months ago
- ☆62 · Updated 6 months ago
- [ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation ☆243 · Updated last week