NVlabs / TokenBench
A Video Tokenizer Evaluation Dataset
☆48Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for TokenBench
- Official PyTorch implementation of paper "T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stitching"☆96Updated 8 months ago
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth?☆78Updated 2 weeks ago
- ☆31Updated 3 weeks ago
- ElasticTok: Adaptive Tokenization for Image and Video☆33Updated 2 weeks ago
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.☆30Updated 4 months ago
- Codes accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment"☆20Updated this week
- The official implementation of Diffusion-KTO: Aligning Diffusion Models by Optimizing Human Utility☆27Updated 3 weeks ago
- Official implementation of "Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization"☆75Updated 7 months ago
- [arXiv:2406.07548] Image and Video Tokenization with Binary Spherical Quantization☆84Updated 5 months ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".☆23Updated 10 months ago
- 🔥 Aurora Series: A more efficient multimodal large language model series for video.☆47Updated last week
- [NeurIPS 2024] Efficient Multi-modal Models via Stage-wise Visual Context Compression☆41Updated 3 months ago
- Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image …☆55Updated last month
- Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding☆23Updated 2 weeks ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression"☆76Updated last month
- ☆17Updated 5 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation"☆31Updated last week
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆52Updated last year
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆32Updated 5 months ago
- ☆43Updated 7 months ago
- This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?"☆121Updated 5 months ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation☆78Updated 7 months ago
- ☆60Updated last year
- Official implementation of Aurora☆81Updated last year
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".☆42Updated 3 weeks ago
- [CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation"☆60Updated 6 months ago
- An in-context conditioning version of MUSE with pre-trained checkpoints.☆111Updated last year
- Minimal multi-gpu implementation of EDM2: "Analyzing and Improving the Training Dynamics of Diffusion Models"☆26Updated 8 months ago
- TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation☆26Updated 2 weeks ago
- PyTorch implementation of "UNIT: Unifying Image and Text Recognition in One Vision Encoder", NeurIPS 2024.☆20Updated last month