RenShuhuai-Andy / NBP
Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
☆26 Updated last month
Alternatives and similar repositories for NBP:
Users who are interested in NBP are comparing it to the libraries listed below:
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆61 Updated 3 weeks ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆34 Updated last month
- The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆30 Updated last week
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆50 Updated last month
- Codes accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment" ☆26 Updated last month
- [ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding ☆29 Updated 3 weeks ago
- Codebase for the paper "Elucidating the design space of language models for image generation" ☆45 Updated 4 months ago
- A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning. ☆17 Updated last week
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions" ☆51 Updated 5 months ago
- [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark ☆86 Updated 2 months ago
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆56 Updated last month
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆64 Updated 4 months ago
- Autoregressive Image Generation with Randomized Parallel Decoding ☆25 Updated last week
- FQGAN: Factorized Visual Tokenization and Generation ☆45 Updated 2 months ago
- This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality" ☆46 Updated this week
- Code for paper "Principal Components" Enable A New Language of Images ☆23 Updated this week
- TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation ☆49 Updated last week
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025 ☆40 Updated 2 weeks ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆86 Updated 5 months ago
- Official implementation of LaVin-DiT ☆26 Updated 2 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆30 Updated this week
- Official Repository of Personalized Visual Instruct Tuning ☆28 Updated 3 weeks ago
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model ☆27 Updated 4 months ago
- This is the official repo for ByteVideoLLM/Dynamic-VLM ☆20 Updated 3 months ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆35 Updated 9 months ago
- [Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models ☆28 Updated 2 weeks ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆54 Updated last week