MKJia / MGVQ
[Arxiv'25] MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization
☆43 Updated last week
Alternatives and similar repositories for MGVQ
Users who are interested in MGVQ are comparing it to the repositories listed below
- [ICLR 2025] Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction ☆39 Updated last month
- ☆95 Updated 3 months ago
- [CVPR 2024] BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation ☆46 Updated last year
- UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding ☆50 Updated last month
- ☆34 Updated last year
- open-sourced video dataset with dynamic scenes and camera movements annotation ☆75 Updated 4 months ago
- [ICLR 2025] MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow ☆23 Updated 5 months ago
- ☆25 Updated 5 months ago
- [SIGGRAPH Asia 2025] WorldExplorer: Towards Generating Fully Navigable 3D Scenes ☆70 Updated this week
- UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation ☆127 Updated 3 months ago
- [EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation ☆21 Updated 3 months ago
- Project Page for GaussianFormer ☆24 Updated last year
- Pytorch implementation of GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting ☆94 Updated 5 months ago
- Seeing World Dynamics in a Nutshell ☆109 Updated 6 months ago
- StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams ☆63 Updated 3 months ago
- WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes ☆99 Updated 6 months ago
- [ICCV 2025] This is the official implementation of POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction ☆93 Updated last month
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction ☆263 Updated 2 weeks ago
- [ICCV 2025] Official implementation of "What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?" ☆12 Updated last month
- [ARXIV’25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control ☆78 Updated 2 months ago
- [NeurIPS2024] DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion ☆36 Updated 11 months ago
- Scaling Properties of Diffusion Models For Perceptual Tasks (CVPR 2025) ☆42 Updated 4 months ago
- Code implementation for: From Virtual Games to Real-World Play ☆38 Updated 2 months ago
- [ICCV 2025] Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis ☆86 Updated last month
- Official implementation of paper "Pyramid Diffusion for Fine 3D Large Scene Generation" (ECCV 2024 Oral) ☆126 Updated 5 months ago
- Streaming 3D Reconstruction with Explicit Spatial Pointer Memory ☆144 Updated 2 months ago
- This is the project page of ShowRoom3D ☆25 Updated last year
- Self-reimplemented version of 4D-LRM. ☆53 Updated 3 months ago
- Code for Faster VGGT with Block-Sparse Global Attention ☆72 Updated this week
- [ICCV 2025] InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models ☆67 Updated 2 months ago