(NeurIPS 2025) Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation
☆64 Oct 14, 2025 Updated 4 months ago
Alternatives and similar repositories for VFMTok
Users who are interested in VFMTok are comparing it to the repositories listed below
- Exploring Representation-Aligned Latent Space for Better Generation ☆17 Feb 4, 2025 Updated last year
- ☆22 Mar 7, 2025 Updated 11 months ago
- An innovative method designed to augment the capabilities of existing video diffusion models ☆22 May 10, 2024 Updated last year
- An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control ☆31 Jan 13, 2026 Updated last month
- DNN-assisted Kalman filter for time-domain speech enhancement ☆21 Nov 18, 2020 Updated 5 years ago
- The official repository of "Spectral Motion Alignment for Video Motion Transfer using Diffusion Models". ☆31 Dec 13, 2024 Updated last year
- Official repository for "Unveiling Opinion Evolution via Prompting and Diffusion for Short Video Fake News Detection", ACL Findings 2024. ☆14 Apr 25, 2025 Updated 10 months ago
- Continual Resilient (CoRe) Optimizer for PyTorch ☆11 Jun 10, 2024 Updated last year
- Transferring Genshin PVs into a freehand style with a diffusion model. ☆10 Jun 5, 2024 Updated last year
- China University of Mining and Technology undergraduate thesis Word template, 2023 edition ☆12 Mar 29, 2023 Updated 2 years ago
- SimX-OR: Extending Any Simulation Benchmark to Evaluate the Observational Robustness of VLA Models ☆31 Nov 4, 2025 Updated 3 months ago
- [ICCV 2025] Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction ☆52 Sep 22, 2025 Updated 5 months ago
- ☆15 Sep 16, 2024 Updated last year
- Documentation at ☆14 Mar 27, 2025 Updated 11 months ago
- [IJCAI 2025] Official implementation of the paper "Multi-View Learning with Context-Guided Receptance for Image Denoising". ☆12 Jun 26, 2025 Updated 8 months ago
- ☆12 Jun 17, 2019 Updated 6 years ago
- ☆13 Aug 28, 2024 Updated last year
- [ICLR2024] The official implementation of the paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxi… ☆250 May 5, 2024 Updated last year
- Towards Scalable Pre-training of Visual Tokenizers for Generation ☆445 Dec 16, 2025 Updated 2 months ago
- ☆10 Sep 27, 2019 Updated 6 years ago
- Sklearn-ranking provides ranking algorithms for recommendation systems. RANKSVM, RANKBOOST, and RANKNET are included in this package ☆14 May 20, 2020 Updated 5 years ago
- ☆13 Apr 10, 2025 Updated 10 months ago
- DeepEarth: AI Foundation Model for Planetary Science & Sustainability ☆26 Updated this week
- Code release for "MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning" ☆11 Oct 11, 2024 Updated last year
- Joint magnitude estimation and phase recovery using Cycle-in-Cycle GAN for non-parallel speech enhancement ☆10 Jan 24, 2022 Updated 4 years ago
- 2D Gaussian Splatting implemented in Triton ☆11 Mar 19, 2024 Updated last year
- ☆11 Dec 15, 2025 Updated 2 months ago
- A simple Cminus language compiler for the x86 architecture ☆10 Apr 1, 2022 Updated 3 years ago
- ☆11 Sep 16, 2025 Updated 5 months ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis ☆23 Feb 11, 2026 Updated 2 weeks ago
- ☆16 Oct 13, 2025 Updated 4 months ago
- ☆10 Feb 20, 2023 Updated 3 years ago
- Calculate the Bhattacharyya distance, based on the zero-crossing rate feature, between different Gaussian models for speech emotion recognition. corpus… ☆11 Oct 17, 2018 Updated 7 years ago
- Deep-Blind-Super-Resolution-for-Hyperspectral-Images ☆11 Sep 9, 2024 Updated last year
- Complete implementation based on omlsa.m ☆14 Nov 26, 2021 Updated 4 years ago
- [ICLR2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates ☆40 Updated this week
- ☆13 Sep 7, 2023 Updated 2 years ago
- Semantic tokenizer for speech and music ☆21 Jul 6, 2025 Updated 7 months ago
- ☆14 Apr 18, 2023 Updated 2 years ago