ali-vilab / TTS-VAR
Test-time Scaling for VAR models
☆31 Sep 19, 2025 Updated 4 months ago
Alternatives and similar repositories for TTS-VAR
Users who are interested in TTS-VAR are comparing it to the libraries listed below.
- ☆33 Jul 15, 2025 Updated 6 months ago
- Image Tokenizer Needs Post-Training ☆24 Oct 4, 2025 Updated 4 months ago
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates ☆22 Jul 1, 2025 Updated 7 months ago
- MegaRAG: Multimodal Graph-based RAG ☆32 Sep 16, 2025 Updated 4 months ago
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆29 Sep 19, 2025 Updated 4 months ago
- UniEval: Unified Holistic Evaluation for Unified Multimodal Understanding and Generation ☆22 May 16, 2025 Updated 8 months ago
- ☆63 Jul 11, 2025 Updated 7 months ago
- ☆17 Dec 8, 2024 Updated last year
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 Oct 9, 2025 Updated 4 months ago
- ☆25 Jun 18, 2025 Updated 7 months ago
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality ☆19 Mar 4, 2025 Updated 11 months ago
- Hands-On Image Processing with Python, Second Edition, Published by Packt ☆26 Updated this week
- ☆28 Jul 8, 2025 Updated 7 months ago
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking" ☆19 Jan 25, 2025 Updated last year
- ☆15 Feb 21, 2024 Updated last year
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] ☆21 Feb 27, 2025 Updated 11 months ago
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025 ☆13 Jun 25, 2024 Updated last year
- ☆27 Feb 7, 2025 Updated last year
- EraseAnything, ICML 2025 ☆38 Sep 28, 2025 Updated 4 months ago
- ☆39 Jul 23, 2025 Updated 6 months ago
- ☆17 Jul 30, 2024 Updated last year
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution ☆66 Dec 8, 2025 Updated 2 months ago
- ☆31 Sep 12, 2025 Updated 5 months ago
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning ☆46 Jul 17, 2025 Updated 6 months ago
- Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments ☆48 Jan 8, 2026 Updated last month
- The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight ☆78 Jan 16, 2026 Updated 3 weeks ago
- Resa: Transparent Reasoning Models via SAEs ☆47 Sep 23, 2025 Updated 4 months ago
- [IJCV 2026] HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts ☆26 Feb 28, 2025 Updated 11 months ago
- ☆60 Jan 12, 2026 Updated last month
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies ☆59 Feb 6, 2026 Updated last week
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆13 Jun 28, 2025 Updated 7 months ago
- ☆28 May 24, 2025 Updated 8 months ago
- The official UniVerse-1 code. ☆119 Oct 13, 2025 Updated 4 months ago
- This repo contains the python code as well as the webpage html files for the Spice-E project from VAILab at TAU. ☆26 Dec 9, 2024 Updated last year
- [ICLR 2025] HQ-Edit: A High-Quality and High-Coverage Dataset for General Image Editing ☆113 Apr 18, 2024 Updated last year
- ☆54 Jul 7, 2025 Updated 7 months ago
- A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable ca… ☆52 Jul 24, 2025 Updated 6 months ago
- [ICLR'25 Oral] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models ☆35 Nov 3, 2024 Updated last year
- ☆39 May 20, 2025 Updated 8 months ago