ArtmeScienceLab / FonTS
[ICCV 2025] FonTS: Text Rendering with Typography and Style Controls
☆32 · Updated last week
Alternatives and similar repositories for FonTS
Users who are interested in FonTS are comparing it to the repositories listed below.
- This is a repository for organizing papers, codes, and other resources related to unified multimodal models. ☆323 · Updated 2 weeks ago
- Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing' ☆57 · Updated 4 months ago
- A framework for unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating… ☆123 · Updated 3 weeks ago
- ☆39 · Updated 7 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆227 · Updated 2 months ago
- Official implementation of MC-LLaVA. ☆140 · Updated 2 months ago
- Collections of Papers and Projects for Multimodal Reasoning. ☆105 · Updated 6 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆184 · Updated 3 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆159 · Updated last month
- Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025) ☆57 · Updated 6 months ago
- Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unifie… ☆293 · Updated 2 weeks ago
- Latest Papers, Codes and Datasets on Video-LMM Post-Training ☆142 · Updated this week
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning". ☆66 · Updated 3 months ago
- UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation ☆107 · Updated last week
- An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought" ☆95 · Updated last month
- [CVPR 2025] Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆395 · Updated 2 months ago
- [ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos ☆88 · Updated last month
- [NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing ☆113 · Updated last week
- Doodling our way to AGI ☆109 · Updated 5 months ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆187 · Updated 2 weeks ago
- Survey: https://arxiv.org/pdf/2507.20198 ☆186 · Updated last week
- A tiny paper rating web ☆39 · Updated 7 months ago
- Official code for DeepSound-V1 ☆12 · Updated 5 months ago
- Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think ☆578 · Updated last week
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning ☆105 · Updated 5 months ago
- A Collection of Papers on Diffusion Language Models ☆134 · Updated last month
- ☆22 · Updated 2 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought ☆90 · Updated last week
- [ICML 2025] This is the official PyTorch implementation of "HarmoniCa: Harmonizing Training and Inference for Better Feature Caching i… ☆43 · Updated 3 months ago
- CVPR 2025 Multimodal Large Language Models Paper List ☆156 · Updated 7 months ago