AnonymousDUTAI / SREKCARC-IA-TUD
☆20Updated last year
Alternatives and similar repositories for SREKCARC-IA-TUD
Users that are interested in SREKCARC-IA-TUD are comparing it to the libraries listed below
- A framework for a unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating…☆121Updated last month
- ☆59Updated 2 months ago
- Official implementation of MC-LLaVA.☆140Updated last month
- A collection of vision foundation models unifying understanding and generation.☆57Updated 8 months ago
- A vue-based project page template for academic papers. (in development) https://junyaohu.github.io/academic-project-page-template-vue☆290Updated 2 months ago
- A tiny paper rating web☆39Updated 6 months ago
- Fundamentals of Digital Media Technology (04713901) | Peking University ECE Course Materials☆22Updated 3 years ago
- BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models☆30Updated 2 weeks ago
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆297Updated this week
- [CVPR2025] BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding☆31Updated 5 months ago
- ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies☆18Updated 3 months ago
- A paper list for spatial reasoning☆139Updated 3 months ago
- Survey: https://arxiv.org/pdf/2507.20198☆145Updated 2 weeks ago
- The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs☆106Updated 2 months ago
- Official repository for the UAE paper, unified-GRPO, and unified-Bench☆116Updated 2 weeks ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆217Updated last month
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation☆46Updated last month
- Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing☆69Updated 2 months ago
- [NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration☆82Updated 3 months ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?☆74Updated 2 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation☆148Updated last month
- The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"☆40Updated last month
- An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"☆34Updated this week
- 📖 This is a repository for organizing papers, codes, and other resources related to personalized video generation and editing.☆53Updated last week
- ☆50Updated last month
- [NeurIPS 2025 NextVid Workshop Oral✨] Official Implementation of VideoGen-of-Thought: Step-by-step generating multi-shot video with minim…☆39Updated this week
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI☆167Updated 2 weeks ago
- Collection of Highlight papers☆41Updated last year
- [ICML2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation☆117Updated 11 months ago
- [NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing☆96Updated last week