WeijieMax / EyeReal
Official Code of EyeReal
☆77 · Updated this week
Alternatives and similar repositories for EyeReal
Users who are interested in EyeReal are comparing it to the libraries listed below.
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories ☆81 · Updated 4 months ago
- This is a repository for organizing papers, code, and other resources related to Visual Reinforcement Learning. ☆347 · Updated last week
- A Vue-based project page template for academic papers. (in development) https://junyaohu.github.io/academic-project-page-template-vue ☆300 · Updated 4 months ago
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding ☆466 · Updated 3 weeks ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆186 · Updated last month
- A collection of vision foundation models unifying understanding and generation. ☆59 · Updated 11 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆200 · Updated 4 months ago
- Official repository for ReasonGen-R1 ☆73 · Updated 5 months ago
- [NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration ☆96 · Updated this week
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ☆104 · Updated last month
- [CVPRW 2025] UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inpu… ☆97 · Updated 7 months ago
- Cambrian-S: Towards Spatial Supersensing in Video ☆407 · Updated 3 weeks ago
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training. ☆71 · Updated 2 weeks ago
- Implements VAR+CLIP for text-to-image (T2I) generation ☆146 · Updated 10 months ago
- Collection of Highlight papers ☆42 · Updated last year
- PyTorch implementation of the paper "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆421 · Updated 5 months ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration ☆24 · Updated last year
- ☆112 · Updated this week
- [ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆178 · Updated 6 months ago
- Official PyTorch implementation of FlowMo. ☆103 · Updated 8 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆234 · Updated 3 months ago
- Pixel-Level Reasoning Model trained with RL [NeurIPS25] ☆254 · Updated last month
- A paper list for spatial reasoning ☆471 · Updated last week
- Official Repo of From Masks to Worlds: A Hitchhiker's Guide to World Models. ☆57 · Updated last month
- Official code for NeurIPS 2025 paper "GRIT: Teaching MLLMs to Think with Images" ☆164 · Updated this week
- [NeurIPS 2025 Oral] Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think ☆198 · Updated 2 months ago
- A list of works on video generation towards world model ☆225 · Updated last week
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision ☆177 · Updated 2 weeks ago
- ☆163 · Updated 5 months ago
- Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing ☆82 · Updated 4 months ago