WeijieMax / EyeReal
Official Code of EyeReal
☆87 Updated 3 weeks ago
Alternatives and similar repositories for EyeReal
Users who are interested in EyeReal are comparing it to the libraries listed below.
- A paper list for spatial reasoning ☆550 Updated this week
- This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning. ☆370 Updated this week
- A vue-based project page template for academic papers. (in development) https://junyaohu.github.io/academic-project-page-template-vue ☆307 Updated 5 months ago
- [NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration ☆104 Updated 3 weeks ago
- Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure? ☆128 Updated last week
- Towards Scalable Pre-training of Visual Tokenizers for Generation ☆357 Updated last week
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆196 Updated 7 months ago
- A list of works on video generation towards world model ☆281 Updated last week
- Cambrian-S: Towards Spatial Supersensing in Video ☆442 Updated this week
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories ☆88 Updated 4 months ago
- Collection of Highlight papers ☆42 Updated last year
- Multi-SpatialMLLM Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆164 Updated 2 months ago
- A collection of vision foundation models unifying understanding and generation. ☆59 Updated 11 months ago
- Official PyTorch implementation of FlowMo. ☆105 Updated 8 months ago
- Simulating the Real World: Survey & Resources, which contains our survey "Simulating the Real World: A Unified Survey of Multimodal Gener… ☆316 Updated last week
- Official repository for ReasonGen-R1 ☆73 Updated 6 months ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆196 Updated 2 months ago
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning ☆100 Updated 5 months ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ☆110 Updated last month
- [CVPR 2025] EgoLife: Towards Egocentric Life Assistant ☆368 Updated 9 months ago
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO ☆76 Updated last month
- [ICLR'25] Reconstructive Visual Instruction Tuning ☆133 Updated 8 months ago
- [CVPR'2022, TPAMI'2024] LAVT: Language-Aware Vision Transformer for Referring Segmentation ☆24 Updated 11 months ago
- ☆167 Updated 6 months ago
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding ☆491 Updated last month
- Official code for MotionBench (CVPR 2025) ☆61 Updated 9 months ago
- PyTorch implementation of NEPA ☆196 Updated this week
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆235 Updated 4 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆202 Updated 5 months ago
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training. ☆73 Updated this week