sanbuphy / computer-vision-reference
A collection of the world's best computer vision labs and lecture materials.
☆14 Updated 6 months ago
Alternatives and similar repositories for computer-vision-reference
Users who are interested in computer-vision-reference are comparing it to the repositories listed below
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆78 Updated 3 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆138 Updated 4 months ago
- ☆20 Updated 3 months ago
- Paper list, tutorial, and nano code snippets for Diffusion Large Language Models. ☆108 Updated 2 months ago
- A Collection of Papers on Diffusion Language Models ☆113 Updated last week
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆105 Updated 3 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆55 Updated 8 months ago
- ☆26 Updated 2 weeks ago
- [Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics]: VisuoThink: Empowering LVLM Reasoning with Mul… ☆29 Updated last month
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision ☆25 Updated 3 months ago
- Recent Advances on MLLM's Reasoning Ability ☆25 Updated 4 months ago
- One-shot Entropy Minimization ☆180 Updated 2 months ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆85 Updated 2 weeks ago
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆72 Updated 2 months ago
- ☆214 Updated 2 weeks ago
- Official implementation of "Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology" ☆60 Updated last month
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" ☆142 Updated 2 months ago
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation ☆23 Updated last month
- A collection of papers on discrete diffusion models ☆158 Updated 2 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆143 Updated 3 weeks ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆84 Updated 6 months ago
- ☆52 Updated 2 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning", https://arxiv.org/abs/2505.13934 ☆79 Updated 2 months ago
- ☆53 Updated 3 weeks ago
- ☆67 Updated 3 weeks ago
- MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models ☆41 Updated 4 months ago
- ☆57 Updated 3 months ago
- Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding" ☆70 Updated 6 months ago
- ☆87 Updated last month
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository ☆63 Updated this week