sanbuphy / computer-vision-reference
A collection of the world's best computer vision labs and lecture materials.
☆14 Updated 5 months ago
Alternatives and similar repositories for computer-vision-reference
Users who are interested in computer-vision-reference are comparing it to the repositories listed below.
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆77 Updated 2 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆138 Updated 4 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆54 Updated 7 months ago
- [Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics]: VisuoThink: Empowering LVLM Reasoning with Mul… ☆27 Updated 2 weeks ago
- Survey: https://arxiv.org/pdf/2507.20198 ☆69 Updated this week
- Paper list, tutorial, and nano code snippets for Diffusion Large Language Models. ☆96 Updated last month
- ☆50 Updated last month
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆105 Updated 2 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆132 Updated this week
- One-shot Entropy Minimization ☆175 Updated last month
- A Self-Training Framework for Vision-Language Reasoning ☆80 Updated 6 months ago
- [arXiv2505] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains ☆34 Updated last week
- ☆26 Updated last month
- ☆194 Updated this week
- ☆62 Updated last week
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" ☆134 Updated 2 months ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆83 Updated 2 months ago
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆68 Updated 2 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠 ☆86 Updated 2 months ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ☆98 Updated 8 months ago
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"? ☆35 Updated last month
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆188 Updated 4 months ago
- A Collection of Papers on Diffusion Language Models ☆98 Updated this week
- Repo for paper https://arxiv.org/abs/2504.13837 ☆180 Updated last month
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559) ☆17 Updated last month
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision ☆24 Updated 2 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆81 Updated 5 months ago
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning ☆308 Updated 2 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 6 months ago
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" ☆20 Updated 5 months ago