hmwang2002 / InternSVG
Official repository of InternSVG.
☆87 Updated 2 months ago
Alternatives and similar repositories for InternSVG
Users who are interested in InternSVG are comparing it to the repositories listed below.
- Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding" ☆77 Updated 10 months ago
- ☆33 Updated 4 months ago
- [ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding ☆95 Updated 9 months ago
- [AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation? ☆26 Updated 3 weeks ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 11 months ago
- ☆17 Updated last year
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆52 Updated 3 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆174 Updated 7 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆109 Updated 7 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆40 Updated last month
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆126 Updated 5 months ago
- ☆62 Updated 4 months ago
- SFT+RL boosts multimodal reasoning ☆41 Updated 6 months ago
- ☆162 Updated last month
- ☆23 Updated 7 months ago
- ☆39 Updated 7 months ago
- Official repository of MMDU dataset ☆102 Updated last year
- A Self-Training Framework for Vision-Language Reasoning ☆88 Updated 11 months ago
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources ☆212 Updated 3 months ago
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision ☆27 Updated 7 months ago
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆94 Updated 7 months ago
- [ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching i… ☆44 Updated 6 months ago
- Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step ☆158 Updated 5 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25] ☆36 Updated 4 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆57 Updated last year
- [CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents ☆54 Updated 7 months ago
- [NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent ☆36 Updated last month
- ☆90 Updated last year
- [ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation ☆74 Updated last month
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo… ☆85 Updated 11 months ago