AuroraZengfh / RobustMerge
[NeurIPS'25 Spotlight🔥] Official Implementation of RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
☆57 Updated last month
Alternatives and similar repositories for RobustMerge
Users that are interested in RobustMerge are comparing it to the libraries listed below
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆129 Updated 6 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆182 Updated 8 months ago
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆88 Updated 4 months ago
- ☆110 Updated last year
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆104 Updated 4 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆53 Updated 4 months ago
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO ☆78 Updated 3 months ago
- Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language" ☆125 Updated last week
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆109 Updated 8 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆84 Updated 3 months ago
- ☆68 Updated 4 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆69 Updated last year
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆179 Updated 4 months ago
- Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning ☆28 Updated last year
- ☆24 Updated 8 months ago
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models ☆149 Updated 4 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆73 Updated last year
- A Self-Training Framework for Vision-Language Reasoning ☆88 Updated last year
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆37 Updated last year
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆94 Updated last year
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆95 Updated 2 months ago
- The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learn… ☆40 Updated last month
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture ☆213 Updated last year
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25] ☆38 Updated last week
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆90 Updated last year
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆147 Updated 6 months ago
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆40 Updated 8 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆85 Updated last year
- Reinforcement Learning of Vision Language Models with Self Visual Perception Reward ☆160 Updated 4 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆52 Updated last year