cnsdqd-dyb / Guide-GRPO
Aims for memory-efficient training on consumer GPUs with 24GB of VRAM. Optimizes language models through guidance tokens in reasoning chains; based on DeepSeekRL-Extended.
☆30 Updated 2 months ago
Alternatives and similar repositories for Guide-GRPO
Users who are interested in Guide-GRPO are comparing it to the libraries listed below.
- GPT as a Monte Carlo Language Tree: A Probabilistic Perspective ☆44 Updated 4 months ago
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆71 Updated 8 months ago
- ☆17 Updated 6 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆86 Updated last month
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆63 Updated 2 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆35 Updated 3 months ago
- MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension ☆44 Updated 5 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆43 Updated 2 months ago
- ☆97 Updated last month
- MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale ☆43 Updated 5 months ago
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆43 Updated 3 months ago
- ICLR2024 statistics ☆47 Updated last year
- Assessing Context-Aware Creative Intelligence in MLLMs ☆17 Updated last month
- [ICLR'25] PiCO: Peer Review in LLMs based on the Consistency Optimization, https://arxiv.org/pdf/2402.01830 ☆36 Updated 3 months ago
- [ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding ☆78 Updated last month
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆67 Updated 3 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆34 Updated last month
- [NIPS2023] Implementation of Foundation Model is Efficient Multimodal Multitask Model Selector ☆36 Updated last year
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆46 Updated 5 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆43 Updated 10 months ago
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling" ☆137 Updated 6 months ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆54 Updated last week
- Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning ☆15 Updated 6 months ago
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆82 Updated 6 months ago
- ☆63 Updated last week
- Recent Advances on MLLM's Reasoning Ability ☆25 Updated last month
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆43 Updated 2 weeks ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di… ☆51 Updated 6 months ago
- ☆75 Updated 4 months ago
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration ☆34 Updated 4 months ago