tongruiliu / Guided-GRPO
A Guided Reinforcement Learning framework enhancing MLLM reasoning via process-level verification and collaborative rollout strategies.
☆41Updated this week
Alternatives and similar repositories for Guided-GRPO
Users who are interested in Guided-GRPO are comparing it to the repositories listed below
- Flow-Modulated Scoring for Semantic-Aware Knowledge Graph Completion.☆18Updated this week
- 🤖Auto Tutor: send batches of outreach emails to prospective advisors with one click☆88Jan 18, 2026Updated 3 weeks ago
- ☆34Dec 1, 2025Updated 2 months ago
- 【ICME2025 Oral】Official PyTorch Code for "Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition"☆11Mar 21, 2025Updated 10 months ago
- ☆31May 14, 2025Updated 9 months ago
- CB513 datasets for Protein Secondary Structure Prediction☆12Apr 3, 2025Updated 10 months ago
- 【ICME2025 Oral】 Official PyTorch Code for "Learning Dual-Domain Multi-Scale Representations for Single Image Deraining"☆16Mar 21, 2025Updated 10 months ago
- Reproduction of Soft-Masked BERT, from the paper "Spelling Error Correction with Soft-Masked BERT"☆12Oct 14, 2020Updated 5 years ago
- Reproduction of the paper: TRANSFORMER-BASED MULTIMODAL FUSION FOR EARLY DIAGNOSIS OF ALZHEIMER’S DISEASE USING STRUCTURAL MRI AND PET☆12Jan 3, 2024Updated 2 years ago
- ArXiv daily dump and viewer using GitHub Actions - luvata.github.io/arxive☆14Updated this week
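- Static pages for a news website with a clean, fresh style. HTML, CSS, JS, jQuery, Ajax, Slider☆15Nov 22, 2022Updated 3 years ago
- Automated login to the Dalian University of Technology unified authentication system and webvpn system☆16Sep 17, 2022Updated 3 years ago
- Reproduction of the paper "Distilling Task-Specific Knowledge from BERT into Simple Neural Networks"☆16Jun 13, 2021Updated 4 years ago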
- Multi-label subcellular localization and sorting signal prediction based on protein foundation models☆22Jan 12, 2026Updated last month
- A tool for DLUT (Dalian University of Technology) students to automatically reserve library seats.☆15Nov 16, 2023Updated 2 years ago
- Echos is a headless, API-driven DAW engine. It’s the backend for building AI tools that automate the entire music production lifecycle.☆55Nov 10, 2025Updated 3 months ago
- [CVPR 2025] Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts☆22Jun 22, 2025Updated 7 months ago
- Whenever the ultrasonic sensor detects an obstacle, the LED, buzzer, and camera turn on, and with TensorFlow object detection, the detected objec…☆26Apr 24, 2020Updated 5 years ago
- Integrating Large Weather Models with Data Assimilation☆22Jun 2, 2024Updated last year
- We used a web scraper to obtain all the papers from ECCV that have not yet been officially announced, making them available for those who…☆24Sep 2, 2024Updated last year
- FedBCGD☆53Nov 27, 2025Updated 2 months ago
- ☆65Nov 27, 2025Updated 2 months ago
- ☆26May 18, 2024Updated last year
- Multi-Attentional Deepfake Detection☆23Nov 15, 2024Updated last year
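- Practical training project: a news website built with SSM + Maven + Bootstrap (including the front end and the admin back end), pure HTML, with front end and back end separated☆21Jul 12, 2019Updated 6 years ago
- Notes on "Mathematical Foundations of Reinforcement Learning": reflections and additions from personal study☆77Nov 19, 2025Updated 2 months ago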
- A case study on Pfam dataset to classify protein families.☆32Oct 10, 2019Updated 6 years ago
- Following along with 小土堆's videos; video link: https://www.bilibili.com/video/BV11P411j7bn?share_source=copy_web☆37Aug 24, 2022Updated 3 years ago
- [CVPR2025] ProxyTransformation : Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding☆48Sep 2, 2025Updated 5 months ago
- [NeurIPS'24 spotlight] MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning. [TPAMI'25] MECD+☆45Updated this week
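- Information on some advisors and training units for the computer science and software engineering professional programs at the USTC Institute of Advanced Technology☆46Mar 1, 2025Updated 11 months ago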
- ☆46Mar 22, 2024Updated last year
- VAEs and nonlinear ICA: a unifying framework☆40Jun 16, 2020Updated 5 years ago
- 【ACM MM 2025】PyTorch code for our paper "Cross Paradigm Representation and Alignment Transformer for Image Deraining"☆69Dec 19, 2025Updated last month
- List of all the project tutorials from Web Cifar YouTube Channel ✨☆52Nov 29, 2022Updated 3 years ago
- Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement (AAAI2020)☆46Dec 8, 2022Updated 3 years ago
- An out-of-the-box front-end template for database course projects (Vue version)☆44Dec 13, 2022Updated 3 years ago
- Code for "SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism"☆62May 7, 2021Updated 4 years ago
- Automatic seat reservation tool for the Dalian University of Technology library☆53Mar 25, 2022Updated 3 years ago