An efficient GRPO training util.
☆54Jun 13, 2025Updated 8 months ago
Alternatives and similar repositories for PrefixGrouper
Users who are interested in PrefixGrouper are comparing it to the libraries listed below.
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆13Jan 16, 2026Updated last month
- Analog IC symmetry extraction benchmark of AncstrGNN☆10Aug 19, 2024Updated last year
- Source code for paper "Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration" of NeurIPS 2019☆10Jan 25, 2024Updated 2 years ago
- GRPO for training long-form QA and instructions with a long-form reward model☆17Jul 17, 2025Updated 7 months ago
- ☆10Jun 28, 2025Updated 8 months ago
- ☆26Jul 29, 2025Updated 7 months ago
- Yet another frontend for LLM, written using .NET and WinUI 3☆10Sep 14, 2025Updated 5 months ago
- Demos of neural image editing☆11Mar 15, 2021Updated 4 years ago
- A user-friendly interface built on top of Thinking Machines Tinker API that lets you fine-tune LLMs, chat with your trained model, and de…☆27Jan 31, 2026Updated last month
- Rethinking the Trust Region in LLM Reinforcement Learning☆39Feb 25, 2026Updated last week
- Using DTensor on Google Cloud☆18Sep 18, 2022Updated 3 years ago
- Official implementation for the paper "Quantum Bayesian Optimization" accepted to NeurIPS 2023.☆12Jan 7, 2024Updated 2 years ago
- ☆13Aug 26, 2024Updated last year
- EMIT: Enhancing MLLMs for Industrial Anomaly Detection via Difficulty-Aware GRPO☆20Jan 24, 2026Updated last month
- ☆15Jul 13, 2025Updated 7 months ago
- Task Planning and Tracking toolset for Pydantic AI agents, enabling hierarchical task management with subtasks, PostgreSQL storage for mu…☆42Updated this week
- 微聚 (Weiju), a professional data annotation and collection platform☆13Jun 19, 2018Updated 7 years ago
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.☆18Apr 22, 2025Updated 10 months ago
- The GitHub repository of ChatLM.☆10Mar 16, 2024Updated last year
- A react-typescript component for Plotly.JS graphs.☆15Feb 29, 2020Updated 6 years ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization.☆17Aug 30, 2024Updated last year
- Causal tracing for language models☆12Apr 2, 2024Updated last year
- An approach for Circuit Synthesis using Dataset Threshold queries.☆14May 28, 2023Updated 2 years ago
- Software XY oscilloscope written in pure Rust, intended as an audio visualizer☆17Aug 10, 2024Updated last year
- ☆13Jan 15, 2025Updated last year
- ICLR Blog Track 2025☆19Sep 21, 2025Updated 5 months ago
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025)☆21Oct 16, 2025Updated 4 months ago
- [COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models☆18Jan 18, 2025Updated last year
- Med-DANet Series (ECCV 2022 & WACV 2024)☆13Jan 2, 2024Updated 2 years ago
- Multimodal preprocessing on IEMOCAP dataset☆13Jun 8, 2018Updated 7 years ago
- Yaksa: High-performance Noncontiguous Data Management☆15Oct 1, 2025Updated 5 months ago
- ☆15Feb 23, 2026Updated last week
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k☆14Oct 11, 2025Updated 4 months ago
- Companion code to https://arxiv.org/abs/2409.03797v2☆19Sep 18, 2025Updated 5 months ago
- Steering LLM Thinking with Budget Guidance☆27Feb 19, 2026Updated 2 weeks ago
- The official implementation of the paper “Anchored Supervised Fine-Tuning”☆30Feb 12, 2026Updated 2 weeks ago
- Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.☆21Jul 18, 2025Updated 7 months ago
- Multi-agent coordination for Pi - presence, messaging, file reservations☆53Updated this week
- ☆17Apr 23, 2025Updated 10 months ago