fangyuan-ksgk/Tiny-GRPO

fangyuan-ksgk / Tiny-GRPO

minimal GRPO implementation from scratch

☆104

Alternatives and similar repositories for Tiny-GRPO

Users who are interested in Tiny-GRPO are comparing it to the libraries listed below.

joey00072 / nanoGRPO
nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)
☆143 · May 8, 2025 · Updated last year
AlphaLab-USTC / Must-Read-LLM-Papers
☆19 · Sep 16, 2025 · Updated 10 months ago
lemon-prog123 / LongRePS
Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision
☆19 · Apr 1, 2025 · Updated last year
haileyschoelkopf / triton-index
See https://github.com/cuda-mode/triton-index/ instead!
☆11 · May 8, 2024 · Updated 2 years ago
gabrielcassimiro17 / async-langchain
Demonstration of how to run multiple chains in LangChain asynchronously
☆12 · Jul 6, 2023 · Updated 3 years ago
yixiaoer / tpu-training-example
☆16 · Jul 8, 2024 · Updated 2 years ago
reka-ai / research-eval
A benchmark to evaluate search-augmented LLMs
☆17 · Aug 28, 2025 · Updated 10 months ago
fangyuan-ksgk / Mini-LLaVA
A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.
☆99 · Dec 17, 2024 · Updated last year
lemonade-sdk / peel
Get aid from local LLMs right in your PowerShell
☆16 · May 2, 2025 · Updated last year
mkurman / grpo-llm-evaluator
Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati…
☆54 · May 7, 2025 · Updated last year
kirodaki / btc-python-utils
Bitcoin utilities and protocol library for interacting with the network
☆15 · Oct 27, 2025 · Updated 8 months ago
fangyuan-ksgk / repo-viewer
Visualize any repo or codebase as a diagram or animation
☆24 · Oct 14, 2024 · Updated last year
lsdefine / simple_GRPO
A very simple GRPO implementation for reproducing R1-like LLM thinking.
☆1,699 · Nov 21, 2025 · Updated 8 months ago
SalesforceAIResearch / PretrainRL-pipeline
An automated data pipeline scaling RL to pretraining levels
☆76 · Jun 2, 2026 · Updated last month
uservan / ThinkPO
☆17 · Aug 1, 2025 · Updated 11 months ago
seanzhang-zhichen / baichuan-Dynamic-NTK-ALiBi
Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning
☆49 · Aug 27, 2023 · Updated 2 years ago
ivanleomk / build-hackathon-rag-ws
☆10 · Jun 8, 2024 · Updated 2 years ago
maum-ai / sane-tts
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
☆11 · Jun 30, 2023 · Updated 3 years ago
icip-cas / SSO
A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza…
☆20 · Nov 21, 2024 · Updated last year
y-chan / hifi-gan-misrnet
Unofficial PyTorch implementation of HiFi-GAN with fast MISR.
☆15 · Mar 21, 2023 · Updated 3 years ago
BY571 / SCoRe
SCoRe: Training Language Models to Self-Correct via Reinforcement Learning
☆16 · May 14, 2026 · Updated 2 months ago
Mddct / simple-tts
(WIP) Long-form speech generation
☆30 · Apr 2, 2025 · Updated last year
OpenVoiceOS / ovos-tts-server
Simple Flask server to host OpenVoiceOS TTS plugins as a service
☆16 · Updated this week
iesl / box-mlc-iclr-2022
Official repository for the paper "Modeling Label Space Interactions in Multi-label Classification using Box Embeddings".
☆12 · Apr 25, 2022 · Updated 4 years ago
cocoa-org / NanoRollout
Scale digital agent rollouts without pain.
☆34 · Jun 18, 2026 · Updated last month
wdlctc / mini-s
☆51 · Oct 29, 2024 · Updated last year
Edresson / ZS-TTS-Evaluation
☆45 · Sep 19, 2024 · Updated last year
yuancu / subgraph-retrieval-toolkit
SRTK: Retrieve semantically relevant subgraphs from large-scale knowledge graphs
☆32 · Sep 22, 2024 · Updated last year
gregretkowski / llmsec
☆20 · Aug 12, 2024 · Updated last year
PINTO0309 / gazelle
☆18 · Jul 8, 2026 · Updated 2 weeks ago
caleb-kan / AI-Research-Agent
The AI Research Agent uses AI and web scraping to source, summarize, and cite information on any topic. It efficiently provides accurate…
☆16 · Nov 11, 2024 · Updated last year
Carol-gutianle / MEOW
☆16 · May 16, 2025 · Updated last year
hao-ai-lab / Dynasor
[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning models without training.
☆232 · May 31, 2025 · Updated last year
robbiemu / llama-gguf-optimize
Scripts and tools for optimizing quantizations in llama.cpp with GGUF imatrices.
☆19 · Jan 10, 2025 · Updated last year
sail-sg / understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,268 · Aug 27, 2025 · Updated 10 months ago
franciscoliu / SKU
Official code implementation of SKU, accepted to ACL 2024 Findings
☆20 · Dec 18, 2024 · Updated last year
mayhewsw / multilingual-t5
☆12 · Dec 30, 2020 · Updated 5 years ago
astonishedrobo / tabulens
🔍📃 LLM-powered PDF Table Extractor
☆19 · Jun 26, 2025 · Updated last year
jxmorris12 / cde
Code for training & evaluating Contextual Document Embedding models
☆207 · May 14, 2025 · Updated last year