kubernetes-bad/reward-composer

Lego for GRPO

☆30

Alternatives and similar repositories for reward-composer

Users who are interested in reward-composer are comparing it to the repositories listed below.

OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆111 · Mar 6, 2025 · Updated last year
JacksonCakes / vision-r1
☆13 · Mar 23, 2025 · Updated last year
The-Chaotic-Neutrals / ShareGPT-Formaxxing
☆14 · Jul 7, 2026 · Updated 3 weeks ago
Improbable-AI / orso
☆18 · Feb 22, 2025 · Updated last year
ahhcash / ghastly
a key value based vector db!
☆19 · May 3, 2025 · Updated last year
osoleve / glitchlings
Enemies for your LLM
☆38 · Jan 20, 2026 · Updated 6 months ago
axolotl-ai-cloud / grpo_code
A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.
☆41 · Apr 4, 2025 · Updated last year
NathanGodey / qfilters
Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)
☆34 · Mar 7, 2025 · Updated last year
Digitous / ModelREVOLVER
Model REVOLVER, a human in the loop model mixing system.
☆33 · Aug 2, 2023 · Updated 2 years ago
MaximeRivest / ovllm
☆39 · Aug 4, 2025 · Updated 11 months ago
elsatch / daily_hf_papers_abstracts
This repository includes the code to download the curated HuggingFace papers into a single markdown formatted file
☆16 · Jul 26, 2024 · Updated 2 years ago
Percent-BFD / neurips_submission
☆17 · Nov 23, 2023 · Updated 2 years ago
choosewhatulike / case2code
☆17 · Apr 7, 2025 · Updated last year
willccbb / localchat
☆13 · Apr 16, 2025 · Updated last year
unbalancedparentheses / learning_systems_and_security
I like to learn new things
☆12 · Feb 28, 2026 · Updated 5 months ago
webup / langgraph-exercise-book
Samples and demos with LangGraph and LangChain frameworks.
☆16 · Aug 22, 2025 · Updated 11 months ago
fargolo / TextGraphs.jl
Graph representations of text
☆13 · Sep 20, 2023 · Updated 2 years ago
collinear-ai / spider
Streamline on-policy/off-policy distillation workflows in a few lines of code
☆109 · Updated this week
baggepinnen / LengthChannels.jl
Julia Channels with defined length: Buffered and threaded iterators for machine learning.
☆12 · Dec 13, 2020 · Updated 5 years ago
antoniopurificato / Sheaf4Rec
☆14 · Mar 4, 2024 · Updated 2 years ago
s2kdesign-com / CoinGardenWorld
Web3 Infrastructure for gardening, growing, selling and earning crypto from your flowers.
☆12 · Jan 28, 2026 · Updated 6 months ago
baggepinnen / DiskDataProviders.jl
Disk based, buffered data structures for machine learning
☆12 · Jan 4, 2021 · Updated 5 years ago
cpldcpu / llmbenchmark
Various LLM Benchmarks
☆26 · Feb 20, 2026 · Updated 5 months ago
stockeh / mlx-grokking
Grokking on modular arithmetic in less than 150 epochs in MLX
☆15 · Oct 24, 2024 · Updated last year
jacobwarren / Latent-Space-Verification-for-Self-Correcting-LLMs
☆17 · Mar 28, 2025 · Updated last year
ivanleomk / modal-grpo
☆19 · Mar 16, 2025 · Updated last year
willccbb / trl
Train transformer language models with reinforcement learning.
☆19 · Feb 25, 2025 · Updated last year
AlpinDale / sillytui
LLM RP TUI for Power Users.
☆35 · Jan 13, 2026 · Updated 6 months ago
CoderPat / croissant-llm-training
Repository containing the code for training the CroissantLLM
☆21 · Feb 4, 2024 · Updated 2 years ago
SuperagenticAI / Agentic-DevOps
A comprehensive demonstration of Agentic DevOps using DSPy and Model Context Protocol (MCP)
☆16 · May 31, 2025 · Updated last year
shivarama23 / LayoutLMV3
This repo consists of the code as discussed in the Medium blog.
☆17 · Sep 10, 2023 · Updated 2 years ago
blogresponder / Realtek-rtkio64-Windows-driver-privilege-escalation
A PoC of a privilege escalation vulnerability in the Realtek rtkio64 Windows driver.
☆21 · Jul 6, 2020 · Updated 6 years ago
bradhilton / temporal-clue
Clue inspired puzzles for testing LLM deduction abilities
☆47 · Mar 19, 2026 · Updated 4 months ago
bethgelab / delta-belief-rl
Official implementation of the ΔBelief-RL method.
☆31 · Feb 28, 2026 · Updated 5 months ago
nmheim / NeuralArithmetic.jl
Collection of layers that can perform arithmetic operations
☆12 · Aug 17, 2021 · Updated 4 years ago
oxinabox / MixedModeDebugger.jl
A Julia Debugger that works with mixed compiled and interpreted mode for performance
☆16 · Feb 8, 2020 · Updated 6 years ago
openfeedback / superhf
Open-source Human Feedback Library
☆11 · Oct 25, 2023 · Updated 2 years ago
Arize-ai / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
☆110 · Sep 19, 2025 · Updated 10 months ago
whetstoneresearch / doppler-sdk-legacy
☆22 · Oct 19, 2025 · Updated 9 months ago