Lego for GRPO
☆30 · May 27, 2025 · Updated 11 months ago
Alternatives and similar repositories for reward-composer
Users that are interested in reward-composer are comparing it to the libraries listed below.
- ☆16 · Feb 22, 2025 · Updated last year
- Train your own SOTA deductive reasoning model ☆110 · Mar 6, 2025 · Updated last year
- L3 R3: AGM RISC-V +CPLD/FPGA MCU (AG32VH407/AG32VF407/AG32VF303) ☆13 · Nov 3, 2024 · Updated last year
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆29 · Mar 1, 2025 · Updated last year
- Enemies for your LLM ☆35 · Jan 20, 2026 · Updated 3 months ago
- a key value based vector db! ☆19 · May 3, 2025 · Updated last year
- This repository includes the code to download the curated HuggingFace papers into a single markdown formatted file ☆16 · Jul 26, 2024 · Updated last year
- ☆16 · Nov 23, 2023 · Updated 2 years ago
- ☆17 · Apr 7, 2025 · Updated last year
- Discrete event simulator built in Rust 🦀 ☆13 · Jun 10, 2023 · Updated 2 years ago
- This unique variation on Thinking Claude maps Claude's thought process steps to unicode and forces Claude to think in unicode, potentiall… ☆17 · Feb 24, 2025 · Updated last year
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion" ☆116 · Jun 9, 2025 · Updated 11 months ago
- ☆28 · Aug 27, 2025 · Updated 8 months ago
- ☆12 · Mar 4, 2024 · Updated 2 years ago
- RLVR Testing and Training ☆23 · Aug 28, 2025 · Updated 8 months ago
- Various LLM Benchmarks ☆25 · Feb 20, 2026 · Updated 2 months ago
- Simple GRPO scripts and configurations. ☆59 · Feb 6, 2025 · Updated last year
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Dec 29, 2025 · Updated 4 months ago
- Grokking on modular arithmetic in less than 150 epochs in MLX ☆15 · Oct 24, 2024 · Updated last year
- Train transformer language models with reinforcement learning. ☆19 · Feb 25, 2025 · Updated last year
- Open-source Human Feedback Library ☆11 · Oct 25, 2023 · Updated 2 years ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆91 · Mar 18, 2025 · Updated last year
- ☆19 · Mar 16, 2025 · Updated last year
- Repository containing the code for training the CroissantLLM ☆21 · Feb 4, 2024 · Updated 2 years ago
- A PoC of a privilege escalation vulnerability in the Realtek rtkio64 Windows driver. ☆20 · Jul 6, 2020 · Updated 5 years ago
- Clue inspired puzzles for testing LLM deduction abilities ☆47 · Mar 19, 2026 · Updated last month
- Mythic ☆24 · Oct 15, 2024 · Updated last year
- ☆30 · Apr 6, 2026 · Updated last month
- Auxiliary tasks for task-oriented dialogue systems. Published in ICNLSP'22 and indexed in the ACL Anthology. ☆17 · Feb 27, 2023 · Updated 3 years ago
- Instant Neural Graphics Primitives from scratch, zero dependencies. Learning by doing. ☆10 · Aug 18, 2023 · Updated 2 years ago
- Example AI chat UI built with Cloudflare Workers, Vercel AI SDK, and Shadcn ☆21 · Apr 29, 2025 · Updated last year
- ☆15 · Apr 26, 2025 · Updated last year
- An implementation of the Augmented Random Search algorithm ☆14 · Jan 29, 2022 · Updated 4 years ago
- 🔎 Highly optimized Solidity library of statistical functions rationally approximated ☆27 · Oct 15, 2024 · Updated last year
- Simple and efficient DeepSeek V3 SFT using pipeline parallel and expert parallel, with both FP8 and BF16 trainings ☆117 · Jul 27, 2025 · Updated 9 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆111 · Mar 7, 2025 · Updated last year
- The original Shared Recurrent Memory Transformer implementation ☆36 · Jul 11, 2025 · Updated 9 months ago
- Clustered Compositional Embeddings ☆13 · Oct 25, 2023 · Updated 2 years ago
- NSA Triton Kernels written with GPT5 and Opus 4.1 ☆70 · Aug 12, 2025 · Updated 8 months ago