willccbb/trl

Train transformer language models with reinforcement learning.

☆19

Alternatives and similar repositories for trl

Users who are interested in trl are comparing it to the libraries listed below.

amitlevy / evolutionaryGPT
Evolutionary Search for expert-level performance on any task with environmental feedback
☆14 · Oct 12, 2025 · Updated 9 months ago
oaknational / oak-ai-autoeval-tools
Oak National Academy's AI Auto Eval tools provide LLM as a judge evaluation on lesson plans and resources
☆17 · Jun 11, 2026 · Updated last month
rawsh / mirrorllm
various experiments for scaling inference time compute with small reasoning models
☆17 · Jan 16, 2025 · Updated last year
openfeedback / superhf
Open-source Human Feedback Library
☆11 · Oct 25, 2023 · Updated 2 years ago
kubernetes-bad / reward-composer
Lego for GRPO
☆30 · May 27, 2025 · Updated last year
animotionjs / examples
🔥 Animotion examples
☆20 · Apr 3, 2025 · Updated last year
CharlieLeee / BIT-Report-LaTeX
English and Chinese LaTeX template for reports/projects/proposal at Beijing Institute of Technology
☆10 · Nov 19, 2020 · Updated 5 years ago
abietti / transformer-birth
☆19 · Dec 12, 2023 · Updated 2 years ago
kylehkhsu / tripod
☆12 · Apr 19, 2024 · Updated 2 years ago
phonism / CP-Zero
Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.
☆18 · Apr 22, 2025 · Updated last year
erwincoumans / ARS
An implementation of the Augmented Random Search algorithm
☆14 · Jan 29, 2022 · Updated 4 years ago
thomasahle / cce
Clustered Compositional Embeddings
☆13 · Oct 25, 2023 · Updated 2 years ago
spencerwooo / zan-chat
A peer-to-peer communication system. BIT short-semester software development practicum.
☆11 · Sep 7, 2018 · Updated 7 years ago
fs-c / royalroad-api
An unofficial API for royalroad.com
☆23 · Jul 27, 2025 · Updated 11 months ago
ruvnet / hacker-league
☆33 · Feb 6, 2025 · Updated last year
code-forge-temple / scribe-pal
ScribePal is an Open Source intelligent browser extension that leverages AI to empower your web experience by providing contextual insigh…
☆22 · Apr 6, 2026 · Updated 3 months ago
13point5 / swe-grep-oss
An RL environment similar to Cognition's SWE-Grep
☆16 · Mar 10, 2026 · Updated 4 months ago
jwhj / OREO
☆116 · Jan 21, 2025 · Updated last year
tengxiao1 / MR-Search
Meta-Reinforcement Learning with Self-Reflection
☆33 · Mar 26, 2026 · Updated 3 months ago
willccbb / localchat
☆13 · Apr 16, 2025 · Updated last year
AlexIoannides / llm-regression
Exploring the classical regression capabilities of LLMs.
☆18 · May 20, 2024 · Updated 2 years ago
edenbiran / HoppingTooLate
Exploring the Limitations of Large Language Models on Multi-Hop Queries
☆33 · Mar 2, 2025 · Updated last year
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Updated this week
Simon-12 / tidy-images
Easily sort and organise your image collection.
☆10 · Jan 5, 2024 · Updated 2 years ago
cogtoolslab / block_construction
Project investigating human physical construction behavior
☆13 · Oct 6, 2023 · Updated 2 years ago
Team254 / FRC-2019-Offseason-Public
Code for an FRC swerve drive with Backlash's superstructure
☆12 · Dec 21, 2019 · Updated 6 years ago
AI4Finance-Foundation / FinGPT-Earnings-Call-LLM-Agent
☆18 · Apr 1, 2024 · Updated 2 years ago
endomorphosis / ipfs_kit_py
A decentralized virtual filesystem with low latency caching.
☆19 · Updated this week
alexzhang13 / longcot-mini-rlm-results
Storing the LongCoT-mini results for RLM(GPT-5.2)
☆20 · Apr 26, 2026 · Updated 2 months ago
aurelienpierreeng / VirtualSecretary
A Python framework to connect to email/contacts/agendas servers and write automated rules for efficient workflows.
☆14 · Jun 23, 2026 · Updated 3 weeks ago
grey-area / complex-power-tower
☆14 · Oct 3, 2023 · Updated 2 years ago
generalroboticslab / TSIL
Temporal Self-imitation Learning
☆15 · Jul 3, 2026 · Updated 2 weeks ago
Airmomo / SmolDocling-256M-WebUI
WebUI for using SmolDocling-256M-preview
☆14 · Mar 21, 2025 · Updated last year
google / drjax
☆19 · Jul 8, 2026 · Updated last week
ashkangoleh / pyiceberg-lakehouse
☆14 · May 17, 2025 · Updated last year
iubh / DLMDSDL01
Deep Learning
☆16 · Aug 28, 2020 · Updated 5 years ago
kwaipilot / SWE-Compass
☆18 · Mar 28, 2026 · Updated 3 months ago
UCSB-NLP-Chang / ThinkPrune
☆46 · Sep 27, 2025 · Updated 9 months ago
0x404 / conventional-commit-classification
A First Look at Conventional Commits Classification
☆16 · Nov 18, 2024 · Updated last year