ashworks1706/rlhf-from-scratch

ashworks1706 / rlhf-from-scratch

A theoretical and practical deep dive into Reinforcement Learning with Human Feedback and its applications in Large Language Models from scratch.

☆115

Alternatives and similar repositories for rlhf-from-scratch

Users who are interested in rlhf-from-scratch are comparing it to the libraries listed below.

daisy-workflow / daisy-engine
AI orchestration platform
☆83 May 27, 2026 Updated 2 months ago
ycunxi / ACEC_benchmarks
Arithmetic multiplier benchmarks
☆12 Nov 13, 2017 Updated 8 years ago
verilog-to-routing / libblifparse
Parsing library for BLIF netlists
☆19 Nov 1, 2024 Updated last year
oxidecomputer / rack-explorer
☆27 Jun 26, 2026 Updated last month
robitec97 / gemma3.c
Gemma 3 pure inference in C
☆113 Apr 3, 2026 Updated 3 months ago
chebykinn / browser-code
Coding agent for user scripts
☆122 Mar 16, 2026 Updated 4 months ago
nauotit / openf4
F4 algorithm C++ library (groebner basis computations over finite fields)
☆14 Apr 6, 2018 Updated 8 years ago
ShurikenTrade / shuriken-skills
Agent-consumable integration skills for the Shuriken platform 🥷
☆90 May 10, 2026 Updated 2 months ago
noway / yagrad
yet another scalar autograd engine - featuring complex numbers and fixed DAG
☆26 Mar 20, 2024 Updated 2 years ago
seelikat / neuro-visual-reconstruction-dataset-index
Index and overview of neuroimaging datasets for visual perception reconstruction.
☆72 Mar 13, 2026 Updated 4 months ago
accretional / semantifly
☆15 Sep 6, 2024 Updated last year
Dobiasd / treebomination
convert a scikit-learn decision tree into a Keras model
☆39 Oct 21, 2023 Updated 2 years ago
PhillipKerger / zero-order-bounds-lean-verification
☆61 Jul 17, 2026 Updated last week
ash80 / RLHF_in_notebooks
RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks
☆253 Jun 20, 2025 Updated last year
nolasoft / okgit
☆101 Mar 18, 2026 Updated 4 months ago
anishathalye / semlib
Build data processing and data analysis pipelines that leverage the power of LLMs 🧠
☆261 Updated this week
ivanbelenky / RL
R.L. methods and techniques.
☆197 Updated this week
alshdavid-public / stronk
Open Source Fitness Tracking App
☆34 May 20, 2026 Updated 2 months ago
nestordemeure / shelly
An LLM based shell assistant that knows your usual shell commands.
☆17 Jul 18, 2025 Updated last year
Dicklesworthstone / introduction_to_temporal_logic
An introduction to temporal logic and how it can be used to analyze concurrency
☆111 Jan 24, 2024 Updated 2 years ago
jkone27 / feliz-vite
feliz react template: using F# , fable vite plugin, vite and vitest, an alternative to typescript?
☆10 Jun 27, 2025 Updated last year
myers / neuroevolution_in_elixir
Code from Handbook of Neuroevolution Through Erlang translated to Elixir
☆12 Nov 26, 2017 Updated 8 years ago
paven / birGit
Git helper
☆13 Mar 27, 2026 Updated 4 months ago
tyfeld / MMaDA-Parallel
Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"
☆302 Jan 29, 2026 Updated 6 months ago
fadoss / umaudemc
Unified Maude model-checking tool
☆13 Jul 13, 2026 Updated 2 weeks ago
markusheimerl / gpt
A generative pretrained transformer implementation
☆93 Jun 29, 2026 Updated last month
coco-team / lustrec
A modular Lustre to C / Horn clauses compiler
☆22 Nov 17, 2018 Updated 7 years ago
TrevorS / voxtral-mini-realtime-rs
Voxtral ASR & TTS running natively and in the browser. A Rust implementation of Mistral's Voxtral mini realtime ASR / TTS using the Burn …
☆811 Apr 2, 2026 Updated 3 months ago
cairnc / sat_blog
Code to go along with Separating Axis Test blog
☆62 Jul 21, 2026 Updated last week
Agentic-Systems-Lab / rigorous
A comprehensive suite of tools, built to liberate science by making the creation, evaluation, and dissemination of research more transpar…
☆251 Aug 8, 2025 Updated 11 months ago
sweetlilmre / otla
Automatically exported from code.google.com/p/otla
☆11 Apr 3, 2015 Updated 11 years ago
danjulio / rocketblue-automation
Utilities, documentation and example code for Solar Pi Platter
☆11 Aug 28, 2022 Updated 3 years ago
JustVugg / nanoeuler
GPT-2-style LLM built from scratch in C/CUDA with hand-written backprop, BPE tokenizer, FlashAttention, pretraining, and SFT.
☆108 Jun 18, 2026 Updated last month
SunAndClouds / ReadMe
Turn your files into a memory filesystem for AI agents.
☆43 Apr 15, 2026 Updated 3 months ago
xavier-yu114 / Zoom-Refine
Zoom-Refine: Boosting High-Resolution Multimodal Understanding via Localized Zoom and Self-Refinement
☆19 Jul 4, 2026 Updated 3 weeks ago
scalesimple / vmods
vmods needed
☆16 Jan 15, 2014 Updated 12 years ago
CachyOS / cachyos-qtile-settings
Settings used for CachyOS Qtile
☆15 Sep 29, 2025 Updated 10 months ago
asg017 / sqlite-vector
A SQLite extension for working with float and binary vectors. Work in progress!
☆24 Feb 10, 2023 Updated 3 years ago
smerrill / vcl-vim-plugin
A VCL highlighting plugin for vim.
☆18 Apr 27, 2014 Updated 12 years ago