ash80 / RLHF_in_notebooks
RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks
☆231 · Jun 20, 2025 · Updated 7 months ago
Alternatives and similar repositories for RLHF_in_notebooks
Users who are interested in RLHF_in_notebooks are comparing it to the libraries listed below.
- 🧠 Web AI / LLM in browser / Whisper in browser / WebGPU inference Examples ☆27 · Oct 1, 2025 · Updated 4 months ago
- ☆10 · Jan 23, 2025 · Updated last year
- ☆13 · Aug 12, 2024 · Updated last year
- A Python-based AI coding assistant that uses the Gemini API for code generation, file manipulation, and interactive software development … ☆22 · Jun 28, 2025 · Updated 7 months ago
- Plugin Marketplace for Claude Code ☆20 · Updated this week
- ☆19 · Nov 9, 2024 · Updated last year
- A tool to sync source directories and files to a destination directory ☆19 · Jan 14, 2024 · Updated 2 years ago
- Search, browse, and resume your Claude Code sessions. Fast. ☆41 · Feb 2, 2026 · Updated last week
- A transformer-based multimodal model for music. ☆29 · Aug 15, 2024 · Updated last year
- A MCP Server to Create MCP Server ☆21 · Mar 4, 2025 · Updated 11 months ago
- A desktop application that converts videos to PPT. ☆23 · Nov 14, 2021 · Updated 4 years ago
- 🔥 LitLytics - an affordable, simple analytics platform that leverages LLMs to automate data analysis ☆103 · Nov 25, 2024 · Updated last year
- An MCP server for playing Minesweeper ☆108 · Mar 20, 2025 · Updated 10 months ago
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 · Aug 3, 2024 · Updated last year
- ☆27 · Jan 3, 2024 · Updated 2 years ago
- TideCloak lets your users hold their own digital authority—no central control, no blind trust. ☆64 · Jul 28, 2025 · Updated 6 months ago
- Implementation of all RL algorithms in a simpler way ☆1,393 · Aug 29, 2025 · Updated 5 months ago
- A Python tool to parse PDF statements from Poste Italiane (Postepay, BancoPosta) and extract data as structured JSON. ☆50 · Jul 25, 2025 · Updated 6 months ago
- ☆54 · Nov 14, 2024 · Updated last year
- Proof of concept for a VPN over UDP ☆113 · Feb 3, 2026 · Updated last week
- ☆34 · Jan 22, 2026 · Updated 3 weeks ago
- A Survey Analyzing Generalization in Deep Reinforcement Learning ☆36 · Oct 31, 2024 · Updated last year
- Small python package to measure OCR quality and other related metrics. ☆27 · Feb 19, 2024 · Updated last year
- ☆263 · Mar 27, 2024 · Updated last year
- A Streamlit app for generating high-quality Q&A training datasets from text and PDFs, leveraging Gemini, Claude, and OpenAI for LLM fine-… ☆39 · Jul 5, 2025 · Updated 7 months ago
- Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more. ☆281 · Updated this week
- Simple and fast server for GPTQ-quantized LLaMA inference ☆24 · May 18, 2023 · Updated 2 years ago
- ☆76 · Feb 8, 2026 · Updated last week
- Evaluation framework for document processing models and services. ☆63 · Updated this week
- benchmarks for LLM tokenizers ☆16 · Jan 15, 2026 · Updated 3 weeks ago
- An open-source session replay tool for single-page applications that uses AI analysis, aggregated trends, and a RAG chatbot to help devel… ☆11 · Jan 23, 2026 · Updated 3 weeks ago
- ☆28 · Updated this week
- ☆52 · Sep 10, 2025 · Updated 5 months ago
- A Cloudflare Workers project that works out of the box, for recording beautiful moments ✨✨ ☆71 · Oct 20, 2025 · Updated 3 months ago
- Evals that meet you where you are. For AI that's grounded. ☆45 · Feb 6, 2026 · Updated last week
- Text to PowerPoint Slide Generation using GPT LLM ☆36 · Jul 31, 2024 · Updated last year
- ☆11 · Jun 27, 2024 · Updated last year
- This package implements 1D and 2D blood flow models for arterial circulation using Trixi.jl, enabling efficient numerical simulation and … ☆44 · Updated this week
- A demo app showing you how to integrate with Google Cloud Translate API ☆10 · Mar 20, 2019 · Updated 6 years ago