ash80 / RLHF_in_notebooks
RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks
☆231Jun 20, 2025Updated 7 months ago
Alternatives and similar repositories for RLHF_in_notebooks
Users who are interested in RLHF_in_notebooks are comparing it to the libraries listed below.
- 🧠 Web AI / LLM in browser / Whisper in browser / WebGPU inference Examples☆27Oct 1, 2025Updated 4 months ago
- ☆10Jan 23, 2025Updated last year
- ☆13Aug 12, 2024Updated last year
- Official codebase for our NeurIPS paper, Symmetry-Informed Governing Equation Discovery.☆11Nov 13, 2024Updated last year
- Augmented AI decision framework☆26Jan 26, 2026Updated 2 weeks ago
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker.☆36Jul 2, 2025Updated 7 months ago
- ☆14Jun 6, 2024Updated last year
- ☆19Nov 9, 2024Updated last year
- A tool to sync source directories and files to a destination directory☆19Jan 14, 2024Updated 2 years ago
- Search, browse, and resume your Claude Code sessions. Fast.☆41Feb 2, 2026Updated last week
- ☆531Jul 1, 2025Updated 7 months ago
- This is Veritas Research☆57Updated this week
- A desktop application that converts videos into PPT slides.☆23Nov 14, 2021Updated 4 years ago
- A JPEG Image Compression Service using Part Homomorphic Encryption.☆31Mar 7, 2025Updated 11 months ago
- ☆29Nov 9, 2025Updated 3 months ago
- An MCP server for playing Minesweeper☆108Mar 20, 2025Updated 10 months ago
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness.☆26Aug 3, 2024Updated last year
- TideCloak lets your users hold their own digital authority—no central control, no blind trust.☆64Jul 28, 2025Updated 6 months ago
- Proof of concept for a VPN over UDP☆113Feb 3, 2026Updated last week
- Auto Thinking Mode switch for Qwen3 in Open WebUI☆70May 8, 2025Updated 9 months ago
- Small Python package to measure OCR quality and other related metrics.☆27Feb 19, 2024Updated last year
- Create sites with base44 and use as standalone☆17Jan 18, 2026Updated 3 weeks ago
- A Survey Analyzing Generalization in Deep Reinforcement Learning☆36Oct 31, 2024Updated last year
- ☆263Mar 27, 2024Updated last year
- A Streamlit app for generating high-quality Q&A training datasets from text and PDFs, leveraging Gemini, Claude, and OpenAI for LLM fine-…☆39Jul 5, 2025Updated 7 months ago
- Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.☆281Updated this week
- ☆67Dec 24, 2025Updated last month
- Data mapping framework for Rust stuff☆45Updated this week
- Chrome/Edge extension to force Netflix 4K streaming on unsupported browsers/devices☆90Feb 7, 2026Updated last week
- A simple social media researcher built with Vercel's AI SDK☆42Aug 17, 2025Updated 5 months ago
- A very good course on fine-tuning LLMs for beginners (large model fine-tuning)☆28Aug 21, 2025Updated 5 months ago
- ☆28Updated this week
- Open-source macOS app to help with digital eye strain☆89Nov 21, 2025Updated 2 months ago
- Text to PowerPoint Slide Generation using GPT LLM☆35Jul 31, 2024Updated last year
- Evals that meet you where you are. For AI that's grounded.☆45Feb 6, 2026Updated last week
- ☆52Sep 10, 2025Updated 5 months ago
- Applying the ideas of Deepseek R1 to computer use☆221Feb 6, 2025Updated last year
- A Python toolkit for chain-of-thought prompting 🐍☆183Dec 13, 2025Updated 2 months ago
- Automatic differentiation implemented in Python, inspired by PyTorch (easily extensible)☆88Feb 21, 2023Updated 2 years ago