ash80 / RLHF_in_notebooks
RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks
☆111 Updated 3 weeks ago
Alternatives and similar repositories for RLHF_in_notebooks
Users that are interested in RLHF_in_notebooks are comparing it to the libraries listed below
- A comprehensive suite of tools, built to liberate science by making the creation, evaluation, and dissemination of research more transpar… ☆197 Updated 3 weeks ago
- Implement recursion using English as the programming language and an LLM as the runtime. ☆237 Updated 2 years ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆221 Updated 6 months ago
- Dead Simple LLM Abliteration ☆220 Updated 4 months ago
- Run and explore Llama models locally with minimal dependencies on CPU ☆191 Updated 9 months ago
- Docker-based inference engine for AMD GPUs ☆231 Updated 9 months ago
- Examples and guides for using the VLM Run API ☆281 Updated this week
- ai for jq ☆243 Updated 9 months ago
- Animating R1's thoughts. ☆383 Updated 4 months ago
- See Through Your Models ☆398 Updated this week
- High-Performance Implementation of OpenAI's TikToken. ☆416 Updated last week
- ☆214 Updated 4 months ago
- Generate Cool-Looking Mazes and Animations Illustrating the A* Pathfinding Algorithm ☆177 Updated 4 months ago
- ☆195 Updated 2 months ago
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l… ☆285 Updated last week
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆222 Updated 6 months ago
- R.L. methods and techniques. ☆196 Updated 7 months ago
- ☆163 Updated 3 months ago
- Fully neural approach for text chunking ☆363 Updated 2 months ago
- Applying the ideas of Deepseek R1 to computer use ☆214 Updated 5 months ago
- Pytorch script hot swap: Change code without unloading your LLM from VRAM ☆126 Updated 2 months ago
- MCP server and CLI tool for searching and downloading documents from Anna's Archive ☆281 Updated this week
- Your personal plug and play memory layer for LLMs ☆336 Updated this week
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines). ☆252 Updated last year
- CleverBee - The Open Source Deep Researcher Tool ☆302 Updated last month
- An experimental transformer stack and symbolic computation engine built entirely from first principles in pure Python. ☆37 Updated 2 months ago
- A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception. Players engage in public and private co… ☆279 Updated last month
- Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems ☆105 Updated 3 months ago
- Browser-LLM Auto-Scaling Technology ☆528 Updated this week
- Ask GPT to run a command ☆195 Updated 3 months ago