rkinas / rlhf_thinking_model
This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.
☆91Updated last month
Alternatives and similar repositories for rlhf_thinking_model:
Users who are interested in rlhf_thinking_model are comparing it to the libraries listed below
- Scripts, tutorials, and a programming knowledge base for working with the Bielik model.☆110Updated this week
- ☆213Updated last week
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast…☆319Updated 2 months ago
- ☆74Updated 10 months ago
- GPU Kernels☆160Updated last week
- ☆129Updated 8 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆177Updated last week
- ☆130Updated 4 months ago
- making the official triton tutorials actually comprehensible☆26Updated last month
- One click templates for inferencing Language Models☆173Updated 2 weeks ago
- List of resources, libraries and more for developers who would like to build with open-source machine learning off-the-shelf☆199Updated last year
- So, I trained a 130M Llama architecture I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset…☆14Updated 3 weeks ago
- Exploring Applications of GRPO☆182Updated this week
- 🤗 Benchmark Large Language Models Reliably On Your Data☆240Updated this week
- Simple examples using Argilla tools to build AI☆52Updated 5 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆47Updated 11 months ago
- Moxin is a family of fully open-source and reproducible LLMs☆86Updated last week
- ☆121Updated last week
- ☆46Updated 9 months ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.☆337Updated last month
- Hugging Face Deep Learning Containers (DLCs) for Google Cloud☆141Updated 2 months ago
- ☆44Updated 9 months ago
- ☆204Updated 10 months ago
- Make Llama 3.1 8B talk in Rick Sanchez’s style☆102Updated 3 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond)☆431Updated 6 months ago
- A versatile and powerful library designed to streamline the process of querying LLMs☆83Updated 2 weeks ago
- Low-Rank adapter extraction for fine-tuned transformers models☆171Updated 11 months ago
- An extension of the nanoGPT repository for training small MOE models.☆129Updated last month
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆31Updated 2 months ago
- minimal GRPO implementation from scratch☆72Updated last month