rkinas / rlhf_thinking_model
This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.
☆93Updated last month
Alternatives and similar repositories for rlhf_thinking_model
Users who are interested in rlhf_thinking_model are comparing it to the libraries listed below
- ☆132Updated 5 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon.☆75Updated 2 weeks ago
- One click templates for inferencing Language Models☆178Updated this week
- ☆114Updated 4 months ago
- Scripts, tutorials, and a programming knowledge base for working with the Bielik model.☆111Updated 3 weeks ago
- 🤗 Benchmark Large Language Models Reliably On Your Data☆295Updated this week
- ☆129Updated 8 months ago
- A simple tool that lets you explore different possible paths that an LLM might sample.☆170Updated last week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳☆282Updated 3 weeks ago
- Chrome & Firefox extension to chat with webpages: local LLMs☆116Updated 4 months ago
- ☆219Updated this week
- ☆97Updated 7 months ago
- An introduction to LLM Sampling☆78Updated 5 months ago
- ☆125Updated last month
- ☆74Updated 7 months ago
- ☆49Updated 10 months ago
- ☆75Updated 11 months ago
- ☆120Updated last month
- ☆138Updated 3 weeks ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆139Updated 2 months ago
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati…☆42Updated last week
- ☆151Updated 5 months ago
- One click away from a locally downloaded, fine-tuned model, hosted on Hugging Face, with inference built in. In two hours.☆21Updated 2 months ago
- Let's build better datasets, together!☆259Updated 4 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond)☆439Updated 7 months ago
- ☆89Updated last month
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines☆198Updated last year
- ⚖️ Awesome LLM Judges ⚖️☆97Updated 2 weeks ago
- ☆86Updated 4 months ago
- LLM Chess - Large Language Models Competing in Chess☆40Updated this week