rkinas / rlhf_thinking_model
This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.
☆91Updated last week
Alternatives and similar repositories for rlhf_thinking_model:
Users interested in rlhf_thinking_model are comparing it to the libraries listed below.
- ☆72Updated 10 months ago
- Scripts, tutorials, and a programming knowledge base for working with the Bielik model.☆104Updated last week
- ☆127Updated 3 months ago
- Make Llama 3.1 8B talk in Rick Sanchez’s style☆76Updated 2 months ago
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast…☆308Updated last month
- ☆87Updated last week
- Testing LLM reasoning abilities with family relationship quizzes.☆62Updated 2 months ago
- ☆46Updated 8 months ago
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati…☆39Updated 3 weeks ago
- A versatile and powerful library designed to streamline the process of querying LLMs☆82Updated this week
- model activation visualiser☆90Updated this week
- chrome & firefox extension to chat with webpages: local llms☆113Updated 3 months ago
- ☆126Updated 7 months ago
- ☆14Updated this week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳☆264Updated 3 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆169Updated last week
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT)☆107Updated last month
- ☆134Updated last month
- ☆143Updated 8 months ago
- This project showcases an LLMOps pipeline that fine-tunes a small-size LLM model to prepare for the outage of the service LLM.☆302Updated last month
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆208Updated 5 months ago
- ☆189Updated 3 months ago
- Simple examples using Argilla tools to build AI☆53Updated 4 months ago
- One click away from a locally downloaded, fine-tuned model, hosted on hugging face, with inference built in. In two hours.☆21Updated 3 weeks ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆138Updated last month
- An automated tool for discovering insights from research paper corpora☆137Updated 9 months ago
- ☆138Updated this week
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.☆117Updated this week
- ☆255Updated 3 months ago
- ☆209Updated this week