rkinas / reasoning_models_how_to
This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.
★121 · Updated 4 months ago
Alternatives and similar repositories for reasoning_models_how_to
Users who are interested in reasoning_models_how_to are comparing it to the repositories listed below.
- Inference, Fine Tuning and many more recipes with Gemma family of models ★276 · Updated 5 months ago
- Benchmark Large Language Models Reliably On Your Data ★419 · Updated this week
- Build datasets using natural language ★552 · Updated 3 months ago
- ★267 · Updated 5 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ★348 · Updated 6 months ago
- ★86 · Updated last year
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ★141 · Updated last month
- One click templates for inferencing Language Models ★222 · Updated last month
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ★485 · Updated 3 months ago
- 🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data. ★446 · Updated this week
- All credits go to HuggingFace's Daily AI papers (https://huggingface.co/papers) and the research community. Audio summaries here (https… ★210 · Updated last month
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs … ★61 · Updated 10 months ago
- Automatically annotate papers using LLMs ★391 · Updated 2 weeks ago
- From scratch implementation of a vision language model in pure PyTorch ★254 · Updated last year
- ★241 · Updated 2 months ago
- Train LLM on Hugging Face infra ★67 · Updated last month
- ★183 · Updated 3 weeks ago
- Toolkit for attaching, training, saving and loading of new heads for transformer models ★293 · Updated 9 months ago
- Verifiers for LLM Reinforcement Learning ★79 · Updated 3 months ago
- A repository containing general tutorials I'd like to share with the world. ★77 · Updated 2 weeks ago
- Complete implementation of Llama2 with/without KV cache & inference ★49 · Updated last year
- Collection of resources for RL and Reasoning ★26 · Updated 10 months ago
- Fine tune Gemma 3 on an object detection task ★92 · Updated 5 months ago
- Simple UI for debugging correlations of text embeddings ★305 · Updated 6 months ago
- [ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs ★314 · Updated 5 months ago
- SynthGenAI - Package for Generating Synthetic Datasets using LLMs. ★54 · Updated 3 weeks ago
- A compact LLM pretrained in 9 days by using high quality data ★336 · Updated 8 months ago
- Utils for Unsloth https://github.com/unslothai/unsloth ★181 · Updated this week
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ★173 · Updated 11 months ago
- Simple examples using Argilla tools to build AI ★56 · Updated last year