rkinas / reasoning_models_how_to
This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.
☆110 Updated last month
Alternatives and similar repositories for reasoning_models_how_to
Users who are interested in reasoning_models_how_to are comparing it to the libraries listed below.
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆268 Updated 2 months ago
- All credits go to HuggingFace's Daily AI papers (https://huggingface.co/papers) and the research community. Audio summaries here (https… ☆196 Updated this week
- One click templates for inferencing Language Models ☆214 Updated last month
- Build datasets using natural language ☆529 Updated last week
- Benchmark Large Language Models Reliably On Your Data ☆395 Updated 3 weeks ago
- ☆155 Updated 5 months ago
- ☆264 Updated 3 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆451 Updated last month
- ☆175 Updated last month
- ☆86 Updated last year
- A compact LLM pretrained in 9 days by using high quality data ☆327 Updated 5 months ago
- Simple examples using Argilla tools to build AI ☆55 Updated 10 months ago
- Verifiers for LLM Reinforcement Learning ☆75 Updated 2 weeks ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆83 Updated last month
- So, I trained a 130M Llama architecture I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset… ☆15 Updated 6 months ago
- ☆119 Updated last year
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆226 Updated this week
- Self-Adapting Language Models ☆800 Updated last month
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 Updated last year
- Minimal GRPO implementation from scratch ☆97 Updated 6 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆336 Updated 3 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆450 Updated last year
- Collection of resources for RL and Reasoning ☆26 Updated 7 months ago
- Exploring Applications of GRPO ☆250 Updated last month
- ☆135 Updated last month
- ☆182 Updated 7 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆240 Updated 10 months ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆63 Updated 7 months ago
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner. ☆185 Updated last year
- Build your own visual reasoning model ☆409 Updated 3 weeks ago