rkinas / reasoning_models_how_to
This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.
☆127Jul 28, 2025Updated 6 months ago
Alternatives and similar repositories for reasoning_models_how_to
Users who are interested in reasoning_models_how_to are comparing it to the libraries listed below.
- Scripts, tutorials, and a programming knowledge base for working with the Bielik model.☆169Jun 7, 2025Updated 8 months ago
- Computational Neuroscience stuff☆13Aug 12, 2019Updated 6 years ago
- Enterprise-grade AI Detection and Response platform with real-time monitoring and configuration management for AI Agents and Large Langua…☆33Dec 14, 2025Updated 2 months ago
- An MCP server providing intelligent transcript processing capabilities, featuring natural formatting, contextual repair, and smart summar…☆18Mar 14, 2025Updated 11 months ago
- Story-based implementation of sequential thinking☆15Dec 15, 2025Updated 2 months ago
- A Python implementation of the Sequential Thinking MCP server using the official Model Context Protocol (MCP) Python SDK. This server fac…☆24Jun 1, 2025Updated 8 months ago
- A TypeScript Model Context Protocol (MCP) server to allow LLMs to programmatically construct mind maps to explore an idea space, with enf…☆26Mar 23, 2025Updated 10 months ago
- ☆18Apr 18, 2025Updated 9 months ago
- Deploy and scale Large Language Models (LLMs) in production.☆38Jul 20, 2024Updated last year
- ☆11Feb 6, 2026Updated last week
- Pre-trained models and language resources for Natural Language Processing in Polish☆370Jun 5, 2024Updated last year
- Simple ideas to compare Agentic Coding Tools☆36Jun 29, 2025Updated 7 months ago
- ☆83May 31, 2024Updated last year
- MCP DeepResearch Server: a deep research server built on LangGraph + Ollama + Tavily, with support for asynchronous execution, timeout control, and progress push notifications☆31Jun 16, 2025Updated 7 months ago
- Shared personal notes created while working with the Apple MLX machine learning framework☆24Dec 12, 2025Updated 2 months ago
- Exercises Galois theory D. Cox☆12Jun 29, 2023Updated 2 years ago
- Your self-hosted AI assistant. Interactive Linux Shell, Files and Folders analysis. Powered by Ollama.☆37Updated this week
- Embed your LLM into a python function☆22Jan 9, 2025Updated last year
- A simple lightweight Model Context Protocol (MCP) server integration framework☆17Jan 23, 2026Updated 3 weeks ago
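- A super simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the "24 Game" as an example. Uses zero-RL, SFT, and SFT+RL to elicit the LLM's capacity for self-verification and reflection. Clean, minimal, accessible reproduction of Dee…☆33Apr 5, 2025Updated 10 months ago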
- examples and guides to using Nomic Atlas☆37Apr 18, 2025Updated 9 months ago
- Timelight: Universal Path Generator☆22Aug 24, 2025Updated 5 months ago
- A simple WeChat Official Account layout tool based on Dify☆16Jun 27, 2025Updated 7 months ago
- Build internal agents with just backend code.☆39Aug 25, 2025Updated 5 months ago
- Structured TRIZ prompt engineering for LLMs in an open, portable XML format – MIT licensed.☆14Nov 11, 2025Updated 3 months ago
- AuraMatrix is a personality analysis web app that uses an LLM for evaluation. I made this for Gyanotsav-2025 to show different ways to ut…☆11Dec 22, 2025Updated last month
- A Complete Introduction to Building Generative AI Apps with Dify☆17May 25, 2025Updated 8 months ago
- Just a simple HowTo for https://github.com/johnsmith0031/alpaca_lora_4bit☆32May 25, 2023Updated 2 years ago
- Polish RoBERTa model trained on Polish literature, Wikipedia, and Oscar. The major assumption is that quality text will give a good mode…☆35May 25, 2021Updated 4 years ago
- ☆40Jan 6, 2025Updated last year
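- any4any is an enterprise-grade multimodal AI platform that provides a complete intelligent interaction solution. It integrates core features including LLM chat, a digital human system, intelligent SQL queries, speech processing, and a knowledge base system, and supports OpenAI-compatible API interfaces for seamless integration into all kinds of AI applications.☆60Nov 10, 2025Updated 3 months ago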
- CLI to generate LangGraph stubs from a specification☆104Mar 20, 2025Updated 10 months ago
- Write the database metadata into the dify knowledge☆12Dec 30, 2025Updated last month
- Workflow automation, but you just describe what you want and it happens.☆26Nov 22, 2025Updated 2 months ago
- Python library providing a Polars DataFrame interface for easy and intuitive access to the Bloomberg API☆16Jan 9, 2026Updated last month
- Modular banking application demonstrating concurrent transaction processing, DDD and production security patterns with Spring Boot and Po…☆10Oct 21, 2025Updated 3 months ago
- MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces☆10Mar 24, 2025Updated 10 months ago
- ☆11Aug 29, 2025Updated 5 months ago
- VibEx (vx) is a developer-friendly CLI tool that streamlines the process of working with AI coding assistants. It helps developers prepar…☆28May 17, 2025Updated 8 months ago