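The goal of this series is to let readers understand and implement every component of a large language model from scratch, along with the training and fine-tuning code, on top of basic PyTorch and without relying on any other off-the-shelf external libraries. Readers therefore need only Python, PyTorch, and the most basic background in deep learning.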
☆381Aug 28, 2025Updated 6 months ago
Alternatives and similar repositories for Building-a-Small-LLM-from-Scratch
Users that are interested in Building-a-Small-LLM-from-Scratch are comparing it to the libraries listed below
- Learning records for building a large language model from scratch☆59Jan 1, 2025Updated last year
- A hands-on tutorial that teaches you to write GPT from scratch and train a large language model☆96Jan 20, 2025Updated last year
- Building DeepSeek R1 from Scratch☆748Mar 21, 2025Updated 11 months ago
- LLM notes, including model inference, transformer model structure, and llm framework code analysis notes.☆867Dec 10, 2025Updated 2 months ago
- ☆134Feb 17, 2025Updated last year
- AI Agent development in practice☆999Nov 30, 2024Updated last year
- Large Language Model in Action☆342Jan 28, 2025Updated last year
- A toy OS written in pure Rust☆140Oct 28, 2024Updated last year
- Chinese translation of the LLMs-from-scratch project☆2,379Oct 15, 2025Updated 4 months ago
- Interpretation, extension, and reproduction of the DeepSeek series of work.☆699Mar 29, 2025Updated 11 months ago
- ☆51Feb 5, 2025Updated last year
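- A simple popular-science tutorial project focused on explaining interesting, cutting-edge technical concepts and principles. Each article aims to be readable within 5 minutes.☆6,730Nov 10, 2025Updated 3 months ago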
- Implementation for ECCV 2022 Paper "Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generali…☆20Jul 18, 2022Updated 3 years ago
- ☆54Nov 14, 2024Updated last year
- convert GitHub issues to a website☆28Mar 2, 2026Updated last week
- A book for Learning the Foundations of LLMs☆15,861Dec 12, 2025Updated 2 months ago
- A powerful Golang CLI application scaffold integrated with Logrus, arg parser, toml config, testify, Makefile, VSCode and Github Action.☆19Nov 2, 2023Updated 2 years ago
- Fetch arxiv data to LLM-friendly text☆130Feb 18, 2026Updated 2 weeks ago
- Spark projects. Learning book "Machine Learning with Spark"☆10Jun 3, 2017Updated 8 years ago
- Turn PostgreSQL into your search engine in a Pythonic way.☆51Aug 29, 2025Updated 6 months ago
- Taming LLMs: A Practical Guide to LLM Pitfalls with Open Source Software☆338Feb 5, 2025Updated last year
- A hand-written mini version of Tomcat that supports access to static and dynamic resources.☆10Dec 27, 2020Updated 5 years ago
- Detect and remove unused dependencies for Python projects☆18Apr 5, 2025Updated 11 months ago
- LaTeX lecture materials☆12Apr 7, 2022Updated 3 years ago
- 🚀🚀 "Large model": train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏☆40,754Feb 6, 2026Updated last month
- Profile your CoreML models directly from Python 🐍☆30Sep 8, 2025Updated 6 months ago
- Enable tool-use ability for any LLM model (DeepSeek V3/R1, etc.)☆58May 27, 2025Updated 9 months ago
- Static suckless single batch CUDA-only qwen3-0.6B mini inference engine☆545Sep 8, 2025Updated 6 months ago
- [ACM Computing Surveys 2025] This repository collects awesome survey, resource, and paper for Lifelong Learning with Large Language Model…☆162May 30, 2025Updated 9 months ago
- ☆19Sep 10, 2025Updated 5 months ago
- The SwiftUI learning project.☆11Nov 6, 2021Updated 4 years ago
- thinking tool for claude desktop/mcp clients using Deepseek reasoner☆56Jan 28, 2025Updated last year
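- Interview questions (with answers) for large-model algorithm positions: explanations of common questions and concepts. "Large-model interview questions", "algorithm-position interviews", "common interview questions", "large-model algorithm interviews", "large-model application fundamentals"☆1,653Updated this week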
- A command line tool that displays information about the current system, including hardware and critical software.☆29Dec 24, 2025Updated 2 months ago
- Example project using Cloudflare R2 for Django Static Files and Media Uploads☆12Aug 5, 2024Updated last year
- A lightweight operating system abstraction layer for agents.☆16Dec 26, 2025Updated 2 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision☆320Jan 25, 2026Updated last month
- Everything you need to know to build your own RAG application☆4,040Nov 22, 2025Updated 3 months ago
- System Design Interview: An Insider’s Guide☆2,712Feb 6, 2026Updated last month