golfxiao/ScratchLLMStepByStep

golfxiao / ScratchLLMStepByStep

A hands-on tutorial that walks you step by step through writing a GPT from scratch and training a large language model.

☆102

Alternatives and similar repositories for ScratchLLMStepByStep

Users who are interested in ScratchLLMStepByStep are comparing it to the repositories listed below.

AIDajiangtang / LLM-from-scratch
Learn large models from scratch: Transformer, GPT2, and BERT pre-training and fine-tuning from scratch.
☆41 · Jul 1, 2024 · Updated 2 years ago
KaihuaTang / Building-a-Small-LLM-from-Scratch
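This series aims to let readers understand and implement every component of a large language model from scratch, along with the training and fine-tuning code, using only basic PyTorch and no other off-the-shelf external libraries. Readers therefore need only Python, PyTorch, and the most basic deep learning background.
☆384 · Aug 28, 2025 · Updated 10 months ago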
victorAmazing99 / chat-ollama
☆15 · Apr 23, 2025 · Updated last year
Mog9 / gpt2-inference
A GPT-2 inference engine written from scratch in CUDA and C++. Implements custom CUDA kernels for tiled matrix multiplication, LayerNorm,…
☆42 · May 17, 2026 · Updated 2 months ago
FareedKhan-dev / train-deepseek-r1
Building DeepSeek R1 from Scratch
☆782 · Mar 21, 2025 · Updated last year
DUTIR-Emotion-Group / CCL2025-Chinese-Hate-Speech-Detection
☆22 · Mar 1, 2025 · Updated last year
tangyipeng100 / Modelscope_sora_solution5
Fifth-place solution for the Modelscope-Sora Challenge.
☆12 · Sep 12, 2024 · Updated last year
Raxxll / ChatBI
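This project aims to build an intelligent database question-answering system using LangChain and large language models (such as ZhipuAI). The system understands users' query requests in natural language, automatically generates and executes the corresponding SQL statements, and returns the query results to the user in natural language.
☆15 · Jul 31, 2024 · Updated last year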
fshanCoder / rucbase-lab
a lab from ruc base
☆12 · Jan 24, 2023 · Updated 3 years ago
lxj-drifter / UIOU_files
☆19 · Apr 11, 2024 · Updated 2 years ago
KMnO4-zx / tiny-llm
☆34 · Jul 8, 2025 · Updated last year
brendanhogan / DeepSeekRL-Extended
Exploring Applications of GRPO
☆252 · Aug 25, 2025 · Updated 10 months ago
Xpryet / react-native-wireguard
☆10 · Feb 17, 2021 · Updated 5 years ago
rustsbi / sbi-rt
Simple RISC-V SBI runtime library; designated for supervisor use
☆25 · Jan 10, 2024 · Updated 2 years ago
tony612 / kexplain
Kexplain is an interactive kubectl explain
☆12 · Oct 23, 2023 · Updated 2 years ago
QiangGer / auto-Canny
An adaptive-threshold improvement to the Canny algorithm.
☆17 · Aug 15, 2019 · Updated 6 years ago
chai2010 / llmgo-book
Programming with Go and large language models.
☆45 · Mar 5, 2025 · Updated last year
hellangleZ / Agent-MemoryForge
Production-grade memory layer for AI agents with durable multi-tenant memory, semantic recall, async distillation, and SDK/Gateway integr…
☆177 · Jun 14, 2026 · Updated last month
tj / go-writer-stats
Wrap an io.Writer for metrics.
☆10 · May 8, 2018 · Updated 8 years ago
Rollmops / PyPrepar3D
The PyPrepar3D project aims to provide a high-level Python API to Lockheed Martin's Prepar3D SDK. The Documentation for the C/C++ SDK can b…
☆11 · Dec 3, 2015 · Updated 10 years ago
sunshine-JLU / deepseek-r1-distill-llama-8b-lora
The objective of this project is to demonstrate how to fine-tune deepseek-r1-distill-llama-8b.
☆17 · Feb 19, 2025 · Updated last year
ACALJJ32 / watermark_remove_paddle_slbr
☆10 · May 23, 2022 · Updated 4 years ago
tj / go-rle
Run-length encoding utils for Go
☆13 · May 8, 2018 · Updated 8 years ago
Justlovesmile / EARL
[TGRS 2023] Official code for "EARL: An Elliptical Distribution aided Adaptive Label Assignment for Oriented Object Detection in Remote S…
☆14 · Apr 16, 2026 · Updated 3 months ago
ming71 / CUDA
Useful CUDA code.
☆43 · Mar 11, 2022 · Updated 4 years ago
sociomantic-tsunami / dhtnode
Distributed hash-table node
☆13 · Oct 2, 2023 · Updated 2 years ago
godruoyi / imageflow
Imageflow is a Raycast extension that allows you to process images using a customizable workflow. You can resize, compress, and convert i…
☆13 · Apr 9, 2025 · Updated last year
JuniMay / llm.rs
An attempt to migrate Karpathy's llm.c to safe Rust.
☆13 · Jun 4, 2024 · Updated 2 years ago
yui0 / ugemm
GEMM
☆10 · Aug 26, 2023 · Updated 2 years ago
kennylevinsen / ecies
Curve25519 ECIES
☆10 · Oct 18, 2016 · Updated 9 years ago
JINO-ROHIT / ml-systems-notes
a personal collection of my notes for ml sys
☆107 · Updated this week
SWRMLabs / ants-db
Distributed KV store using go-ds-crdt and libp2p
☆12 · Nov 28, 2021 · Updated 4 years ago
gouzigouzi / attention-residuals-for-chinese-llms
A Chinese-focused PyTorch framework for exploring Attention Residuals in Qwen3-style causal LMs, with baseline, Block AttnRes, Full AttnR…
☆19 · May 3, 2026 · Updated 2 months ago
datawhalechina / wow-rag
A simple, cross-platform RAG framework and tutorial
☆231 · Jan 17, 2026 · Updated 6 months ago
kaaeaate / 2d_gaussian_splatting
☆11 · Dec 16, 2023 · Updated 2 years ago
mattsse / hyperswarm-dht
Rust implementation of the DHT powering the HyperSwarm stack
☆18 · Apr 1, 2022 · Updated 4 years ago
maswx / vu13p_corundum
corundum work on vu13p
☆23 · Nov 10, 2023 · Updated 2 years ago
AlexwellChen / Toy_ML_Framework
☆11 · May 16, 2026 · Updated 2 months ago
ash-neupane / multi-token-pred
Train toy models using multi-token prediction objective
☆14 · Apr 18, 2026 · Updated 3 months ago