ShaohonChen / transformers_from_scratch

Pretrain a wiki LLM using transformers

☆48

Alternatives and similar repositories for transformers_from_scratch

Users who are interested in transformers_from_scratch are comparing it to the libraries listed below.

zhaibowen / Retriever
Retriever-0.1B
☆93 Updated last year
owenliang / qwen-vllm
Tongyi Qianwen (Qwen) vLLM inference deployment demo
☆595 Updated last year
cwxndl / LLM
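Large language model applications: RAG, NL2SQL, chatbots, pretraining, MoE (mixture-of-experts) models, fine-tuning, reinforcement learning, and Tianchi data competitions
☆66 Updated 5 months ago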
AI-Study-Han / Zero-Chatgpt
Walk through the ChatGPT technical roadmap from scratch.
☆250 Updated 11 months ago
REXWindW / my_llm
An attempt to write an LLM from scratch, with reference to llama and nanogpt
☆64 Updated last year
owenliang / agent
qwen ai agent
☆135 Updated last year
Mxoder / LLM-from-scratch
Notes on reproducing LLM-related work from scratch
☆210 Updated 3 months ago
open-chinese / alpaca-chinese-dataset
Alpaca Chinese Dataset -- Chinese instruction fine-tuning dataset
☆210 Updated 10 months ago
jiahe7ay / MINI_LLM
This is a repository used by individuals to experiment and reproduce the pre-training process of LLM.
☆460 Updated 3 months ago
MetaGLM / OpenLM
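This project aims to give beginners in the large-model field a comprehensive body of knowledge, covering both basic and advanced material, so that developers can quickly master the large-model technology stack and gain a thorough understanding of the related topics.
☆60 Updated 7 months ago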
Tongjilibo / build_MiniLLM_from_scratch
Build a MiniLLM from 0 to 1 (pretrain + SFT + DPO, work in progress)
☆462 Updated 4 months ago
waylandzhang / DeepSeek-RL-Qwen-0.5B-GRPO-gsm8k
☆85 Updated 6 months ago
OpenBMB / MiniCPM-CookBook
This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…
☆268 Updated last month
KMnO4-zx / TinyRAG
TinyRAG
☆317 Updated last month
Nipi64310 / RAG-Book
This project collects the code and materials for the book 《大模型RAG实战》 (Large Model RAG in Practice).
☆245 Updated 8 months ago
Zeyi-Lin / Qwen3-Medical-SFT
Qwen3 Fine-tuning: Medical R1 Style Chat
☆132 Updated 2 months ago
datawhalechina / llm-deploy
Theory and practice of large model / LLM inference and deployment
☆304 Updated 3 weeks ago
hzg0601 / LLM-Notes
An overview of the large-model technology stack
☆108 Updated 10 months ago
datawhalechina / unlock-hf
Unlock the many ways to use the HuggingFace ecosystem
☆93 Updated 7 months ago
liucongg / LLMsBook
A practical guide to large language models: applied practice and real-world deployment
☆75 Updated 10 months ago
Sshuoshuo / easy-rag
A quick start to RAG and private deployment
☆199 Updated last year
charent / Phi2-mini-Chinese
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).
☆561 Updated last year
KMnO4-zx / TinyAgent
A hand-built Agent demo based on ReAct
☆144 Updated last month
SmartFlowAI / LLM101n-CN
LLM101n: Let's build a Storyteller (Chinese edition)
☆132 Updated 11 months ago
owenliang / qwen-dpo
DPO training for Tongyi Qianwen (Qwen)
☆51 Updated 10 months ago
bbruceyuan / LLMs-101
Implement a miniLLM from zero to one (hands-on LLM learning)
☆75 Updated last year
Tongyi-EconML / FinQwen
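FinQwen: aims to build an open, stable, high-quality financial large-model project, using large models to build an intelligent question-answering system for financial scenarios and using open source to advance "AI + finance".
☆402 Updated last year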
datawhalechina / unlock-deepseek
Interpretation, extension, and reproduction of the DeepSeek series of work.
☆667 Updated 4 months ago
qiufengqijun / mini_qwen
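A project for training a large language model from scratch, including pretraining, fine-tuning, and direct preference optimization; the model has 1B parameters and supports Chinese and English.
☆536 Updated 5 months ago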
mobvoi / seq-monkey-data
☆151 Updated last year