A source-code walkthrough of MiniMind, a lightweight large language model, covering the full pipeline: tokenizer, RoPE, MoE, KV Cache, pretraining, SFT, LoRA, DPO, and more
☆1,005 · Jun 16, 2025 · Updated 11 months ago
Alternatives and similar repositories for MiniMind-in-Depth
Users who are interested in MiniMind-in-Depth are comparing it to the libraries listed below.
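- 🎓 A highly detailed walkthrough of the Minimind project for training a large model from scratch, including but not limited to the architecture and algorithms it uses, plus LLM interview experience ☆810 · Apr 17, 2026 · Updated last month
- 🧠 [LLM] Train a 64M-parameter LLM from scratch in just 2h! ☆50,007 · Updated this week
- 🚀 [Build an LLM from Scratch] A minimalist guide to LLM training principles and practice, with core code and controlled experiments for Transformer, Pretraining, and SFT. | A minimal, principle-first guide to understanding and building… ☆111 · May 8, 2026 · Updated 2 weeks ago
- An intelligent medical literature analysis tool built on the Model Context Protocol (MCP). It aims to help researchers, medical practitioners, and students quickly search the PubMed database and use large language models (LLMs) to analyze and summarize literature abstracts ☆10 · May 18, 2025 · Updated last year
- 👀 [LLM] Train a 65M-parameter VLM from scratch in just 2h! ☆7,947 · Updated this week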
- Nvidia Deepstream FFMPEG RTSP to HTTP Streaming ☆15 · Sep 25, 2020 · Updated 5 years ago
- ☆128 · Oct 11, 2025 · Updated 7 months ago
- This project applies LoRA fine-tuning to Deepseek-R1-Distill-Qwen-7B on psychological-counseling CoT data to further improve its slow-thinking ability in the counseling domain. ☆12 · Mar 11, 2025 · Updated last year
- Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers ☆14,236 · Apr 30, 2025 · Updated last year
- PyTorch Lightning implementation of Generative Recommenders ☆112 · Sep 17, 2024 · Updated last year
- A PyTorch implementation of WaveNet ASR (Automatic Speech Recognition) ☆13 · Sep 22, 2021 · Updated 4 years ago
- Beihang University "Parallel Programming" lab collection (ranked 1st in the speed competition) ☆31 · Feb 23, 2023 · Updated 3 years ago
- 🚀 Lightweight video 🎥 large model 🤖 ☆22 · Apr 27, 2025 · Updated last year
- BUPT Joint Programme with QMUL ☆20 · Dec 21, 2023 · Updated 2 years ago
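- "A Guide to Using Open-Source LLMs": tutorials tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment ☆30,517 · Apr 24, 2026 · Updated 3 weeks ago
- 📚 Build a large language model from scratch ☆30,496 · May 6, 2026 · Updated 2 weeks ago
- ☆13 · Mar 2, 2025 · Updated last year
- The simplest Local Knowledge Base example based on Langchain and Chat-GLM ☆13 · Jun 9, 2023 · Updated 2 years ago
- Official implementation of Towards Robust Model Watermark via Reducing Parametric Vulnerability ☆18 · Jun 3, 2024 · Updated last year
- This project shares the technical principles behind large models along with hands-on experience (LLM engineering and real-world LLM application deployment) ☆24,286 · May 10, 2026 · Updated last week
- LaTeXDataHub is an open-source platform dedicated to the sharing and contribution of real-world LaTeX image datasets and their annotation… ☆12 · Aug 13, 2024 · Updated last year
- Third course project for Operating Systems: a simple file system ☆12 · Jun 24, 2021 · Updated 4 years ago
- AI learning journey ☆108 · Jul 24, 2025 · Updated 9 months ago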
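- Mini_RWKV_V7_LM ☆65 · Jan 26, 2026 · Updated 3 months ago
- [ACM MM 2025] Mobile U-ViT: Revisiting large kernel and U-shaped ViT for efficient medical image segmentation ☆58 · Oct 29, 2025 · Updated 6 months ago
- Awesome Few-Shot Learning on Graphs ☆25 · Apr 27, 2025 · Updated last year
- Data and code of paper published on EMSE: "TraceSim: An Alignment Method for Computing Stack Trace Similarity" ☆10 · May 6, 2022 · Updated 4 years ago
- Reading and discussion group for "Pattern Recognition and Machine Learning" ☆35 · May 20, 2019 · Updated 7 years ago
- An AI spreadsheet-processing tool based on FastAPI + LangChain + OpenAI API + Vue, for intelligent processing and analysis of tabular data. ☆20 · Jul 14, 2025 · Updated 10 months ago
- Tsinghua University Introduction to Artificial Intelligence (Prof. Long Mingsheng) course slides, assignments, and exam questions ☆16 · Jun 26, 2023 · Updated 2 years ago
- A Chinese-language tutorial that helps beginners get started with embodied AI through the LeRobot project ☆133 · Jan 17, 2025 · Updated last year
- A LaTeX document class designed for making practice workbooks for the graduate entrance exam ☆135 · Updated this week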
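- ☆12 · Aug 25, 2023 · Updated 2 years ago
- [ACM MM2024] The code for HMLLM. ☆11 · Oct 27, 2024 · Updated last year
- "A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe ☆4,855 · Feb 12, 2026 · Updated 3 months ago
- ☆16 · Sep 22, 2023 · Updated 2 years ago
- 📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice ☆49,333 · May 14, 2026 · Updated last week
- Automatic daily reporting for the 石大 (Shida) epidemic-prevention app; simple to deploy within GitHub and easy to get started ☆14 · Oct 22, 2022 · Updated 3 years ago
- RePOSE: 3D Human Pose Estimation via Spatio-Temporal Depth Relational Consistency ☆19 · Oct 2, 2024 · Updated last year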