A source-code walkthrough of the lightweight large language model MiniMind, covering the full pipeline: tokenizer, RoPE, MoE, KV Cache, pretraining, SFT, LoRA, DPO, and more
☆958 · Jun 16, 2025 · Updated 10 months ago
Alternatives and similar repositories for MiniMind-in-Depth
Users that are interested in MiniMind-in-Depth are comparing it to the libraries listed below.
- 🎓 An in-depth walkthrough of the Minimind project for training a large model from scratch, including but not limited to the architecture and algorithms used, plus LLM interview experience ☆703 · Apr 17, 2026 · Updated 2 weeks ago
- Reproducing minimind from scratch 👉 minimind-v ☆292 · Dec 24, 2025 · Updated 4 months ago
- 🚀🚀 [LLM] 🌏 Train a 64M-parameter GPT from scratch in just 2h! ☆48,315 · Apr 24, 2026 · Updated last week
- 🚀 [Build an LLM from scratch] A minimalist guide to the principles and practice of LLM training, with core code and controlled experiments for Transformer, Pretraining, and SFT. | A minimal, principle-first guide to understanding and building… ☆97 · Apr 5, 2026 · Updated 3 weeks ago
- 🚀 [LLM] 🌏 Train a 65M-parameter VLM from scratch in just 2 hours! ☆7,685 · Updated this week
- Spa3R: Predictive Spatial Field Modeling for 3D Visual Reasoning ☆50 · Mar 25, 2026 · Updated last month
- ☆130 · Oct 11, 2025 · Updated 6 months ago
- This project LoRA fine-tunes Deepseek-R1-Distill-Qwen-7B on psychological-counseling CoT data to further improve its slow-thinking ability in the counseling domain. ☆12 · Mar 11, 2025 · Updated last year
- I love reinforcement learning. ☆13 · Jan 15, 2025 · Updated last year
- Notes on knowledge and interview questions for large language model (LLMs) algorithm (application) engineers ☆14,035 · Apr 30, 2025 · Updated last year
- Be notified when reservations open up at OpenTable ☆11 · Jul 22, 2017 · Updated 8 years ago
- PyTorch Lightning implementation of Generative Recommenders ☆110 · Sep 17, 2024 · Updated last year
- Collection of labs for Beihang University's "Parallel Programming" course (Rank 1 in the speed competition) ☆31 · Feb 23, 2023 · Updated 3 years ago
- 🚀 Lightweight video 🎥 large model 🤖 ☆22 · Apr 27, 2025 · Updated last year
- BUPT Joint Programme with QMUL ☆21 · Dec 21, 2023 · Updated 2 years ago
- "A Guide to Consuming Open-Source LLMs": tutorials tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source LLMs and multimodal LLMs (MLLMs) from China and abroad in a Linux environment ☆30,170 · Apr 24, 2026 · Updated last week
- Research project on models of opinion formation ☆16 · Jul 19, 2016 · Updated 9 years ago
- LinkerHand Dexterous Hands URDF Model Files Repository. ☆33 · Apr 14, 2026 · Updated 2 weeks ago
- 📚 Building a large model from scratch ☆29,752 · Mar 16, 2026 · Updated last month
- Kafka Helm Charts ☆14 · Jun 12, 2025 · Updated 10 months ago
- ☆741 · Jan 12, 2026 · Updated 3 months ago
- The simplest Local Knowledge Base example based on Langchain and Chat-GLM ☆13 · Jun 9, 2023 · Updated 2 years ago
- Open-source license assistant ☆16 · Jun 20, 2025 · Updated 10 months ago
- This project shares the technical principles behind large models and hands-on experience (LLM engineering and deploying LLM applications) ☆24,109 · Mar 12, 2026 · Updated last month
- MiniGPT-Pancreas: Multimodal Large language Model for Pancreas Cancer Classification and Detection ☆12 · Sep 19, 2025 · Updated 7 months ago
- Third course project for an operating systems class: a simple file system ☆12 · Jun 24, 2021 · Updated 4 years ago
- [ACM MM 2025] Mobile U-ViT: Revisiting large kernel and U-shaped ViT for efficient medical image segmentation ☆57 · Oct 29, 2025 · Updated 6 months ago
- ☆35 · Jul 9, 2020 · Updated 5 years ago
- Awesome Few-Shot Learning on Graphs ☆25 · Apr 27, 2025 · Updated last year
- Data and code of paper published on EMSE: "TraceSim: An Alignment Method for Computing Stack Trace Similarity" ☆10 · May 6, 2022 · Updated 3 years ago
- An MCP Server that provides IP geolocation lookup (country, region, city, etc.) via ip-api.com. ☆13 · May 26, 2025 · Updated 11 months ago
- 📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice ☆40,264 · Apr 23, 2026 · Updated last week
- ☆12 · Aug 25, 2023 · Updated 2 years ago
- Highly interactive graph data visualization ☆15 · Oct 13, 2021 · Updated 4 years ago
- The supplementary material for the paper "Fine-tuning Large Language Models to Improve Accuracy and Comprehensibility of Automated Code R… ☆16 · Aug 12, 2024 · Updated last year
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks" ☆25 · Mar 10, 2026 · Updated last month
- "A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe ☆4,781 · Feb 12, 2026 · Updated 2 months ago
- ☆16 · Sep 22, 2023 · Updated 2 years ago
- An Agent built on the InternLm chat 7B base model that can call MMYOLO tools to perform vision tasks on images ☆11 · Oct 30, 2024 · Updated last year