bbruceyuan / bit-brain
Train your own BitBrain (a mini LLM) with as little as a single RTX 3090 (work in progress) 🧠.
☆22 · Updated this week
Alternatives and similar repositories for bit-brain
Users interested in bit-brain are comparing it to the repositories listed below
- ☆41 · Updated 3 months ago
- A hands-on guide to large language models: application practice and real-world deployment. ☆73 · Updated 9 months ago
- ☆22 · Updated 4 months ago
- DPO training for Qwen (通义千问). ☆49 · Updated 9 months ago
- From MHA, MQA, and GQA to MLA, by 苏剑林 (Su Jianlin), with code. ☆22 · Updated 4 months ago
- An LLM RAG application supporting API calls and voice interaction. ☆11 · Updated last year
- First-place solution to the Tianchi algorithm competition "BetterMixture - LLM Data Mixing Challenge". ☆31 · Updated 11 months ago
- Train a Chinese mini LLM from scratch that can hold basic conversations, with the model size determined by the hardware at hand. ☆60 · Updated 10 months ago
- AFAC2024 Financial Intelligence Innovation Competition. ☆43 · Updated 7 months ago
- Unlocking the many uses of the Hugging Face ecosystem. ☆91 · Updated 6 months ago
- vLLM documentation in Simplified Chinese / vLLM 中文文档. ☆80 · Updated last month
- unify-easy-llm (ULM) aims to be a simple one-click LLM training tool that supports different hardware (NVIDIA GPUs, Ascend NPUs, etc.) and common large models. ☆55 · Updated 11 months ago
- The simplest reproduction of R1-style results on small models, explaining the most important essence of o1-like models and DeepSeek R1: "Think is all you need." Experiments support that, for strong reasoning ability, the thinking-process content is the core of AGI/ASI. ☆45 · Updated 4 months ago
- A pure C++ cross-platform LLM acceleration library with Python bindings, supporting Baichuan, GLM, LLaMA, and MOSS base models; runs ChatGLM-6B-class models smoothly on mobile and reaches 10,000+ tokens/s on a single GPU. ☆45 · Updated last year
- The newest version of Llama 3, with the source code explained line by line in Chinese. ☆22 · Updated last year
- Official repository for the SIGIR 2024 demo paper "An Integrated Data Processing Framework for Pretraining Foundation Models". ☆82 · Updated 10 months ago
- Chinese-pretrained ModernBert. ☆69 · Updated 2 months ago
- An analysis of the official transformers source code. In the era of large AI models, PyTorch and transformers are the new operating system; everything else is software running on top of them. ☆17 · Updated last year
- LLM applications: RAG, NL2SQL, chatbots, pretraining, MoE (mixture-of-experts) models, fine-tuning, reinforcement learning, and Tianchi data competitions. ☆62 · Updated 4 months ago
- Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory. ☆27 · Updated last year
- A beginner's tutorial on model compression. ☆22 · Updated 11 months ago
- ☆109 · Updated 7 months ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang. ☆53 · Updated 7 months ago
- Welcome to the hands-on project repository of 筱可AI研习社! It stores and showcases the practical projects written for the WeChat official account, which are continuously refined and iterated to explore the limitless possibilities of AI. ☆56 · Updated this week
- A toolkit for knowledge distillation of large language models. ☆95 · Updated this week
- Qwen1.5 SFT (Alibaba, 阿里): fine-tuning Qwen_Qwen1.5-2B-Chat / Qwen_Qwen1.5-7B-Chat with transformers, LoRA (peft), and inference. ☆63 · Updated last year
- Learning vLLM: deploy the Qwen2-0.5B model with vLLM, and deploy it with Docker. ☆18 · Updated last year
- This project provides LLM beginners with a comprehensive knowledge system, covering both fundamentals and advanced topics, so developers can quickly master the LLM tech stack and gain a broad understanding of the field. ☆61 · Updated 5 months ago
- Accelerate vector generation using an ONNX model. ☆17 · Updated last year
- A tutorial demonstrating Chinese instruction fine-tuning of Gemma. ☆46 · Updated last year