wei-potato / Train-llm-from-scratch
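Train an LLM from scratch with DeepSpeed, going through the pretraining and SFT stages, to verify the LLM's ability to learn knowledge, understand language, and answer questions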
☆155Updated 4 months ago
Related projects
Alternatives and complementary repositories for Train-llm-from-scratch
- Support mixed-precision inference with vllm☆95Updated 2 weeks ago
- This tool (enhance_long) aims to enhance the LlaMa2 long-context extrapolation capability at the lowest cost, preferably without …☆47Updated 11 months ago
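- This project aims to build on representative work by earlier researchers to evaluate SFT data along multiple dimensions and filter SFT data automatically.☆55Updated 8 months ago
- [grps with trtllm] A pure C++ high-performance OpenAI LLM service built with GPRS + TensorRT-LLM + Tokenizers.cpp. Supports chat and function call modes, AI agents, distributed multi-GPU inference, multimodal input, and a gradio chat interface.☆92Updated 2 weeks ago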
- A repo for my NanoGPT implementation in PyTorch 2.0, written soon after torch 2.0 was released: faster and simpler, and a good tutorial for learning GPT.☆60Updated 9 months ago
- Mixed-precision inference with TensorRT-LLM☆93Updated 3 weeks ago
- Teaches you to implement a deep learning framework step by step, using only the most basic Python syntax and numpy☆120Updated 3 months ago
- A collection of papers related to knowledge fusion☆63Updated last month
- Building the first open-source version of KimiChat!☆137Updated this week
- Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models☆129Updated 2 weeks ago
- A multimodal, multi-agent development framework☆56Updated last week
- Harnessing the Power of AI to Navigate the Information Age – Uncovering Truth, Promoting Transparency, and Championing Fact-Based Discour…☆208Updated last year
- Outbound-call follow-up robot for the insurance industry☆74Updated last year
- Aiming to build the most comprehensive machine learning blog.☆153Updated this week
- A Contextual RAG Bot Framework☆107Updated 3 weeks ago
- ☆148Updated 6 months ago
- High performance rank executor for advertisement and recommendation system, implemented in C/C++ and support ensembled into Java/Scala ho…☆99Updated 8 months ago
- This includes the original implementation of CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control.☆66Updated last month
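- This project shows how to use GPT to automatically retrieve files in a repository (such as PDF, XLS, Word, etc.) and complete multimodal tasks. Video frames from a home camera can be fed into the repository to automatically judge whether something dangerous is happening at home (using the large model's understanding of the world).☆85Updated 3 months ago
- Down-to-earth large-model engineering, aiming to become a practical encyclopedia of large models☆17Updated last year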
- Emotion text classification using Llama3-8b with LoRA and FlashAttention. Based on LLaMA-Factory.☆48Updated 3 months ago
- ☆84Updated 3 weeks ago
- An easy-to-use tool for comprehensive response evaluation of LLMs☆45Updated 3 weeks ago
- [NeurIPS 2024] EffiBench: Benchmarking the Efficiency of Automatically Generated Code☆57Updated last month
- MIXQ: Taming Dynamic Outliers in Mixed-Precision Quantization by Online Prediction☆81Updated 3 weeks ago
- MPLSandbox is an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler a…☆20Updated this week
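- [Deep learning model deployment framework] Supports tf/torch/trt/trtllm/vllm and more NN frameworks, dynamic batching and streaming modes, and both Python and C++; restrictable, extensible, and high-performance. Helps users quickly deploy models online and, via http/rpc interfaces, …☆165Updated last week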
- HeFlwr: Federated Learning for Heterogeneous Devices☆130Updated last month