ztxz16/fastllm

ztxz16 / fastllm

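fastllm is a high-performance LLM inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and hybrid-mode inference for MoE models, and any GPU with 10 GB or more of memory can run the full-size DeepSeek model. On a dual-socket 9004/9005 server with a single GPU, the original full-size, full-precision DeepSeek model reaches 20 tps at single concurrency; the INT4-quantized model reaches 30 tps at single concurrency and 60+ tps with multiple concurrent requests.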

☆4,876

Alternatives and similar repositories for fastllm

Users who are interested in fastllm are comparing it to the libraries listed below.

li-plus / chatglm.cpp
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
☆2,966 · Jul 31, 2024 · Updated 2 years ago
zai-org / ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
☆15,541 · Jun 27, 2024 · Updated 2 years ago
InternLM / lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
☆7,983 · Updated this week
wangzhaode / mnn-llm
LLM deployment project based on MNN. This project has been merged into MNN.
☆1,615 · Jan 20, 2025 · Updated last year
ymcui / Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
☆18,944 · Apr 19, 2026 · Updated 3 months ago
baichuan-inc / Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
☆5,651 · Jul 18, 2024 · Updated 2 years ago
yangjianxin1 / Firefly
Firefly: an LLM training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, …
☆6,649 · Oct 24, 2024 · Updated last year
LianjiaTech / BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational LLM)
☆8,280 · Oct 16, 2024 · Updated last year
hiyouga / ChatGLM-Efficient-Tuning
Efficient fine-tuning of ChatGLM-6B with PEFT
☆3,720 · Oct 12, 2023 · Updated 2 years ago
ModelTC / LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalabili…
☆4,202 · Updated this week
MegEngine / InferLLM
A lightweight LLM inference framework
☆751 · Apr 7, 2024 · Updated 2 years ago
zai-org / ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
☆41,012 · Jun 27, 2024 · Updated 2 years ago
chatchat-space / Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama
☆38,492 · Nov 10, 2025 · Updated 8 months ago
zai-org / ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs
☆13,664 · Jan 13, 2025 · Updated last year
NVIDIA / FasterTransformer
Transformer-related optimization, including BERT and GPT
☆6,445 · Mar 27, 2024 · Updated 2 years ago
mymusise / ChatGLM-Tuning
A fine-tuning solution based on ChatGLM-6B + LoRA
☆3,745 · Nov 25, 2023 · Updated 2 years ago
wenda-LLM / wenda
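Wenda: an LLM invocation platform. It aims at efficient content generation for specific environments, while taking into account the limited computing resources of individuals and small and medium-sized enterprises, as well as knowledge security and privacy concerns
☆6,165 · Jan 23, 2025 · Updated last year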
CVI-SZU / Linly
Chinese-LLaMA 1 & 2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets
☆3,046 · Apr 14, 2024 · Updated 2 years ago
OpenMOSS / MOSS
An open-source tool-augmented conversational language model from Fudan University
☆12,206 · May 27, 2026 · Updated 2 months ago
Facico / Chinese-Vicuna
Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese llama+lora solution, with a structure modeled on alpaca
☆4,118 · Apr 18, 2025 · Updated last year
ymcui / Chinese-LLaMA-Alpaca-2
Phase 2 of the Chinese LLaMA-2 & Alpaca-2 LLM project, with 64K long-context models
☆7,133 · Apr 19, 2026 · Updated 3 months ago
baichuan-inc / Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
☆4,089 · Nov 8, 2024 · Updated last year
baichuan-inc / Baichuan-13B
A 13B large language model developed by Baichuan Intelligent Technology
☆2,931 · Sep 6, 2023 · Updated 2 years ago
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,509 · May 1, 2026 · Updated 3 months ago
TigerResearch / TigerBot
TigerBot: A multi-language multi-task LLM
☆2,260 · Dec 28, 2024 · Updated last year
NVIDIA / TensorRT-LLM
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizat…
☆14,275 · Updated this week
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,591 · Updated this week
QwenLM / Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
☆21,507 · Mar 5, 2026 · Updated 4 months ago
LC1332 / Luotuo-Chinese-LLM
Luotuo (骆驼): Open-Source Chinese Language Models. Developed by 陈启源 @ Central China Normal University & 李鲁鲁 @ SenseTime & 冷子昂 @ SenseTime
☆3,591 · Sep 3, 2023 · Updated 2 years ago
InternLM / InternLM
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
☆7,255 · Oct 30, 2025 · Updated 9 months ago
zai-org / VisualGLM-6B
Chinese and English multimodal conversational language model
☆4,157 · Aug 23, 2024 · Updated last year
hpcaitech / ColossalAI
Making large AI models cheaper, faster and more accessible
☆41,426 · Jul 13, 2026 · Updated 2 weeks ago
hiyouga / LlamaFactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
☆73,659 · Updated this week
AutoGPTQ / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆5,075 · Apr 11, 2025 · Updated last year
mlc-ai / mlc-llm
Universal LLM Deployment Engine with ML Compilation
☆23,011 · Updated this week
hiyouga / FastEdit
🩹Editing large language models within 10 seconds⚡
☆1,370 · Aug 13, 2023 · Updated 2 years ago
Jittor / JittorLLMs
Jittor LLM inference library, featuring high performance, low hardware requirements, good Chinese-language support, and portability
☆2,411 · Feb 22, 2025 · Updated last year
IDEA-CCNL / Fengshenbang-LM
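Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, aiming to become the infrastructure for Chinese AIGC and cognitive intelligence.
☆4,125 · Jun 8, 2026 · Updated last month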
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆87,865 · Updated this week