hyperai/vllm-cn

vLLM Documentation in Simplified Chinese

☆189

Alternatives and similar repositories for vllm-cn

Users who are interested in vllm-cn are comparing it to the libraries listed below.

vedaldi / micro_llama
A tiny, didactical implementation of LLAMA 3
☆42 · Dec 2, 2024 · Updated last year
submissions-ai / Rate-Perception-Optimized-Preprocessing-for-Video-Coding
☆16 · Feb 22, 2025 · Updated last year
PaddlePaddle / flash-attention
Fast and memory-efficient exact attention
☆21 · Updated this week
RUC-GSAI / Llama-3-SynE
Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …
☆40 · May 31, 2025 · Updated last year
jiaming-wang / Super-Resolution-And-Person-re-identification-Benchmarks
A collection of state-of-the-art Super-Resolution/Person re-identification architectures.
☆18 · Nov 9, 2020 · Updated 5 years ago
Infatoshi / grokking-megakernels
Companion code for Grokking Megakernels: fuse an entire LLM forward pass into a single CUDA kernel
☆23 · Feb 9, 2026 · Updated 5 months ago
sumitsidana / recsys_challenge_2020
This repository contains the code for the 4th place solution to RecSys Challenge 2020.
☆18 · Sep 26, 2020 · Updated 5 years ago
Syencil / ncnn-android-projects
Android demo of mobilev2-yolo5s and retinaface
☆17 · Nov 14, 2020 · Updated 5 years ago
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆87,317 · Updated this week
curtis18 / swoft-pgsql
Swoft Postgresql Component
☆10 · Nov 4, 2019 · Updated 6 years ago
JamyDon / PLM-based-CGEC-Model-Ensemble
[ACL 2023] Are Pre-trained Language Models Useful for Model Ensemble in Chinese Grammatical Error Correction?
☆10 · Dec 15, 2025 · Updated 7 months ago
THU-KEG / DeepPrune
🌿 DeepPrune: Parallel Scaling without Inter-trace Redundancy
☆21 · Apr 20, 2026 · Updated 3 months ago
sgl-project / sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
☆30,854 · Updated this week
winter1203 / vllm_GOT2_OCR
Accelerating GOT-OCRv2 with VLLM
☆10 · Nov 15, 2024 · Updated last year
toyaix / triton-runner
Multi-Level Triton Runner supporting Python, IR, PTX, AMDGCN, cubin and hsaco.
☆98 · May 8, 2026 · Updated 2 months ago
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆6,053 · Updated this week
coze-dev / cozeloop-js
The JavaScript SDK for CozeLoop 🧭
☆20 · Apr 22, 2026 · Updated 3 months ago
OpenMOSE / RWKV-Infer
A large-scale RWKV v7 (World, PRWKV, Hybrid-RWKV) inference. Capable of inference by combining multiple states (Pseudo MoE). Easy to deploy…
☆51 · Oct 21, 2025 · Updated 9 months ago
modelscope / evalscope
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
☆3,159 · Updated this week
a-r-r-o-w / productionizing-diffusion
Optimizing diffusion for production-ready speeds
☆40 · Jan 10, 2026 · Updated 6 months ago
coze-dev / coze-android
☆15 · Jan 30, 2026 · Updated 5 months ago
chaoql / CCF-AIOps-Code
2024 CCF International AIOps Challenge, Track 2 (GLM4): solution write-up for the retrieval-augmented operations and maintenance knowledge Q&A challenge.
☆14 · Jul 5, 2024 · Updated 2 years ago
intel / xFasterTransformer
☆435 · Sep 18, 2025 · Updated 10 months ago
Bruce-Lee-LY / cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
☆20 · Aug 3, 2025 · Updated 11 months ago
l-sf / LightTrack_openvino
This repository deploys the LightTrack tracking algorithm with the Intel OpenVINO Toolkit and includes inference code in both Python and C++.
☆21 · Nov 2, 2023 · Updated 2 years ago
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆6,063 · Updated this week
abachaa / VQA-Med-2021
VQA-Med 2021
☆24 · May 13, 2026 · Updated 2 months ago
AmbientTalk / wePoker
wePoker is a multi-player poker game for Android
☆12 · Mar 20, 2013 · Updated 13 years ago
cqu20160901 / yolov8seg_rknn_Cplusplus
On-board C++ deployment of yolov8seg with Rockchip rknn, targeting the rk3588 platform.
☆31 · May 8, 2024 · Updated 2 years ago
Kingsley-Cheng / UCAS
Courses in UCAS
☆14 · Jun 12, 2023 · Updated 3 years ago
lqtrung1998 / mwp_cot_design
☆14 · Oct 11, 2023 · Updated 2 years ago
kingname / SifouSource
Companion source code for a collection of common mistakes in Python business application development
☆10 · Dec 19, 2020 · Updated 5 years ago
yynil / RWKVinLLAMA
☆17 · Jan 1, 2025 · Updated last year
hahnyuan / LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod…
☆665 · Sep 11, 2024 · Updated last year
THU-KEG / LLMAEL
[CIKM 2025] LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking
☆17 · Sep 6, 2025 · Updated 10 months ago
liguodongiot / llm-action
This project shares the technical principles behind large models along with hands-on experience (large-model engineering and putting large-model applications into production).
☆24,813 · Jul 19, 2026 · Updated last week
vllm-project / llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
☆3,593 · Updated this week
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆861 · Jan 5, 2026 · Updated 6 months ago
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,679 · Apr 26, 2026 · Updated 3 months ago