ninehills / llm-inference-benchmark

LLM Inference benchmark

☆428

Alternatives and similar repositories for llm-inference-benchmark

Users interested in llm-inference-benchmark are comparing it to the libraries listed below.

Tencent / KsanaLLM
☆508 Updated last month
NascentCore / llm-numbers-cn
Chinese version of llm-numbers
☆125 Updated last year
alibaba / rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
☆903 Updated this week
inferflow / inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
☆249 Updated last year
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆267 Updated 2 months ago
hpcaitech / SwiftInfer
Efficient AI Inference & Serving
☆477 Updated last year
alipay / PainlessInferenceAcceleration
Accelerate inference without tears
☆361 Updated 2 weeks ago
intel / xFasterTransformer
☆430 Updated last month
FlagOpen / FlagScale
FlagScale is a large model toolkit based on open-sourced projects.
☆364 Updated last week
vectorch-ai / ScaleLLM
A high-performance inference system for large language models, designed for production environments.
☆479 Updated 3 weeks ago
alibaba / ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
☆433 Updated last week
mindspore-lab / mindformers
☆175 Updated this week
THUDM / FasterTransformer
Transformer related optimization, including BERT, GPT
☆39 Updated 2 years ago
InternLM / InternEvo
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencie…
☆411 Updated 2 months ago
alibaba / Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
☆659 Updated last year
OpenPPL / ppl.llm.serving
☆129 Updated 10 months ago
multimodal-art-projection / MAP-NEO
☆964 Updated 8 months ago
pandada8 / llm-inference-benchmark
LLM inference service performance testing
☆44 Updated last year
THUDM / LongBench
LongBench v2 and LongBench (ACL 25'&24')
☆1,005 Updated 9 months ago
antgroup / glake
GLake: optimizing GPU memory management and IO transmission.
☆486 Updated 7 months ago
david-xinyuwei / david-share
☆360 Updated this week
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆618 Updated last month
HIT-SCIR / Chinese-Mixtral-8x7B
Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B)
☆655 Updated last year
DeepLink-org / dlinfer
☆64 Updated last week
HFAiLab / hai-platform
A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute
☆707 Updated 2 years ago
hahnyuan / LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod…
☆568 Updated last year
LLMServe / DistServe
Disaggregated serving system for Large Language Models (LLMs).
☆709 Updated 6 months ago
onejune2018 / Awesome-LLM-Eval
Awesome-LLM-Eval: a curated list of tools, datasets/benchmark, demos, leaderboard, papers, docs and models, mainly for Evaluation on LLMs…
☆574 Updated 2 months ago
microsoft / MInference
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention…
☆1,143 Updated last month
OpenBMB / UltraEval
[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.
☆251 Updated last year