second-state / meetups

☆70

Alternatives and similar repositories for meetups

Users who are interested in meetups are comparing it to the repositories listed below.

OpenPPL / ppl.llm.serving
☆128 Updated 8 months ago
DeepLink-org / ditorch
☆23 Updated 7 months ago
Ascend / AscendSpeed
☆79 Updated last year
Rayrtfr / FasterTransformer
Transformer related optimization, including BERT, GPT
☆17 Updated 2 years ago
ninehills / llm-inference-benchmark
LLM Inference benchmark
☆426 Updated last year
OpenBMB / CPM.cu
CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec…
☆173 Updated 3 weeks ago
infinigence / Semi-PD
A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.
☆104 Updated 3 months ago
DeepLink-org / dlinfer
☆54 Updated last week
feifeibear / LLMRoofline
Compare different hardware platforms via the Roofline Model for LLM inference tasks.
☆111 Updated last year
modelbox-ai / modelbox
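A high-performance, highly extensible, easy-to-use framework for AI applications. Provides AI application developers with a unified, high-performance, easy-to-use programming framework for quickly building AI industry applications that span device, edge, and cloud on top of full-stack AI services, with support for GPU, …
☆157 Updated last year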
openmlsys / openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
☆130 Updated 2 years ago
Deep-Spark / DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of AI and general-purpose computing, to suppo…
☆65 Updated this week
OpenPPL / ppl.nn.llm
☆140 Updated last year
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆263 Updated 2 weeks ago
AlibabaPAI / torchacc
PyTorch distributed training acceleration framework
☆51 Updated last week
zhaochenyang20 / ModelServer
Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang
☆55 Updated 9 months ago
OpenBMB / cpm_kernels
☆24 Updated last year
inferflow / inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
☆246 Updated last year
OpenPPL / ppl.pmx
☆59 Updated 9 months ago
volcengine / veGiantModel
☆220 Updated 2 years ago
zw0610 / zw0610.github.io
☆58 Updated 5 years ago
FlagOpen / FlagCX
☆84 Updated last week
bytedance / ByteMLPerf
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver…
☆257 Updated this week
bytedance / ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
☆475 Updated last year
Oneflow-Inc / oneflow_convert
OneFlow->ONNX
☆43 Updated 2 years ago
OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deploy…
☆86 Updated last year
Oneflow-Inc / DLPerf
DeepLearning Framework Performance Profiling Toolkit
☆287 Updated 3 years ago
FlagTree / flagtree
FlagTree is a unified compiler for multiple AI chips, which is forked from triton-lang/triton.
☆72 Updated this week
THUDM / FasterTransformer
Transformer related optimization, including BERT, GPT
☆39 Updated 2 years ago
frankwang0818 / AI_compiler_development_guide
Free resource for the book AI Compiler Development Guide
☆46 Updated 2 years ago