Oneflow-Inc / oneflow-mlu

☆7

Alternatives and similar repositories for oneflow-mlu

Users that are interested in oneflow-mlu are comparing it to the libraries listed below

Oneflow-Inc / oneflow_convert
OneFlow->ONNX
☆43 Updated 2 years ago
Adlik / model_zoo
☆11 Updated last year
Oneflow-Inc / oneflow-xrt
☆23 Updated 2 years ago
OpenPPL / ppl.nn.llm
☆139 Updated last year
OpenPPL / ppl.kernel.cuda
☆36 Updated 7 months ago
OpenPPL / ppl.llm.kernel.cuda
☆148 Updated 4 months ago
OpenPPL / ppl.kernel.cpu
☆17 Updated last year
MARD1NO / CUDA-PPT
☆93 Updated 2 months ago
OpenPPL / ppl.pmx
☆58 Updated 6 months ago
OpenPPL / ppl.llm.serving
☆127 Updated 5 months ago
OpenPPL / CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
☆83 Updated 2 years ago
Oneflow-Inc / oneflow-documentation
oneflow documentation
☆69 Updated 11 months ago
weishengying / cute_gemm
☆14 Updated 9 months ago
weishengying / cutlass_flash_atten_fp8
Implements fp8 flash attention on the Ada architecture using the cutlass repository
☆68 Updated 9 months ago
Bruce-Lee-LY / flash_attention_inference
Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.
☆37 Updated 3 months ago
void-main / FasterTransformer
Transformer related optimization, including BERT, GPT
☆59 Updated last year
BBuf / how-to-optimize-gemm
☆96 Updated 3 years ago
yester31 / Cutlass_EX
study of cutlass
☆21 Updated 6 months ago
AyakaGEMM / Hands-on-GEMM
☆134 Updated last year
billmuch / matmul_perf_test
☆14 Updated 3 years ago
DeepLink-org / AIChipBenchmark
☆26 Updated last month
PaddlePaddle / CINN
Compiler Infrastructure for Neural Networks
☆145 Updated last year
tpoisonooo / chgemm
symmetric int8 gemm
☆66 Updated 4 years ago
zeroine / cutlass-cute-sample
☆33 Updated last year
Cambricon / torch_mlu
☆25 Updated 2 months ago
Oneflow-Inc / oneflow-lite
☆18 Updated last year
tlc-pack / cutlass_fpA_intB_gemm
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
☆92 Updated last week
OpenPPL / ppl.common
Common libraries for PPL projects
☆29 Updated 2 months ago
YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145 Updated 3 years ago
Cambricon / mlu-ops
Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).
☆122 Updated last week