Oneflow-Inc / oneflow-liteLinks

☆18

Alternatives and similar repositories for oneflow-lite

Users that are interested in oneflow-lite are comparing it to the libraries listed below

Sorting:

Oneflow-Inc / oneflow_convert
OneFlow->ONNX
☆43Updated 2 years ago
Adlik / model_zoo
☆11Updated 2 weeks ago
Oneflow-Inc / oneflow-xrt
☆23Updated 2 years ago
Oneflow-Inc / one-fx
A toolkit for developers to simplify the transformation of nn.Module instances. It's now corresponding to Pytorch.fx.
☆13Updated 2 years ago
Oneflow-Inc / serving
OneFlow Serving
☆20Updated 7 months ago
yester31 / Cutlass_EX
study of cutlass
☆22Updated last year
DeepLink-org / CVFusion
CVFusion is an open-source deep learning compiler to fuse the OpenCV operators.
☆32Updated 3 years ago
jimmy-evo / opencl_kernels
An easy way to run, test, benchmark and tune OpenCL kernel files
☆24Updated 2 years ago
OpenPPL / ppl.kernel.cuda
☆38Updated last year
BBuf / tensorrt-llm-moe
☆33Updated 10 months ago
flagos-ai / libtriton_jit
A Triton JIT runtime and ffi provider in C++
☆29Updated this week
Bruce-Lee-LY / cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
☆20Updated 4 months ago
pigirons / conv3x3_m1
This is a demo how to write a high performance convolution run on apple silicon
☆57Updated 3 years ago
tlc-pack / libflash_attn
Standalone Flash Attention v2 kernel without libtorch dependency
☆112Updated last year
InfiniTensor / RefactorGraph
分层解耦的深度学习推理引擎
☆76Updated 9 months ago
billmuch / matmul_perf_test
☆15Updated 3 years ago
OAID / TengineInferPipe
☆24Updated 2 years ago
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36Updated last week
Oneflow-Inc / vision
Datasets, Transforms and Models specific to Computer Vision
☆90Updated 2 years ago
ModelTC / pyvlova
Yet another Polyhedra Compiler for DeepLearning
☆19Updated 2 years ago
Syencil / Programming_Massively_Parallel_Processors
CUDA 6大并行计算模式代码与笔记
☆61Updated 5 years ago
LeiWang1999 / AutoGPTQ.tvm
GPTQ inference TVM kernel
☆40Updated last year
weishengying / cutlass_flash_atten_fp8
使用 cutlass 仓库在 ada 架构上实现 fp8 的 flash attention
☆78Updated last year
OpenPPL / ppl.kernel.cpu
☆19Updated last year
tlc-pack / cutlass_fpA_intB_gemm
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
☆96Updated 2 months ago
OpenPPL / ppl.pmx
☆60Updated last year
BBuf / how-to-optimize-gemm
☆98Updated 4 years ago
KuangjuX / CUDAKernels
🎉My Collections of CUDA Kernels~
☆11Updated last year
OpenPPL / CuAssembler
An unofficial cuda assembler, for all generations of SASS, hopefully ：）
☆84Updated 2 years ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023复赛选题：通义千问Qwen-7B用TensorRT-LLM模型搭建及优化
☆43Updated 2 years ago