inisis / OnnxLLM

Large Language Model Onnx Inference Framework

☆36

Alternatives and similar repositories for OnnxLLM

Users who are interested in OnnxLLM are comparing it to the libraries listed below.

wangzhaode / onnx-llm
LLM deployment project based on ONNX.
☆42 Updated 9 months ago
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM
☆42 Updated last year
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large Onnx Model
☆157 Updated last year
AXERA-TECH / OWLVIT-ONNX-AX650-CPP
☆22 Updated last year
torchpipe / torchpipe
Serving Inside Pytorch
☆163 Updated 2 weeks ago
jinmingyi1998 / opencl_kernels
An easy way to run, test, benchmark and tune OpenCL kernel files
☆23 Updated last year
Tlntin / trt2023
☆26 Updated last year
TRT2022 / ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023, Generative AI Model Optimization Competition: third-place solution in the preliminary round
☆49 Updated last year
ZHEQIUSHUI / CLIP-ONNX-AX650-CPP
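CLIP inference implemented in C++. The model has minor modifications; the changes and the model export code can be found in the README, and the model files, including the AX650 models, are in Releases. ChineseCLIP support has been added.
☆30 Updated last month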
MegEngine / mgeconvert
Converter from MegEngine to other frameworks
☆70 Updated 2 years ago
luchangli03 / onnxsim_large_model
Simplify large ONNX models (>2 GB)
☆61 Updated 8 months ago
wangzhaode / mnn-stable-diffusion
Stable Diffusion using MNN
☆66 Updated last year
bug-developer021 / YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model serving performance
☆52 Updated last year
Oneflow-Inc / oneflow_convert
OneFlow->ONNX
☆43 Updated 2 years ago
inisis / OnnxSlim
A Toolkit to Help Optimize Onnx Model
☆188 Updated this week
caibucai22 / awesome-cuda
Awesome code, projects, books, etc. related to CUDA
☆20 Updated 3 weeks ago
BaofengZan / GOT-OCRv2-onnx
For learning GOT/Qwen/OnnxLLM
☆53 Updated 9 months ago
ozanarmagan / clip_tokenizer_cpp
☆10 Updated last year
BBuf / onnx_learn
☆99 Updated 4 years ago
FeiGeChuanShu / segment-anything-ncnn
An example of segment-anything inference with ncnn
☆123 Updated 2 years ago
AXERA-TECH / CLIP-ONNX-AX650-CPP
☆27 Updated last month
ZHEQIUSHUI / SAM-ONNX-AX650-CPP
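SAM and LaMa inpainting with a Qt GUI. Points and boxes can be drawn interactively to run SAM, with results displayed in real time, followed by inpainting. See the video in the README for the detailed workflow.
☆48 Updated last year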
TRT2022 / trtllm-llama
☢️ TensorRT 2023 second round: inference acceleration and optimization of the Llama model based on TensorRT-LLM
☆50 Updated last year
wangzyon / trt_learn
TensorRT encapsulation, learn, rewrite, practice.
☆28 Updated 2 years ago
daquexian / faster-rwkv
☆124 Updated last year
wangzyon / pyInfer
Async inference for machine learning models
☆26 Updated 2 years ago
yuxiaoranyu / stable_diffusion_trt_triton
☆19 Updated last year
OpenPPL / ppl.pmx
☆59 Updated 8 months ago
Oneflow-Inc / oneflow-lite
☆18 Updated last year
triple-Mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆17 Updated last year