wangzhaode / onnx-llm

LLM deployment project based on ONNX.

☆45

Alternatives and similar repositories for onnx-llm

Users who are interested in onnx-llm are comparing it to the libraries listed below.

inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36Updated 9 months ago
lovemefan / ggml-learning-notes
ggml learning notes; ggml is an inference framework for machine learning
☆18Updated last year
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing Tongyi Qianwen Qwen-7B with TensorRT-LLM
☆43Updated 2 years ago
daquexian / faster-rwkv
☆124Updated last year
ZHEQIUSHUI / CLIP-ONNX-AX650-CPP
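CLIP inference implemented in C++. The model has a few minor modifications; the changes and the model export code can be found in the README. All model files are in Releases, including the AX650 models. Newly added support for ChineseCLIP.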
☆30Updated 4 months ago
EdVince / llm-cpp
☆33Updated last year
yvonwin / qwen2.cpp
qwen2 and llama3 cpp implementation
☆47Updated last year
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large Onnx Model
☆161Updated last year
ZHEQIUSHUI / SAM-ONNX-AX650-CPP
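SAM and LaMa inpainting, with a Qt GUI. Supports interactive point and box prompts for SAM with results displayed in real time, followed by inpainting; see the video in the README for details.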
☆50Updated last year
torchpipe / torchpipe
Serving Inside Pytorch
☆163Updated 3 weeks ago
wangzhaode / mnn-stable-diffusion
stable diffusion using mnn
☆67Updated 2 years ago
ozanarmagan / clip_tokenizer_cpp
☆10Updated last year
guoguo1314 / llama3_learn.c
Inference deployment of the llama3
☆11Updated last year
luchangli03 / onnxsim_large_model
simplify >2GB large onnx model
☆63Updated 10 months ago
MollySophia / rwkv-qualcomm
Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK
☆85Updated last week
lrw04 / llama2.c-to-ncnn
A converter for llama2.c legacy models to ncnn models.
☆80Updated last year
AXERA-TECH / OWLVIT-ONNX-AX650-CPP
☆23Updated last year
jinmingyi1998 / opencl_kernels
An easy way to run, test, benchmark and tune OpenCL kernel files
☆24Updated 2 years ago
AXERA-TECH / CLIP-ONNX-AX650-CPP
☆27Updated 3 months ago
wangzyon / trt_learn
TensorRT encapsulation, learn, rewrite, practice.
☆29Updated 3 years ago
inisis / OnnxSlim
A Toolkit to Help Optimize Onnx Model
☆224Updated last week
FeiGeChuanShu / segment-anything-ncnn
an example of segment-anything infer by ncnn
☆124Updated 2 years ago
caibucai22 / awesome-cuda
Awesome code, projects, books, etc. related to CUDA
☆25Updated 2 months ago
HuPengsheet / EasyNN
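EasyNN is a neural network inference framework developed for teaching, aiming to let anyone write an inference framework on their own, even with no prior background!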
☆33Updated last year
TRT2022 / trtllm-llama
☢️ TensorRT 2023 second round: Llama model inference acceleration and optimization based on TensorRT-LLM
☆50Updated 2 years ago
caiwanxianhust / FasterLLaMA
A llama model inference framework implemented in CUDA C++
☆62Updated 11 months ago
FeiGeChuanShu / ncnn_ppstructure
ppstructure deploy by ncnn
☆33Updated last year
AXERA-TECH / ax-llm
Explore LLM model deployment based on AXera's AI chips
☆117Updated last week
triple-Mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆18Updated last year
TRT2022 / ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the generative AI model optimization contest
☆50Updated 2 years ago