Deep-Spark / DeepSparkInference

DeepSparkInference has selected 48 inference model examples, covering fields such as computer vision, natural language processing, and speech recognition. Subsequent phases will gradually expand to more AI fields.

☆21

Alternatives and similar repositories for DeepSparkInference

Users who are interested in DeepSparkInference are comparing it to the libraries listed below.

Deep-Spark / DeepSpark
The DeepSpark open platform selects hundreds of open source application algorithms and models that are deeply coupled with industrial app…
☆44 Updated last month
dingyuqing05 / trt2022_wenet
☆72 Updated 2 years ago
inisis / OnnxLLM
Large Language Model Onnx Inference Framework
☆36 Updated 6 months ago
bug-developer021 / YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model service performance
☆52 Updated last year
luchangli03 / onnxsim_large_model
Simplify large (>2GB) ONNX models
☆61 Updated 8 months ago
zzk0 / triton
Triton Inference Server Model Config and Client Scripts
☆32 Updated 3 years ago
TRT2022 / ControlNet_TensorRT
Tianchi NVIDIA TensorRT Hackathon 2023: third-place solution in the preliminary round of the generative AI model optimization track
☆49 Updated last year
BBuf / onnx_learn
☆99 Updated 4 years ago
tsingmicro-toolchain / OnnxSlim
A Toolkit to Help Optimize Large Onnx Model
☆157 Updated last year
FeiGeChuanShu / trt2023
NVIDIA TensorRT Hackathon 2023 final-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM
☆42 Updated last year
Tlntin / trt2023
☆26 Updated last year
torchpipe / torchpipe
Serving Inside Pytorch
☆163 Updated last week
Deep-Spark / DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of AI and general-purpose computing, to suppo…
☆65 Updated last month
Rayrtfr / FasterTransformer
Transformer related optimization, including BERT, GPT
☆17 Updated 2 years ago
wangzyon / pyInfer
async inference for machine learning model
☆26 Updated 2 years ago
wangzhaode / mnn-stable-diffusion
stable diffusion using mnn
☆66 Updated last year
TRT2022 / trtllm-llama
☢️ TensorRT 2023 final round: Llama model inference acceleration and optimization based on TensorRT-LLM
☆50 Updated last year
hpc203 / yolov7-detect-face-onnxrun-cpp-py
Deploys YOLOV7 face and keypoint detection with ONNXRuntime, with both C++ and Python versions of the program
☆51 Updated 2 years ago
col-in-coding / Tensorrt-CV
Using TensorRT for Inference Model Deployment.
☆49 Updated last year
yvonwin / qwen2.cpp
qwen2 and llama3 cpp implementation
☆45 Updated last year
Tlntin / ChatGLM2-6B-TensorRT
☆90 Updated 2 years ago
TRT2022 / MST-plus-plus-TensorRT
TensorRT 2022 final-round solution: TensorRT inference optimization of MST++, the first Transformer-based image reconstruction model
☆140 Updated 3 years ago
Oldpan / DeployIsAllYouNeed
☆121 Updated 2 years ago
ZHEQIUSHUI / CLIP-ONNX-AX650-CPP
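CLIP inference implemented in C++. The model has a few small changes; the code for the changes and for exporting the model can be found in the README, and the model files, including the AX650 models, are all in Releases. Newly added support for ChineseCLIP
☆30 Updated last month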
cvdong / YOLO_TRT_SIM
Efficient deployment: TRT inference for YOLO X, V3, V4, V5, V6, V7, V8, and EdgeYOLO ™️, with pre- and post-processing implemented as CUDA kernels. CPP/CUDA🚀
☆49 Updated 2 years ago
MAhaitao999 / Yolov3_Dynamic_Batch_TensorRT_Triton
Converts the Yolov3 model into a TensorRT model that supports dynamic-batch TensorRT inference and deployment on Triton Inference Server
☆28 Updated 4 years ago
leoluopy / autotvm_tutorial
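A demonstration of autoTVM search for optimized neural network inference code: compiles the open-source model centerface with TVM, uses autoTVM to search for the best inference code, and finally deploys the result compiled as C++ code. The demo platform is CUDA, but other platforms also work, such as Raspberry Pi, Android phones, and iPhones. This is a demonstration of …
☆27 Updated 4 years ago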
DataXujing / TensorRT-LLM-ChatGLM3
Large model deployment in practice: TensorRT-LLM, Triton Inference Server, vLLM
☆26 Updated last year
sophgo / ChatGLM2-TPU
Run ChatGLM2-6B on BM1684X
☆49 Updated last year
luchangli03 / export_llama_to_onnx
export llama to onnx
☆131 Updated 7 months ago