ONNX Runtime Server: a server providing TCP and HTTP/HTTPS REST APIs for ONNX inference.
☆181 · Mar 6, 2026 · Updated 2 weeks ago
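As a rough illustration of what calling such an HTTP inference API can look like, here is a minimal Python sketch using only the standard library. The endpoint path (`/api/sessions/<session>`), the session name, and the JSON payload shape are assumptions made for illustration, not the server's documented API — consult the project's README for the real routes.

```python
import json
from urllib import request

def build_inference_request(host: str, session: str, inputs: dict) -> request.Request:
    """Build a JSON POST request for a hypothetical /api/sessions/<session> route.

    The route and payload shape are illustrative assumptions, not the
    documented onnxruntime-server API.
    """
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return request.Request(
        url=f"http://{host}/api/sessions/{session}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Construct (but do not send) a request for a hypothetical MNIST session.
req = build_inference_request("localhost:8080", "mnist/1", {"x": [[0.0] * 784]})
print(req.full_url)      # http://localhost:8080/api/sessions/mnist/1
print(req.get_method())  # POST
```

Sending the request would then be a single `urllib.request.urlopen(req)` call; the sketch stops at request construction so it runs without a live server.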
Alternatives and similar repositories for onnxruntime-server
Users interested in onnxruntime-server are comparing it to the libraries listed below.
- An ASR toolkit with the freedom of topology ☆10 · Dec 18, 2023 · Updated 2 years ago
- ONNX Serving is a C++ project that serves onnx-mlir-compiled models over gRPC and other protocols. Benefiting from C++ implement… ☆26 · Sep 17, 2025 · Updated 6 months ago
- Inference deployment of Llama 3 ☆10 · Apr 21, 2024 · Updated last year
- Decoders from Kaldi using OpenFst ☆34 · Jan 29, 2026 · Updated last month
- RKNN-YOLOV5-BatchInference-MultiThreading: multi-threaded C++ batch inference for YOLOv5 over multiple images ☆21 · Nov 6, 2023 · Updated 2 years ago
- Simple, high-speed inference for YOLOv11 with ONNX Runtime ☆17 · Nov 4, 2024 · Updated last year
- ☆32 · Jul 23, 2024 · Updated last year
- Large language model ONNX inference framework ☆34 · Nov 25, 2025 · Updated 3 months ago
- ☕️ A VS Code extension for Netron; supports *.pdmodel, *.nb, *.onnx, *.pb, *.h5, *.tflite, *.pth, *.pt, *.mnn, *.param, etc. ☆14 · Jun 4, 2023 · Updated 2 years ago
- LLM deployment project based on ONNX ☆49 · Oct 9, 2024 · Updated last year
- onnxruntime-extensions: a specialized pre- and post-processing library for ONNX Runtime ☆450 · Mar 13, 2026 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆17 · Jun 3, 2024 · Updated last year
- ☆28 · Oct 7, 2025 · Updated 5 months ago
- Automatically generate a world map showing where contributions to your repository are coming from ☆12 · Apr 11, 2024 · Updated last year
- Colab notebooks for Next-gen Kaldi ☆30 · Oct 12, 2025 · Updated 5 months ago
- Deploy Informative-Drawings with ONNX Runtime to generate sketch drawings; includes both C++ and Python versions ☆14 · Sep 7, 2023 · Updated 2 years ago
- TensorRT depth-anything for anyone and anywhere ☆14 · Jan 29, 2024 · Updated 2 years ago
- For studying GOT/Qwen/OnnxLLm ☆53 · Oct 8, 2024 · Updated last year
- Model compression for ONNX ☆99 · Mar 1, 2026 · Updated 2 weeks ago
- Provides an ensemble model to deploy a YOLOv8 ONNX model to Triton ☆41 · Oct 19, 2023 · Updated 2 years ago
- Automatic speech recognition at the University of Edinburgh ☆16 · Mar 14, 2021 · Updated 5 years ago
- ONNX Runtime tiny wrapper for openFrameworks ☆15 · Jan 21, 2022 · Updated 4 years ago
- RKNN inference framework deployment for Rockchip chips (YOLO models) ☆14 · Jul 17, 2025 · Updated 8 months ago
- Uses the excellent Silero VAD with the onnxruntime C API for fast detection of audio segments containing speech ☆16 · Sep 20, 2024 · Updated last year
- LLM API performance comparison: in-depth analysis of key metrics such as TTFT and TPS ☆19 · Sep 12, 2024 · Updated last year
- Stable Diffusion using MNN ☆66 · Sep 28, 2023 · Updated 2 years ago
- Dart plugin wrapping the Sherpa-ONNX runtime; contains an example of speech recognition with Flutter ☆22 · Jan 3, 2025 · Updated last year
- Serving inside PyTorch ☆170 · Feb 3, 2026 · Updated last month
- ONNX-compatible DocShadow: high-resolution document shadow removal; supports TensorRT 🚀 ☆25 · Sep 13, 2023 · Updated 2 years ago
- YOLOv10 for a bare Raspberry Pi 4 or 5 ☆22 · Jun 21, 2024 · Updated last year
- onnxruntime pre-compiled libs ☆173 · Mar 3, 2026 · Updated 2 weeks ago
- Explore LLM model deployment based on AXera's AI chips ☆143 · Updated this week
- Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context ☆216 · Sep 10, 2024 · Updated last year
- ☆13 · Feb 9, 2022 · Updated 4 years ago
- The Triton backend for TensorRT ☆86 · Mar 10, 2026 · Updated last week
- CLIP inference implemented in C++. The model has minor modifications; the changes and the model-export code can be found in the README, and the model files are all in Releases, including models for the AX650. Newly added support for ChineseCLIP ☆31 · Jun 19, 2025 · Updated 9 months ago
- ☆12 · Jan 23, 2020 · Updated 6 years ago
- Flutter SQLite complete CRUD operation tutorial ☆11 · Sep 22, 2023 · Updated 2 years ago
- Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch); includes a converter from PyTorch -> O… ☆33 · Aug 18, 2021 · Updated 4 years ago