kibae / onnxruntime-server

ONNX Runtime Server: a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.

☆166

Alternatives and similar repositories for onnxruntime-server

Users who are interested in onnxruntime-server compare it to the repositories listed below.

kibae / pg_onnx
pg_onnx: ONNX Runtime integrated with PostgreSQL. Perform ML inference with data in your database.
☆49 Updated 2 months ago
rtzr / tritony
Tiny configuration for Triton Inference Server
☆45 Updated 7 months ago
RahulSChand / llama2.c-for-dummies
Step by step explanation/tutorial of llama2.c
☆223 Updated last year
Curt-Park / mnist-fastapi-celery-triton
Simple example of FastAPI + Celery + Triton for benchmarking
☆64 Updated 3 years ago
sionic-ai / BGE-M3-Model-Converter
A tool for manual conversion of BGE-M3 models with preserved trainable variables and direct control over model outputs.
☆42 Updated 6 months ago
Curt-Park / mnist-fastapi-aio-triton
Simple example of FastAPI + gRPC AsyncIO + Triton
☆67 Updated 2 years ago
jwj7140 / Gugugo
Gugugo: Korean open-source translation model project
☆81 Updated last year
sionic-ai / Llama4-Token-Editor
☆62 Updated 3 weeks ago
dnotitia / smoothie-qwen
A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.
☆78 Updated last month
mirusu400 / CLOVA-X
Unofficial API for CLOVA X
☆37 Updated last year
Marker-Inc-Korea / RAGchain
Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on...
☆281 Updated last year
nnstreamer / nnstreamer-example
Example applications of nnstreamer. Note that we have to enable the 'apptest' CI module in the near future.
☆81 Updated 8 months ago
deep-diver / hf-daily-paper-newsletter
Newsletter bot for 🤗 Daily Papers
☆126 Updated this week
triton-inference-server / tensorrt_backend
The Triton backend for TensorRT.
☆77 Updated this week
wangkuiyi / huggingface-tokenizer-in-cxx
☆68 Updated 2 years ago
sionic-ai / serverless-rag-mcp-server
☆36 Updated 5 months ago
annotation-ai / python-project-template
Python Project Template
☆68 Updated 3 years ago
42dot / 42dot_LLM
42dot LLM consists of a pre-trained language model, 42dot LLM-PLM, and a fine-tuned model, 42dot LLM-SFT, which is trained to respond to …
☆130 Updated last year
sionic-ai / webgpu-llm-loader
A loader that lets you try running LLMs built for WebGPU.
☆29 Updated last year
qwopqwop200 / ko-arena-hard-auto
Ko-Arena-Hard-Auto: An automatic LLM benchmark for Korean
☆23 Updated 3 months ago
smarteasy / open-prompt
☆19 Updated 3 months ago
vessl-ai / hyperpocket
Building AI agent with hyperpocket tool in a flash
☆50 Updated 3 months ago
yvonwin / qwen2.cpp
qwen2 and llama3 cpp implementation
☆46 Updated last year
SqueezeBits / Torch-TRTLLM
Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.
☆46 Updated 3 weeks ago
Marker-Inc-Korea / Korean-SAT-LLM-Leaderboard
Korean SAT leader board
☆167 Updated 5 months ago
triton-inference-server / onnxruntime_backend
The Triton backend for the ONNX Runtime.
☆156 Updated this week
ML-TANGO / TANGO
public repo for TANGO (Target Aware No-code neural network Generation and Operation framework)
☆108 Updated 2 months ago
LG-AI-EXAONE / EXAONE-3.5
Official repository for EXAONE 3.5 built by LG AI Research
☆200 Updated 7 months ago
inisis / OnnxSlim
A Toolkit to Help Optimize Onnx Model
☆189 Updated this week
inureyes / all-smi
Command-line utility for monitoring GPU hardware.
☆60 Updated this week