kibae / onnxruntime-server
ONNX Runtime Server: a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.
☆166 Updated 3 weeks ago
Alternatives and similar repositories for onnxruntime-server
Users interested in onnxruntime-server are comparing it to the libraries listed below.
- pg_onnx: ONNX Runtime integrated with PostgreSQL. Perform ML inference with data in your database. ☆49 Updated 3 months ago
- Step by step explanation/tutorial of llama2.c ☆223 Updated last year
- Tiny configuration for Triton Inference Server ☆45 Updated 7 months ago
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation. ☆80 Updated last month
- Simple example of FastAPI + gRPC AsyncIO + Triton ☆67 Updated 3 years ago
- llama3.cuda is a pure C/CUDA implementation for Llama 3 model. ☆343 Updated 4 months ago
- Simple example of FastAPI + Celery + Triton for benchmarking ☆64 Updated 3 years ago
- Gugugo: Korean open-source translation model project ☆81 Updated last year
- A tool for manual conversion of BGE-M3 models with preserved trainable variables and direct control over model outputs. ☆42 Updated 7 months ago
- ☆68 Updated 2 years ago
- Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines. ☆49 Updated last month
- Korean SAT leaderboard ☆166 Updated last week
- Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on... ☆282 Updated last year
- Official repository for EXAONE 3.5 built by LG AI Research ☆200 Updated 8 months ago
- The Triton backend for TensorRT. ☆77 Updated last month
- Unofficial API for CLOVA X ☆37 Updated last year
- Command-line utility for monitoring GPU hardware. ☆82 Updated last week
- ☆62 Updated last month
- A toolkit to help optimize ONNX models ☆205 Updated 2 weeks ago
- Ko-Arena-Hard-Auto: An automatic LLM benchmark for Korean ☆23 Updated 4 months ago
- The Triton backend for the ONNX Runtime. ☆159 Updated last week
- ☆293 Updated last month
- Official repository for EXAONE Deep built by LG AI Research ☆401 Updated 3 months ago
- qwen2 and llama3 cpp implementation ☆47 Updated last year
- Scalable distributed log storage for strong consistency, total order, and high availability ☆51 Updated last week
- CLI tool to convert Terraform plan output into HTML visualizations ☆20 Updated this week
- ☆19 Updated 4 months ago
- ☆36 Updated 5 months ago
- Common source, scripts and utilities shared across all Triton repositories. ☆76 Updated last month
- Newsletter bot for 🤗 Daily Papers ☆127 Updated this week