kibae / onnxruntime-server
ONNX Runtime Server: a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.
☆162 Updated last month
Alternatives and similar repositories for onnxruntime-server
Users who are interested in onnxruntime-server compare it to the libraries listed below.
- pg_onnx: ONNX Runtime integrated with PostgreSQL. Perform ML inference with data in your database. ☆49 Updated last month
- Tiny configuration for Triton Inference Server ☆45 Updated 6 months ago
- Step by step explanation/tutorial of llama2.c ☆222 Updated last year
- A tool for manual conversion of BGE-M3 models with preserved trainable variables and direct control over model outputs. ☆42 Updated 5 months ago
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation. ☆75 Updated this week
- Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines. ☆44 Updated this week
- Extension of Langchain for RAG. Easy benchmarking, multiple retrievals, reranker, time-aware RAG, and so on... ☆281 Updated last year
- Gugugo: a Korean open-source translation model project ☆81 Updated last year
- Simple example of FastAPI + Celery + Triton for benchmarking ☆64 Updated 2 years ago
- Simple example of FastAPI + gRPC AsyncIO + Triton ☆65 Updated 2 years ago
- llama3.cuda is a pure C/CUDA implementation for Llama 3 model. ☆335 Updated 2 months ago
- Korean SAT leader board ☆167 Updated 4 months ago
- Weak Labeling (NER) using ChatGPT ☆38 Updated 2 years ago
- Building AI agent with hyperpocket tool in a flash ☆49 Updated 3 months ago
- ☆62 Updated 2 months ago
- Unofficial API for CLOVA X ☆37 Updated last year
- This is a Korean OCR Python code using the Pororo library ☆78 Updated 2 years ago
- Ailoy is a developer-friendly library that simplifies building and deploying AI agents and LLM-based applications. ☆95 Updated this week
- Newsletter bot for 🤗 Daily Papers ☆125 Updated this week
- 42dot LLM consists of a pre-trained language model, 42dot LLM-PLM, and a fine-tuned model, 42dot LLM-SFT, which is trained to respond to … ☆130 Updated last year
- A Toolkit to Help Optimize Onnx Model ☆174 Updated last week
- ☆48 Updated last year
- Build complex LLM Applications with Python Dictionary ☆40 Updated 9 months ago
- Ko-Arena-Hard-Auto: An automatic LLM benchmark for Korean ☆23 Updated 2 months ago
- ☆19 Updated 2 months ago
- 1-Click is all you need. ☆62 Updated last year
- ☆271 Updated last month
- Python Project Template ☆68 Updated 2 years ago
- ☆36 Updated 4 months ago
- Official repository for EXAONE 3.5 built by LG AI Research ☆194 Updated 6 months ago