duydvu / triton-inference-server-web-ui
Triton Inference Server Web UI
☆12Updated last year
Alternatives and similar repositories for triton-inference-server-web-ui:
Users who are interested in triton-inference-server-web-ui are comparing it to the libraries listed below
- Whisper inference with TensorRT-LLM☆21Updated last year
- OpenAI compatible API for TensorRT LLM triton backend☆186Updated 5 months ago
- ☆216Updated this week
- The Triton backend for the ONNX Runtime.☆136Updated this week
- Triton backend for https://github.com/OpenNMT/CTranslate2☆34Updated last year
- ☆168Updated 3 months ago
- Common source, scripts and utilities shared across all Triton repositories.☆65Updated this week
- A quantization algorithm for LLMs☆108Updated 6 months ago
- Unofficial golang package for the Triton Inference Server (https://github.com/triton-inference-server/server)☆45Updated this week
- The Triton backend for TensorRT.☆68Updated this week
- ☆124Updated last year
- Running the F5-TTS by ONNX Runtime☆80Updated this week
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆71Updated last year
- Materials for learning SGLang☆168Updated last week
- A general 2-8 bit quantization toolbox with GPTQ/AWQ/HQQ that exports easily to onnx/onnx-runtime.☆153Updated 3 months ago
- LLM deployment project based on ONNX.☆30Updated 3 months ago
- Common source, scripts and utilities for creating Triton backends.☆305Updated this week
- An end-to-end voice wake-up toolbox, from model training to model inference.☆93Updated 4 months ago
- ☆54Updated last year
- ASR client for Triton ASR Service☆23Updated last month
- Universal cross-platform tokenizers binding to HF and sentencepiece☆298Updated 2 months ago
- Transformer-related optimization, including BERT and GPT☆17Updated last year
- OpenVINO backend for Triton.☆30Updated this week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆211Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆257Updated 3 months ago
- Comparison of Language Model Inference Engines☆202Updated last month
- ☆105Updated 9 months ago
- A cross platform implementation of Text-to-Speech based on ONNXRuntime.☆31Updated last year
- ONNX and TensorRT implementation of Whisper☆61Updated last year
- ☆41Updated 2 months ago