duydvu / triton-inference-server-web-ui
Triton Inference Server Web UI
☆10Updated last year
Related projects
Alternatives and complementary repositories for triton-inference-server-web-ui
- ☆52Updated last year
- OpenAI-compatible API for the TensorRT-LLM Triton backend☆177Updated 3 months ago
- A quantization algorithm for LLMs☆101Updated 5 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆137Updated 2 months ago
- ☆100Updated 7 months ago
- Common source, scripts and utilities shared across all Triton repositories.☆62Updated this week
- The Triton backend for the ONNX Runtime.☆133Updated this week
- Whisper inference with TensorRT-LLM☆21Updated last year
- End-to-end voice wake-up toolkit, from model training to model inference.☆77Updated 2 months ago
- Universal cross-platform tokenizers binding to HF and sentencepiece☆274Updated last week
- ☆192Updated this week
- Export LLaMA to ONNX☆97Updated 5 months ago
- Common source, scripts and utilities for creating Triton backends.☆295Updated this week
- ASR client for Triton ASR Service☆19Updated last month
- SenseVoice-python: An enterprise-grade open-source multilingual ASR system from FunASR, run with onnxruntime☆74Updated last month
- onnxruntime pre-compiled libs☆81Updated this week
- ☆123Updated 11 months ago
- ☆32Updated 9 months ago
- Simplify large (>2GB) ONNX models☆44Updated 8 months ago
- ONNX Adapter for model-explorer☆25Updated last month
- ☆24Updated this week
- flow mirror models from JZX AI Labs☆40Updated last month
- Transformer-related optimization, including BERT, GPT☆17Updated last year
- Triton Inference Server Model Config and Client Scripts☆31Updated 2 years ago
- LLaMA/RWKV ONNX models, quantization, and test cases☆353Updated last year
- An enterprise-grade Voice Activity Detector from ModelScope and FunASR.☆66Updated last year
- Run ChatGLM2-6B on BM1684X☆48Updated 8 months ago
- ☆23Updated last year
- ☆158Updated last month