octoml / triton-client-rsLinks

A client library in Rust for Nvidia Triton.

☆30

Alternatives and similar repositories for triton-client-rs

Users that are interested in triton-client-rs are comparing it to the libraries listed below

Sorting:

nbigaouette / onnxruntime-rs
Rust wrapper for Microsoft's ONNX Runtime (version 1.8)
☆301Updated last year
kyutai-labs / kaudio
Rust crate for some audio utilities
☆26Updated 4 months ago
Narsil / bindgen_cuda
☆23Updated 3 months ago
ssoudan / tch-m1
Example of tch-rs on M1
☆54Updated last year
EricLBuehler / candle-vllm
Efficent platform for inference and serving local LLMs including an OpenAI compatible API server.
☆394Updated last week
FL33TW00D / steelix
Your one stop CLI for ONNX model analysis.
☆47Updated 2 years ago
huggingface / hf-hub
Rust client for the huggingface hub aiming for minimal subset of features over `huggingface-hub` python package
☆214Updated last month
wavey-ai / mel-spec
Rust library for whisper.cpp compatible Mel spectrograms
☆72Updated 2 months ago
atoma-network / atoma-infer
Fast serverless LLM inference, in Rust.
☆88Updated 4 months ago
tomsanbear / candle-einops
☆30Updated 7 months ago
EricLBuehler / candle-lora
Low rank adaptation (LoRA) for Candle.
☆152Updated 2 months ago
chenwanqq / candle-llava
implement llava using candle
☆15Updated last year
KGrewal1 / candle-optimisers
A collection of optimisers for use with candle
☆36Updated last month
coreylowman / llama-dfdx
LLaMa 7b with CUDA acceleration implemented in rust. Minimal GPU memory needed!
☆108Updated last year
oddity-ai / async-cuda
Asynchronous CUDA for Rust.
☆34Updated 8 months ago
jafioti / dataflow
Dataflow is a data processing library, primarily for machine learning.
☆23Updated 2 years ago
yaman / fashion-clip-rs
A complete(grpc service and lib) Rust inference with multilingual embedding support. This version leverages the power of Rust for both GR…
☆39Updated 10 months ago
guillaume-be / rust-tokenizers
Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigram (…
☆321Updated last year
EndlessReform / fish-speech.rs
A Fish Speech implementation in Rust, with Candle.rs
☆94Updated last month
eugenehp / gpu-fft
GPU based FFT written in Rust and CubeCL
☆23Updated last month
Dan-wanna-M / kbnf
A high-performance constrained decoding engine based on context free grammar in Rust
☆54Updated last month
modal-labs / tracing-perfetto-sdk
An in-process trace collector using the Rust tracing framework and the Perfetto C++ SDK
☆13Updated 3 months ago
haixuanTao / bert-onnx-rs-server
A Demo server serving Bert through ONNX with GPU written in Rust with <3
☆40Updated 3 years ago
sarah-quinones / gemm
☆89Updated 6 months ago
pixelspark / poly
A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust
☆80Updated last year
Narsil / bloomserver
☆39Updated 2 years ago
triton-inference-server / pytorch_backend
The Triton backend for the PyTorch TorchScript models.
☆156Updated last week
LaurentMazare / tch-ext
Sample Python extension using Rust/PyO3/tch to interact with PyTorch
☆37Updated last year
mokeyish / candle-ext
An extension library to Candle that provides PyTorch functions not currently available in Candle
☆40Updated last year
joeyballentine / ESRGAN-candle-rs
ESRGAN implemented in rust with candle
☆17Updated last year