guidance-ai / llgtrtLinks

TensorRT-LLM server with Structured Outputs (JSON) built with Rust

☆59

Alternatives and similar repositories for llgtrt

Users that are interested in llgtrt are comparing it to the libraries listed below

Sorting:

beowolx / rensa
High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas…
☆210Updated 3 weeks ago
dottxt-ai / outlines-core
Faster structured generation
☆254Updated last week
MinishLab / model2vec-rs
Official Rust Implementation of Model2Vec
☆138Updated 3 weeks ago
Oxen-AI / GRPO-With-Cargo-Feedback
This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback
☆107Updated 7 months ago
Dan-wanna-M / kbnf
A high-performance constrained decoding engine based on context free grammar in Rust
☆55Updated 5 months ago
guidance-ai / llguidance
Super-fast Structured Outputs
☆561Updated last week
Vaibhavs10 / fast-llm.rs
☆139Updated last year
fbilhaut / gline-rs
Inference engine for GLiNER models, in Rust
☆74Updated this week
LaurentMazare / mamba.rs
☆134Updated last year
EricLBuehler / candle-vllm
Efficent platform for inference and serving local LLMs including an OpenAI compatible API server.
☆496Updated this week
ShelbyJenkins / llm_client
The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from Probabilistic LLM Vibes
☆236Updated 2 months ago
atoma-network / atoma-infer
Fast serverless LLM inference, in Rust.
☆94Updated 7 months ago
nath1295 / MLX-Textgen
A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX.
☆97Updated 3 months ago
Dan-wanna-M / formatron
Formatron empowers everyone to control the format of language models' output with minimal overhead.
☆226Updated 4 months ago
zanussbaum / surfgrad
webgpu autograd library
☆33Updated 5 months ago
AmineDiro / cria
OpenAI compatible API for serving LLAMA-2 model
☆218Updated 2 years ago
leo-du / llama2.rs
Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust
☆39Updated 2 years ago
chenwanqq / candle-llava
implement llava using candle
☆15Updated last year
oxideai / mlx-rs
Unofficial Rust bindings to Apple's mlx framework
☆202Updated 2 weeks ago
TheProxyCompany / proxy-structuring-engine
Guaranteed Structured Output from any Language Model via Hierarchical State Machines
☆145Updated 2 weeks ago
krypticmouse / DSRs
Performance centered DSPy rewrite to(not port) Rust
☆160Updated this week
mzbac / mlx.voxtral
☆13Updated 2 months ago
inferx-net / inferx
InferX: Inference as a Service Platform
☆136Updated this week
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆139Updated last year
huggingface / xet-core
xet client tech, used in huggingface_hub
☆302Updated this week
spyglass-search / memex
Super-simple, fully Rust powered "memory" (doc store + semantic search) for LLM projects, semantic search, etc.
☆62Updated 2 years ago
xetdata / xet-core
Rust crates for XetHub
☆70Updated last year
Noveum / ai-gateway
Built for demanding AI workflows, this gateway offers low-latency, provider-agnostic access, ensuring your AI applications run smoothly a…
☆78Updated 4 months ago
huggingface / inference-benchmarker
Inference server benchmarking tool
☆118Updated 3 weeks ago
Preemo-Inc / text-generation-inference
☆197Updated last year