guidance-ai / llgtrt
TensorRT-LLM server with Structured Outputs (JSON) built with Rust
☆60 · Updated 5 months ago
Alternatives and similar repositories for llgtrt
Users interested in llgtrt are comparing it to the libraries listed below.
- Faster structured generation ☆252 · Updated 4 months ago
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… (see the MinHash sketch after this list) ☆204 · Updated 2 months ago
- Official Rust implementation of Model2Vec ☆137 · Updated last week
- Code for fine-tuning LLMs with GRPO specifically for Rust programming, using cargo as feedback ☆105 · Updated 6 months ago
- Inference engine for GLiNER models, in Rust ☆70 · Updated 3 months ago
- Super-fast Structured Outputs ☆533 · Updated last week
- A high-performance constrained decoding engine based on context-free grammars in Rust (see the logit-masking sketch after this list) ☆55 · Updated 4 months ago
- Fast serverless LLM inference, in Rust ☆93 · Updated 7 months ago
- Efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server ☆473 · Updated last week
- ☆33 · Updated 10 months ago
- ☆133 · Updated last year
- Implementation of LLaVA using Candle ☆15 · Updated last year
- ☆139 · Updated last year
- WebGPU autograd library ☆32 · Updated 4 months ago
- Formatron empowers everyone to control the format of language models' output with minimal overhead ☆225 · Updated 3 months ago
- Super-simple, fully Rust-powered "memory" (doc store + semantic search) for LLM projects ☆62 · Updated last year
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX ☆95 · Updated 3 months ago
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines ☆146 · Updated last week
- A simple, CUDA- or CPU-powered library for creating vector embeddings using Candle and models from Hugging Face ☆41 · Updated last year
- A tree-based prefix cache library that allows rapid creation of looms: hierarchical branching pathways of LLM generations ☆74 · Updated 7 months ago
- The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from Probabilistic LLM Vibes ☆235 · Updated last month
- Unofficial Rust bindings to Apple's mlx framework ☆192 · Updated last week
- A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust ☆79 · Updated last year
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud, or on AI hardware ☆145 · Updated last year
- A complete (gRPC service and library) Rust inference stack with multilingual embedding support. This version leverages the power of Rust for both GR… ☆39 · Updated last year
- Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from R… (see the chunking sketch after this list) ☆485 · Updated this week
- High-level, optionally asynchronous Rust bindings to llama.cpp ☆229 · Updated last year
- Unofficial Python bindings for the Rust llm library. 🐍❤️🦀 ☆76 · Updated 2 years ago
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies ☆125 · Updated 3 months ago
- A pure-Rust LLM inference engine (any LLM-based MLLM such as Spark-TTS), powered by the Candle framework ☆163 · Updated last week
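For illustration, here is a minimal MinHash sketch in plain Rust (std only) showing the similarity-estimation idea behind the MinHash entry above. This is a hedged sketch that uses fixed seeds in place of true random hash permutations; it is not the listed crate's actual API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash one token under a given seed; each seed simulates one
// independent hash permutation of the token universe.
fn seeded_hash(seed: u64, token: &str) -> u64 {
    let mut h = DefaultHasher::new();
    seed.hash(&mut h);
    token.hash(&mut h);
    h.finish()
}

// The MinHash signature keeps, per seed, the minimum hash value
// seen across all tokens in the set.
fn signature(tokens: &[&str], num_perm: u64) -> Vec<u64> {
    (0..num_perm)
        .map(|seed| {
            tokens
                .iter()
                .map(|t| seeded_hash(seed, t))
                .min()
                .unwrap_or(u64::MAX)
        })
        .collect()
}

// The fraction of matching signature slots is an unbiased estimate
// of the Jaccard similarity of the underlying token sets.
fn estimate_jaccard(a: &[u64], b: &[u64]) -> f64 {
    let matches = a.iter().zip(b).filter(|(x, y)| x == y).count();
    matches as f64 / a.len() as f64
}

fn main() {
    let doc_a: Vec<&str> = "the quick brown fox jumps over the lazy dog".split(' ').collect();
    let doc_b: Vec<&str> = "the quick brown fox leaps over the lazy cat".split(' ').collect();
    let (sa, sb) = (signature(&doc_a, 128), signature(&doc_b, 128));
    println!("estimated Jaccard similarity: {:.2}", estimate_jaccard(&sa, &sb));
}
```

Deduplication at scale builds on the same signatures: near-duplicate documents share most signature slots, so they can be bucketed with locality-sensitive hashing instead of compared pairwise.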
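The constrained-decoding entries above (the context-free-grammar engine, Formatron, and the structured-output projects) all share one core step: mask the model's logits so only grammar-legal tokens can be sampled. A minimal sketch of that masking step, with hypothetical token ids and logits rather than any listed engine's real API:

```rust
// Toy constrained-decoding step: `allowed` stands in for the set of
// token ids the grammar permits next; every other token is masked
// out, then we greedy-pick the highest-scoring surviving token.
fn constrained_argmax(logits: &[f32], allowed: &[usize]) -> Option<usize> {
    allowed
        .iter()
        .copied()
        .filter(|&id| id < logits.len())
        .max_by(|&a, &b| logits[a].total_cmp(&logits[b]))
}

fn main() {
    // Hypothetical vocabulary scores; suppose the grammar only
    // allows tokens 1 and 3 at this position.
    let logits = [2.0_f32, 0.5, 3.0, 1.5];
    let allowed = [1, 3];
    assert_eq!(constrained_argmax(&logits, &allowed), Some(3));
    println!("next token id: {:?}", constrained_argmax(&logits, &allowed));
}
```

Real engines recompute the allowed set after every emitted token by advancing a grammar or state-machine position, which is where the performance work in these projects lives.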
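Finally, a character-budget chunker in the spirit of the text-splitting entry: a hedged sketch that greedily packs whole words into chunks of at most `max_chars` characters, not the listed crate's API (which also supports token-based lengths and semantic boundaries).

```rust
// Greedy character-budget chunker: packs whole words into chunks of
// at most `max_chars` characters, splitting only at whitespace.
// A single word longer than `max_chars` becomes its own oversized
// chunk rather than being split mid-word.
fn chunk_by_chars(text: &str, max_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        // +1 accounts for the joining space when `current` is non-empty.
        let needed = word.chars().count() + if current.is_empty() { 0 } else { 1 };
        if !current.is_empty() && current.chars().count() + needed > max_chars {
            chunks.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let text = "Split text into semantic chunks, up to a desired chunk size.";
    for (i, c) in chunk_by_chars(text, 24).iter().enumerate() {
        println!("chunk {i}: {c}");
    }
}
```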