guidance-ai / llgtrt
TensorRT-LLM server with Structured Outputs (JSON) built with Rust
☆35 · Updated this week
Alternatives and similar repositories for llgtrt:
Users who are interested in llgtrt are comparing it to the libraries listed below.
- Super-fast Structured Outputs ☆114 · Updated this week
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… ☆78 · Updated this week
- Faster structured generation ☆177 · Updated this week
- A high-performance constrained decoding engine based on context-free grammar in Rust ☆46 · Updated last month
- A simple, CUDA- or CPU-powered library for creating vector embeddings using Candle and models from Hugging Face ☆32 · Updated 9 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆72 · Updated 2 months ago
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆37 · Updated last year
- Tensor library for Zig ☆11 · Updated 3 months ago
- Structured outputs for LLMs ☆37 · Updated 7 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆53 · Updated last year
- Super-simple, fully Rust-powered "memory" (doc store + semantic search) for LLM projects, semantic search, etc. ☆57 · Updated last year
- Rust implementation of Hugging Face transformers pipelines using onnxruntime backend with bindings to C# and C. ☆36 · Updated last year
- Implementing the BitNet model in Rust ☆30 · Updated 10 months ago
- Rust implementation of Surya ☆57 · Updated 3 weeks ago
- Modular Rust transformer/LLM library using Candle ☆36 · Updated 9 months ago
- Your one-stop CLI for ONNX model analysis. ☆47 · Updated 2 years ago
- A SQLite extension for generating text embeddings from remote APIs (OpenAI, Nomic, Ollama, llamafile...) ☆108 · Updated 3 months ago
- llm_utils: Basic LLM tools, best practices, and minimal abstraction. ☆41 · Updated this week
- Plug-and-play GBNF compiler for llama.cpp ☆24 · Updated last year
- Unofficial Rust bindings to Apple's mlx framework ☆126 · Updated this week
- An implementation of Self-Extend, to expand the context window via grouped attention ☆118 · Updated last year
- Editor with LLM generation tree exploration ☆62 · Updated last week
- Distributed inference for MLX LLMs ☆82 · Updated 6 months ago
- A relatively basic implementation of RWKV in Rust written by someone with very little math and ML knowledge. Supports 32, 8 and 4 bit eva… ☆93 · Updated last year