guidance-ai / llguidance
Super-fast Structured Outputs
☆473, updated 3 weeks ago
Alternatives and similar repositories for llguidance
Users interested in llguidance are comparing it to the libraries listed below.
- Faster structured generation (☆252, updated 3 months ago)
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust (☆58, updated 4 months ago)
- ☆413, updated 2 weeks ago
- Formatron empowers everyone to control the format of language models' output with minimal overhead. (☆223, updated 3 months ago)
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… (☆201, updated last month)
- A high-performance constrained decoding engine based on context-free grammar, written in Rust (☆56, updated 3 months ago)
- Efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server (☆453, updated this week)
- Fast, Flexible and Portable Structured Generation (☆1,233, updated this week)
- Comparison of Language Model Inference Engines (☆229, updated 8 months ago)
- multilspy is an LSP client library in Python intended for building applications around language servers (☆434, updated last week)
- Simple UI for debugging correlations of text embeddings (☆291, updated 3 months ago)
- Inference server benchmarking tool (☆98, updated 4 months ago)
- ☆223, updated 2 months ago
- Fast parallel LLM inference for MLX (☆216, updated last year)
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… (☆147, updated 2 months ago)
- ☆467, updated last year
- ☆155, updated 9 months ago
- LLM-based code completion engine (☆193, updated 7 months ago)
- Late Interaction Models Training & Retrieval (☆576, updated this week)
- ☆231, updated 2 months ago
- ☆135, updated 3 weeks ago
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines (☆145, updated 3 months ago)
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. (☆163, updated 4 months ago)
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model (☆242, updated 3 months ago)
- Official inference library for pre-processing of Mistral models (☆790, updated last week)
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. (☆309, updated this week)
- A Lightweight Library for AI Observability (☆251, updated 6 months ago)
- Fast Semantic Text Deduplication & Filtering (☆800, updated last week)
- Tutorial for building an LLM router (☆226, updated last year)
- Manage scalable open LLM inference endpoints in Slurm clusters (☆271, updated last year)
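llguidance, Formatron, and the other constrained-decoding and structured-generation engines listed above share the same core idea: at each decoding step, a grammar or state machine determines which tokens are legal, and the logits of all other tokens are masked out before sampling. The sketch below is a toy illustration of that loop with a hypothetical vocabulary, grammar, and stand-in model; it does not reflect the API of any library listed here.

```python
# Illustrative sketch only: a toy logit-masking loop showing the general idea
# behind grammar-constrained decoding. The vocabulary, grammar, and "model"
# are hypothetical stand-ins, not any listed library's API.
import math
import random

VOCAB = ["{", "}", '"name"', ":", '"Ada"', '"Bob"', "hello", "42"]

# Hypothetical "grammar": a tiny state machine accepting {"name": <value>}
TRANSITIONS = {
    "start": {"{": "key"},
    "key":   {'"name"': "colon"},
    "colon": {":": "value"},
    "value": {'"Ada"': "close", '"Bob"': "close"},
    "close": {"}": "done"},
}

def fake_logits(_prefix):
    """Stand-in for a language-model forward pass: random scores per token."""
    return [random.uniform(-1.0, 1.0) for _ in VOCAB]

def constrained_decode():
    state, out = "start", []
    while state != "done":
        logits = fake_logits(out)
        allowed = TRANSITIONS[state]
        # Mask every token the grammar does not allow in the current state.
        masked = [l if tok in allowed else -math.inf
                  for tok, l in zip(VOCAB, logits)]
        # Greedy pick among the surviving tokens.
        idx = max(range(len(VOCAB)), key=lambda i: masked[i])
        tok = VOCAB[idx]
        out.append(tok)
        state = allowed[tok]
    return "".join(out)

print(constrained_decode())  # e.g. {"name":"Ada"}
```

The libraries differ mainly in how they compute the allowed-token mask efficiently (compiled regexes, context-free grammars, or hierarchical state machines) and how they integrate with inference servers, not in this basic masking loop.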