bentoml / llm-inference-handbook
Everything you need to know about LLM inference
☆237Updated this week
Alternatives and similar repositories for llm-inference-handbook
Users who are interested in llm-inference-handbook are comparing it to the repositories listed below.
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig…☆225Updated 10 months ago
- Content addressable storage with excellent search☆352Updated last week
- High-Performance Implementation of OpenAI's TikToken.☆457Updated 3 months ago
- Parallel thinking for LLMs. Confidence‑gated, strategy‑driven, offline‑friendly☆257Updated last month
- Securely run AI-generated code in stateful sandboxes that run forever.☆221Updated 6 months ago
- Build Secure and Compliant AI agents and MCP Servers. YC W23☆152Updated 4 months ago
- LLM plugin for pulling content from Hacker News☆120Updated 5 months ago
- Animating R1's thoughts.☆385Updated 8 months ago
- Git Based Memory Storage for Conversational AI Agent☆664Updated last month
- Your filesystem as a vector database☆484Updated 6 months ago
- A comprehensive Model Context Protocol (MCP) server implementing the latest specification.☆334Updated 4 months ago
- See Through Your Models☆400Updated 3 months ago
- Pixelagent — Multimodal stateful agents☆220Updated 4 months ago
- A Python toolkit for chain-of-thought prompting 🐍☆175Updated 2 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …☆220Updated 10 months ago
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat…☆443Updated 2 months ago
- A Model Context Protocol (MCP) server that provides tools for interacting with JMAP (JSON Meta Application Protocol) email servers. Built…☆150Updated 2 months ago
- Multimodal RAG to search and interact locally with technical documents of any kind☆273Updated last week
- VSCode extension that demonstrates the use of large language models (LLMs) for active debugging of programs☆353Updated 8 months ago
- Build data processing and data analysis pipelines that leverage the power of LLMs 🧠☆226Updated last week
- Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes☆560Updated last week
- Spegel - Reflect the web through AI☆316Updated 3 months ago
- Applying the ideas of Deepseek R1 to computer use☆216Updated 8 months ago
- llm plugin for Cerebras fast inference API☆31Updated 3 months ago
- A GTK graphical interface for chatting with large language models (LLMs)☆81Updated last month
- Implement recursion using English as the programming language and an LLM as the runtime.☆236Updated 2 years ago
- Run and explore Llama models locally with minimal dependencies on CPU☆189Updated last year