bentoml / llm-inference-handbook
Everything you need to know about LLM inference
☆243 Updated this week
Alternatives and similar repositories for llm-inference-handbook
Users who are interested in llm-inference-handbook are comparing it to the repositories listed below
- Content addressable storage with excellent search ☆355 Updated last week
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆225 Updated 10 months ago
- High-Performance Implementation of OpenAI's TikToken. ☆460 Updated 4 months ago
- Securely run AI-generated code in stateful sandboxes that run forever. ☆224 Updated 7 months ago
- Parallel thinking for LLMs. Confidence‑gated, strategy‑driven, offline‑friendly ☆258 Updated 2 months ago
- LLM plugin for pulling content from Hacker News ☆121 Updated 6 months ago
- Applying the ideas of DeepSeek R1 to computer use ☆217 Updated 9 months ago
- Build Secure and Compliant AI agents and MCP Servers. YC W23 ☆152 Updated 5 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆222 Updated 11 months ago
- Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes ☆645 Updated this week
- LLM plugin for Cerebras fast inference API ☆32 Updated 3 months ago
- See Through Your Models ☆401 Updated 4 months ago
- Physical AI Assistant that illuminates your life ☆188 Updated last month
- ☆198 Updated 6 months ago
- Git-Based Memory Storage for Conversational AI Agents ☆681 Updated 2 months ago
- Pixelagent: Multimodal stateful agents ☆221 Updated 5 months ago
- Your filesystem as a vector database ☆490 Updated 6 months ago
- A Python toolkit for chain-of-thought prompting 🐍 ☆177 Updated 2 months ago
- Animating R1's thoughts. ☆386 Updated 9 months ago
- ☆150 Updated 4 months ago
- A comprehensive Model Context Protocol (MCP) server implementing the latest specification. ☆335 Updated 5 months ago
- VSCode extension that demonstrates the use of large language models (LLMs) for active debugging of programs ☆356 Updated 9 months ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆213 Updated last week
- Fully neural approach for text chunking ☆393 Updated last month
- ☆455 Updated 3 weeks ago
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ☆444 Updated 3 months ago
- Live-bending a foundation model’s output at neural network level. ☆270 Updated 7 months ago
- Action library for AI agents ☆228 Updated 7 months ago
- Modular, open source LLMOps stack that separates concerns: LiteLLM unifies LLM APIs, manages routing and cost controls, and ensures high-… ☆125 Updated 9 months ago
- Run and explore Llama models locally with minimal dependencies on CPU ☆190 Updated last year