bentoml / llm-inference-handbook
Everything you need to know about LLM inference
☆249 · Updated last week
Alternatives and similar repositories for llm-inference-handbook
Users who are interested in llm-inference-handbook are comparing it to the libraries listed below
- Content addressable storage with excellent search ☆357 · Updated this week
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆225 · Updated 11 months ago
- High-Performance Implementation of OpenAI's TikToken. ☆464 · Updated 5 months ago
- Parallel thinking for LLMs. Confidence‑gated, strategy‑driven, offline‑friendly ☆274 · Updated 3 months ago
- Securely run AI-generated code in stateful sandboxes that run forever. ☆224 · Updated 8 months ago
- A Python toolkit for chain-of-thought prompting 🐍 ☆180 · Updated this week
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆222 · Updated last year
- Pixelagent: Multimodal stateful agents ☆223 · Updated 6 months ago
- Awesome Code Sandboxing for AI ☆230 · Updated 3 weeks ago
- Build Secure and Compliant AI agents and MCP Servers. YC W23 ☆153 · Updated 6 months ago
- LLM plugin for pulling content from Hacker News ☆122 · Updated 7 months ago
- Git Based Memory Storage for Conversational AI Agent ☆757 · Updated 3 weeks ago
- See Through Your Models ☆401 · Updated 5 months ago
- Applying the ideas of Deepseek R1 to computer use ☆217 · Updated 10 months ago
- Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes ☆697 · Updated this week
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆228 · Updated 3 weeks ago
- ☆150 · Updated 5 months ago
- Modular, open source LLMOps stack that separates concerns: LiteLLM unifies LLM APIs, manages routing and cost controls, and ensures high-… ☆128 · Updated 10 months ago
- AURA (Agent-Usable Resource Assertion) is an open protocol designed to make the web machine-readable. It replaces fragile screen scraping… ☆102 · Updated this week
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ☆446 · Updated 3 weeks ago
- ☆460 · Updated 3 weeks ago
- ☆199 · Updated 7 months ago
- Animating R1's thoughts. ☆385 · Updated 10 months ago
- Fine-grained control over model context protocol (MCP) clients, servers, and tools. Context is God. ☆113 · Updated 6 months ago
- EnrichMCP is a python framework for building data driven MCP servers ☆631 · Updated this week
- A comprehensive Model Context Protocol (MCP) server implementing the latest specification. ☆335 · Updated 5 months ago
- Action library for AI Agent ☆230 · Updated 8 months ago
- Dead Simple LLM Abliteration ☆243 · Updated 10 months ago
- Open source local sandboxing for running AI generated code. ☆258 · Updated this week
- Physical AI Assistant that illuminates your life