bentoml / llm-inference-handbook
Everything you need to know about LLM inference
☆237 Updated last week
Alternatives and similar repositories for llm-inference-handbook
Users who are interested in llm-inference-handbook are comparing it to the repositories listed below
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig… ☆225 Updated 9 months ago
- Securely run AI-generated code in stateful sandboxes that run forever. ☆218 Updated 5 months ago
- Content addressable storage with excellent search ☆348 Updated last week
- High-Performance Implementation of OpenAI's TikToken. ☆455 Updated 3 months ago
- LLM plugin for pulling content from Hacker News ☆119 Updated 5 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆219 Updated 9 months ago
- Git Based Memory Storage for Conversational AI Agent ☆654 Updated last month
- ☆196 Updated 5 months ago
- Parallel thinking for LLMs. Confidence‑gated, strategy‑driven, offline‑friendly ☆254 Updated 3 weeks ago
- Build Secure and Compliant AI agents and MCP Servers. YC W23 ☆152 Updated 4 months ago
- See Through Your Models ☆399 Updated 3 months ago
- Applying the ideas of DeepSeek R1 to computer use ☆216 Updated 8 months ago
- Physical AI Assistant that illuminates your life ☆173 Updated last week
- Animating R1's thoughts. ☆384 Updated 7 months ago
- Your filesystem as a vector database ☆471 Updated 5 months ago
- Pixelagent — Multimodal stateful agents ☆218 Updated 4 months ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks ☆207 Updated 3 months ago
- A Python toolkit for chain-of-thought prompting 🐍 ☆175 Updated last month
- llm plugin for Cerebras fast inference API ☆30 Updated 2 months ago
- Fully neural approach for text chunking ☆374 Updated 5 months ago
- A comprehensive Model Context Protocol (MCP) server implementing the latest specification. ☆334 Updated 3 months ago
- Live-bending a foundation model’s output at neural network level. ☆265 Updated 6 months ago
- ☆440 Updated last month
- Detect whether or not an audio file was generated by NotebookLM ☆140 Updated 10 months ago
- Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes ☆442 Updated this week
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat… ☆442 Updated 2 months ago
- ☆361 Updated last month
- Testing WASM-powered AI agents ☆183 Updated 3 weeks ago
- ☆150 Updated 3 months ago
- Implement recursion using English as the programming language and an LLM as the runtime. ☆236 Updated 2 years ago