bentoml / llm-inference-handbook
Everything you need to know about LLM inference
☆259Updated 3 weeks ago
Alternatives and similar repositories for llm-inference-handbook
Users who are interested in llm-inference-handbook are comparing it to the libraries listed below.
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig…☆227Updated last year
- High-Performance Implementation of OpenAI's TikToken.☆467Updated 6 months ago
- Securely run AI-generated code in stateful sandboxes that run forever.☆225Updated 9 months ago
- Persistent memory for LLMs and apps. Content-addressed storage with dedupe, compression, full-text and vector search.☆363Updated last week
- Animating R1's thoughts.☆384Updated 11 months ago
- This is a framework that implements various parallel reasoning strategies from the literature☆275Updated last month
- See Through Your Models☆400Updated 6 months ago
- Applying the ideas of DeepSeek R1 to computer use☆221Updated 11 months ago
- A comprehensive Model Context Protocol (MCP) server implementing the latest specification.☆332Updated 7 months ago
- Build Secure and Compliant AI agents and MCP Servers. YC W23☆157Updated 7 months ago
- A Python toolkit for chain-of-thought prompting 🐍☆184Updated last month
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …☆222Updated last year
- llm plugin for Cerebras fast inference API☆34Updated 6 months ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks☆231Updated 7 months ago
- Pixelagent — Multimodal stateful agents☆225Updated 7 months ago
- LLM plugin for pulling content from Hacker News☆124Updated 8 months ago
- Build data processing and data analysis pipelines that leverage the power of LLMs 🧠☆246Updated 2 weeks ago
- ☆200Updated 8 months ago
- Cowork-like experience in the browser using filesystem api☆60Updated this week
- ☆463Updated 2 months ago
- Your filesystem as a vector database☆502Updated 9 months ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?☆251Updated 3 weeks ago
- Physical AI Assistant that illuminates your life☆191Updated 3 months ago
- Taming LLMs: A Practical Guide to LLM Pitfalls with Open Source Software☆337Updated 11 months ago
- Research projects☆296Updated this week
- AURA (Agent-Usable Resource Assertion) is an open protocol designed to make the web machine-readable. It replaces fragile screen scraping…☆102Updated last week
- Modular, open source LLMOps stack that separates concerns: LiteLLM unifies LLM APIs, manages routing and cost controls, and ensures high-…☆132Updated 11 months ago
- ☆150Updated 6 months ago
- Testing WASM-powered AI agents☆204Updated 4 months ago
- This repo tracks the opened and merged PRs by the top SWE coding agents by OpenAI, GitHub, and others. Updates regularly.☆298Updated this week