bentoml / llm-inference-handbook
Everything you need to know about LLM inference
☆257Updated last week
Alternatives and similar repositories for llm-inference-handbook
Users who are interested in llm-inference-handbook are comparing it to the libraries listed below.
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig…☆225Updated last year
- Securely run AI-generated code in stateful sandboxes that run forever.☆223Updated 8 months ago
- This is a framework that implements various parallel reasoning strategies from the literature☆274Updated 3 weeks ago
- Persistent memory for LLMs and apps. Content-addressed storage with dedupe, compression, full-text and vector search.☆359Updated last week
- High-Performance Implementation of OpenAI's TikToken.☆466Updated 6 months ago
- Build Secure and Compliant AI agents and MCP Servers. YC W23☆156Updated 7 months ago
- Animating R1's thoughts.☆385Updated 10 months ago
- Physical AI Assistant that illuminates your life☆191Updated 3 months ago
- See Through Your Models☆401Updated 6 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …☆222Updated last year
- A Python toolkit for chain-of-thought prompting 🐍☆182Updated 3 weeks ago
- Applying the ideas of Deepseek R1 to computer use☆220Updated 11 months ago
- ☆199Updated 8 months ago
- Pixelagent — Multimodal stateful agents☆223Updated 7 months ago
- llm plugin for Cerebras fast inference API☆34Updated 5 months ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?☆239Updated last month
- Build data processing and data analysis pipelines that leverage the power of LLMs 🧠☆245Updated last month
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat…☆447Updated last month
- LLM plugin for pulling content from Hacker News☆124Updated 8 months ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks☆228Updated 6 months ago
- Taming LLMs: A Practical Guide to LLM Pitfalls with Open Source Software☆335Updated 11 months ago
- Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit …☆362Updated 7 months ago
- Your filesystem as a vector database☆497Updated 8 months ago
- Action library for AI Agent☆230Updated 9 months ago
- Hierarchical Navigable Small Worlds☆101Updated 5 months ago
- Testing WASM-powered AI agents☆199Updated 3 months ago
- A comprehensive Model Context Protocol (MCP) server implementing the latest specification.☆334Updated 6 months ago
- Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes☆727Updated this week
- Run and explore Llama models locally with minimal dependencies on CPU☆190Updated last year
- Dead Simple LLM Abliteration☆245Updated 10 months ago