bentoml / llm-inference-in-production
Everything you need to know about LLM inference
☆135Updated this week
Alternatives and similar repositories for llm-inference-in-production
Users who are interested in llm-inference-in-production are comparing it to the libraries listed below
- RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig…☆222Updated 6 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …☆221Updated 6 months ago
- LLM plugin for pulling content from Hacker News☆114Updated 2 months ago
- ai for jq☆243Updated 9 months ago
- Your personal plug and play memory layer for LLMs☆336Updated this week
- High-Performance Implementation of OpenAI's TikToken.☆432Updated last week
- Large Language Model Thin Client agent☆104Updated this week
- CleverBee - The Open Source Deep Researcher Tool☆302Updated last month
- GUI for selecting text files for concatenation and submission to LLMs☆176Updated this week
- ☆278Updated last month
- Retrieval Augmented Generation based on SQLite☆253Updated this week
- Securely run AI-generated code in stateful sandboxes that run forever.☆205Updated 2 months ago
- Applying the ideas of Deepseek R1 to computer use☆214Updated 5 months ago
- A very simple tool to build LLM prompts from your code repositories.☆153Updated 4 months ago
- Merliot Device Hub☆144Updated last month
- Docker-based inference engine for AMD GPUs☆231Updated 9 months ago
- See Through Your Models☆398Updated this week
- Implement recursion using English as the programming language and an LLM as the runtime.☆237Updated 2 years ago
- Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit …☆356Updated last month
- Attempt to create an Open Source Privacy Focused Rewind.ai Alternative for data capture☆215Updated 5 months ago
- fractal-structure inspired, parent-children orbiting, zooming-elements based interactive graph visualization user interface☆131Updated 4 months ago
- Animating R1's thoughts.☆383Updated 4 months ago
- Your AI research assistant☆79Updated 3 months ago
- Fine-grained control over model context protocol (MCP) clients, servers, and tools. Context is God.☆111Updated last month
- Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat…☆439Updated last week
- Browser-LLM Auto-Scaling Technology☆528Updated this week
- ☆196Updated 2 months ago
- Run and explore Llama models locally with minimal dependencies on CPU☆191Updated 9 months ago
- Build Secure and Compliant AI agents and MCP Servers. YC W23☆143Updated last month
- A comprehensive Model Context Protocol (MCP) server implementing the latest specification.☆326Updated 3 weeks ago