lemonade-sdk / lemonade
Local LLM Server with GPU and NPU Acceleration
☆138 · Updated this week
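Since lemonade exposes local models behind an OpenAI-compatible HTTP API (as several of the servers listed below also do), any standard OpenAI client can be pointed at it. The following is a minimal sketch only; the base URL, port, and model name are illustrative assumptions, not confirmed defaults, so check the server's own docs before using it.

```python
# Minimal sketch: query a local OpenAI-compatible server such as lemonade.
# The base URL, port, and model name are assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/api/v1",  # assumed local endpoint
    api_key="none",                           # local servers usually ignore the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize what an NPU is in one sentence."}],
)
print(response.choices[0].message.content)
```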
Alternatives and similar repositories for lemonade
Users interested in lemonade are comparing it to the libraries listed below.
- Lightweight Inference server for OpenVINO ☆187 · Updated last week
- No-code CLI designed for accelerating ONNX workflows ☆198 · Updated 2 weeks ago
- InferX is an Inference Function as a Service Platform ☆111 · Updated last week
- llama.cpp fork with additional SOTA quants and improved performance ☆608 · Updated this week
- Minimal Linux OS with a Model Context Protocol (MCP) gateway to expose local capabilities to LLMs. ☆247 · Updated this week
- ☆78 · Updated this week
- Sparse inferencing for transformer-based LLMs ☆183 · Updated this week
- 1.58-bit LLM on Apple Silicon using MLX ☆214 · Updated last year
- FastMLX is a high-performance, production-ready API to host MLX models. ☆308 · Updated 3 months ago
- Wraps any OpenAI API interface as Responses with MCP support so it supports Codex, adding any missing stateful features. Ollama and Vllm… ☆58 · Updated last week
- LLM inference in C/C++ ☆77 · Updated this week
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆86 · Updated this week
- Fully Open Language Models with Stellar Performance ☆231 · Updated 2 weeks ago
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines ☆136 · Updated 3 weeks ago
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆48 · Updated last month
- Turns devices into a scalable LLM platform ☆144 · Updated last week
- ☆234 · Updated this week
- Train Large Language Models on MLX. ☆94 · Updated this week
- ☆204 · Updated last month
- ☆95 · Updated 6 months ago
- Download models from the Ollama library, without Ollama ☆86 · Updated 7 months ago
- Run LLM Agents on Ryzen AI PCs in Minutes ☆421 · Updated last week
- ☆28 · Updated 3 months ago
- MockLLM, when you want it to do what you tell it to do! ☆54 · Updated last week
- An innovative library for efficient LLM inference via low-bit quantization ☆349 · Updated 9 months ago
- A cross-platform desktop application for chatting with locally hosted LLMs, with features like MCP support ☆221 · Updated last week
- CPU inference for the DeepSeek family of large language models in C++ ☆302 · Updated 3 weeks ago
- Automatically quantize GGUF models ☆184 · Updated last week
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆408 · Updated last week
- LLM inference in C/C++ ☆21 · Updated 3 months ago