Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across vLLM, TRT-LLM, TokenSpeed, SGLang, OpenAI, Gemini & more. Industry-first gRPC pipeline, KV cache-aware routing, chat history, tokenization caching, Responses API, embeddings, WASM plugins, MCP, and multi-tenant auth.
☆328 · Jun 12, 2026 · Updated last week
Alternatives and similar repositories for smg
Users that are interested in smg are comparing it to the libraries listed below.
- Incubating P/D sidecar for llm-d ☆17 · Nov 13, 2025 · Updated 7 months ago
- A PyTorch native library for training speculative decoding models ☆159 · Updated this week
- This CLI tool and Python3 module collects the current system state for documentation ☆26 · Apr 9, 2026 · Updated 2 months ago
- A storage plugin that provides CRI-O/Podman with the ability to lazily mount nydus images. ☆43 · May 12, 2025 · Updated last year
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆303 · May 14, 2026 · Updated last month
- Kubernetes CSI Driver for serving OCI model artifacts ☆27 · May 25, 2026 · Updated 3 weeks ago
- A Distributed Engine for AI ☆49 · Jun 4, 2026 · Updated 2 weeks ago
- A userspace filesystem backed by Apache OpenDAL. ☆38 · Jun 2, 2026 · Updated 2 weeks ago
- A high-performance and light-weight router for vLLM large scale deployment ☆268 · May 6, 2026 · Updated last month
- See vLLM official support: https://github.com/vllm-project/vllm-ascend ☆11 · Feb 5, 2025 · Updated last year
- Upstream development of the newest features for HV. ☆10 · Sep 12, 2021 · Updated 4 years ago
- Python script that generates HAProxy config for connection to standby stolon replicas ☆19 · Mar 9, 2025 · Updated last year
- ☆26 · Jun 8, 2026 · Updated last week
- Injector trait as a webhook to inject data into Workload. ☆15 · Apr 14, 2021 · Updated 5 years ago
- HeraclesQL is a Python DSL for writing alerts! ☆47 · May 8, 2026 · Updated last month
- ☆13 · Dec 20, 2016 · Updated 9 years ago
- Collection of docker images, helm charts and other tools needed to build DataLake on Kubernetes. ☆13 · Oct 19, 2018 · Updated 7 years ago
- KV cache store for distributed LLM inference ☆421 · Nov 13, 2025 · Updated 7 months ago
- alibabacloud-aiacc-demo ☆43 · May 4, 2023 · Updated 3 years ago
- Expert Specialization MoE Solution based on CUTLASS ☆27 · Apr 14, 2026 · Updated 2 months ago
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T… ☆468 · Updated this week
- An OS kernel module for fast **remote** fork using advanced datacenter networking (RDMA). ☆72 · Feb 15, 2025 · Updated last year
- GPU analyzer for Kubernetes GPU clusters ☆16 · Apr 11, 2020 · Updated 6 years ago
- Userscripts and userstyles with quality of life improvements for Bitbucket, Jira, and Confluence ☆27 · Jun 10, 2026 · Updated last week
- A Triton JIT runtime and ffi provider in C++ ☆35 · May 27, 2026 · Updated 3 weeks ago
- Resource Exporter for volcano scheduling, e.g. NUMA-Aware scheduling. ☆19 · May 30, 2025 · Updated last year
- ☆14 · Dec 20, 2024 · Updated last year
- ☆15 · Mar 16, 2018 · Updated 8 years ago
- a simple traceroute tool for iOS ☆14 · Oct 25, 2017 · Updated 8 years ago
- From Minimal GEMM to Everything ☆220 · Jun 8, 2026 · Updated last week
- SGLang is a fast serving framework for large language models and vision language models. ☆26 · Updated this week
- A list of companies that allow remote work (work from home) ☆16 · Mar 2, 2022 · Updated 4 years ago
- ☆34 · Mar 12, 2026 · Updated 3 months ago
- FlashSampling: Fast and Memory-Efficient Exact Sampling (https://huggingface.co/papers/2603.15854) ☆74 · May 13, 2026 · Updated last month
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Mar 21, 2024 · Updated 2 years ago
- Workflow Defined Engine ☆25 · Nov 4, 2025 · Updated 7 months ago
- Adapted iPerf3 iOS sample ☆12 · Mar 15, 2017 · Updated 9 years ago
- Kubernetes Initializer that injects the Istio sidecar into pods. ☆24 · Jul 13, 2017 · Updated 8 years ago
- Document Automation Reference Kit ☆16 · Jun 27, 2024 · Updated last year