vtuber-plan / olah
Self-hosted Hugging Face mirror service.
☆212 · Updated 6 months ago
Alternatives and similar repositories for olah
Users who are interested in olah are comparing it to the libraries listed below.
- A shim driver that lets in-docker nvidia-smi show the correct process list without modifying anything ☆102 · Updated 7 months ago
- Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others) ☆280 · Updated 2 years ago
- xet client tech, used in huggingface_hub ☆403 · Updated this week
- LM inference server implementation based on *.cpp. ☆295 · Updated 2 months ago
- Open Source Text Embedding Models with OpenAI Compatible API ☆167 · Updated last year
- OpenAI compatible API for TensorRT LLM triton backend ☆220 · Updated last year
- 🚢 Yet another operator for running large language models on Kubernetes with ease. Powered by Ollama! 🐫 ☆229 · Updated this week
- ☆280 · Updated last week
- ☆541 · Updated 4 months ago
- Module, Model, and Tensor Serialization/Deserialization ☆287 · Updated this week
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second. ☆240 · Updated last month
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends. ☆192 · Updated last month
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆78 · Updated last year
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments. ☆30 · Updated 10 months ago
- NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes ☆155 · Updated last week
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆475 · Updated this week
- Getting Started with the CoreWeave Kubernetes GPU Cloud ☆79 · Updated 7 months ago
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆290 · Updated 4 months ago
- The main repository for building Pascal-compatible versions of ML applications and libraries. ☆169 · Updated 5 months ago
- Inference server benchmarking tool ☆142 · Updated 4 months ago
- The LLM API Benchmark Tool is a flexible Go-based utility designed to measure and analyze the performance of OpenAI-compatible API endpoi… ☆68 · Updated 3 months ago
- Self-host LLMs with vLLM and BentoML ☆168 · Updated 3 weeks ago
- Comparison of Language Model Inference Engines ☆239 · Updated last year
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆843 · Updated this week
- a huggingface mirror site. ☆326 · Updated last year
- Common recipes to run vLLM ☆368 · Updated this week
- 🪶 Lightweight OpenAI drop-in replacement for Kubernetes ☆147 · Updated 2 years ago
- FRP Fork ☆177 · Updated 10 months ago
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang ☆100 · Updated this week
- Practical GPU Sharing Without Memory Size Constraints ☆304 · Updated 10 months ago