nvwacloud / tensorlink
Unlock Unlimited Potential! Share Your GPU Power Across Your Local Network!
☆66Updated 5 months ago
Alternatives and similar repositories for tensorlink
Users that are interested in tensorlink are comparing it to the libraries listed below
- LM inference server implementation based on *.cpp.☆286Updated 2 months ago
 - Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others)☆275Updated 2 years ago
 - Self-hosted huggingface mirror service.☆201Updated 3 months ago
 - Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆212Updated 2 months ago
 - Implementation of remote CUDA/OpenCL protocol☆37Updated 5 months ago
 - A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.☆169Updated 3 months ago
 - Open Source Text Embedding Models with OpenAI Compatible API☆160Updated last year
 - run DeepSeek-R1 GGUFs on KTransformers☆254Updated 8 months ago
 - LLM Inference benchmark☆428Updated last year
 - ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!☆263Updated this week
 - A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.☆75Updated last year
 - A high-throughput and memory-efficient inference and serving engine for LLMs☆131Updated last year
 - OpenAI compatible API for TensorRT LLM triton backend☆216Updated last year
 - Easier than K8s for raising and lowering a Docker container's GPU count and scaling volume capacity.☆81Updated last year
 - GPUd automates monitoring, diagnostics, and issue identification for GPUs☆441Updated last week
 - A high-performance inference system for large language models, designed for production environments.☆481Updated 3 weeks ago
 - Using CRDs to manage GPU resources in Kubernetes.☆209Updated 2 years ago
 - ☆17Updated 2 years ago
 - Comparison of Language Model Inference Engines☆233Updated 10 months ago
 - MoonPalace (月宫) is an API debugging tool provided by Moonshot AI (月之暗面).☆216Updated 10 months ago
 - A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆128Updated 3 years ago
 - Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆249Updated last year
 - A diverse, simple, and secure all-in-one LLMOps platform☆109Updated last year
 - ☆431Updated last month
 - OpenAIOS vGPU device plugin for Kubernetes originated from the OpenAIOS project to virtualize GPU device memory, in order to allow app…☆581Updated last year
 - NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes☆144Updated last week
 - C++ implementation of Qwen-LM☆605Updated 10 months ago
 - ☆112Updated last year
 - Pretrain, finetune and serve LLMs on Intel platforms with Ray☆132Updated last month
 - Python actor framework for heterogeneous computing.☆162Updated last week