nvwacloud / tensorlink
Unlock Unlimited Potential! Share Your GPU Power Across Your Local Network!
☆72 · Updated 8 months ago
Alternatives and similar repositories for tensorlink
Users interested in tensorlink are comparing it to the libraries listed below:
- LM inference server implementation based on *.cpp. ☆295 · Updated 2 months ago
- Implementation of a remote CUDA/OpenCL protocol. ☆38 · Updated 8 months ago
- Autoscale LLM (vLLM, SGLang, LMDeploy) inference on Kubernetes (and others). ☆279 · Updated 2 years ago
- Self-hosted huggingface mirror service. ☆212 · Updated 6 months ago
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends. ☆192 · Updated last month
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second. ☆238 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs. ☆132 · Updated last year
- OpenAI-compatible API for the TensorRT-LLM Triton backend. ☆220 · Updated last year
- Open Source Text Embedding Models with an OpenAI-compatible API. ☆167 · Updated last year
- Run DeepSeek-R1 GGUFs on KTransformers. ☆261 · Updated 11 months ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆78 · Updated last year
- LLM inference benchmark. ☆433 · Updated last year
- ☆437 · Updated 4 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆274 · Updated 6 months ago
- Integrated user interface for use with the HAI Platform. ☆54 · Updated 2 years ago
- Comparison of Language Model Inference Engines. ☆239 · Updated last year
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆251 · Updated last year
- NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes. ☆155 · Updated this week
- A huggingface mirror site. ☆326 · Updated last year
- llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deploy… ☆91 · Updated last year
- ☆68 · Updated last year
- A shim driver that allows in-docker nvidia-smi to show the correct process list without modifying anything. ☆102 · Updated 7 months ago
- C++ implementation of Qwen-LM. ☆616 · Updated last year
- OpenAIOS vGPU device plugin for Kubernetes, originating from the OpenAIOS project to virtualize GPU device memory in order to allow app… ☆584 · Updated last year
- A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute. ☆737 · Updated 2 years ago
- Using CRDs to manage GPU resources in Kubernetes. ☆210 · Updated 3 years ago
- cricket is a virtualization solution for GPUs. ☆234 · Updated 5 months ago
- xllamacpp - a Python wrapper of llama.cpp. ☆73 · Updated this week
- Efficient AI Inference & Serving. ☆479 · Updated 2 years ago
- ☆114 · Updated last year
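
Several of the servers listed above advertise OpenAI-compatible APIs (the vLLM-based engines, the TensorRT-LLM Triton backend wrapper, the text embedding and TTS/STT services). As a minimal sketch of what that compatibility typically means, the snippet below sends a chat-completion request to such a server; the base URL, model name, and API key are placeholders assumed for illustration, not values taken from any of the listed projects.

```python
# Minimal sketch: chat-completion request against an OpenAI-compatible
# inference server. URL, model id, and key are placeholder assumptions.
import requests

BASE_URL = "http://localhost:8000/v1"            # assumed local deployment
API_KEY = "not-needed-for-most-local-servers"    # many local servers ignore it

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "my-local-model",               # placeholder model id
        "messages": [
            {"role": "user", "content": "Summarize what a GGUF file is."}
        ],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Because the request/response shape follows the OpenAI schema, the same client code can usually be pointed at any of these servers by changing only the base URL and model name.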