nvwacloud / tensorlink
Unlock Unlimited Potential! Share Your GPU Power Across Your Local Network!
☆64 · Updated 4 months ago
Alternatives and similar repositories for tensorlink
Users interested in tensorlink are comparing it to the libraries listed below.
- Self-hosted Hugging Face mirror service. ☆195 · Updated 2 months ago
- LM inference server implementation based on *.cpp. ☆279 · Updated last month
- Implementation of a remote CUDA/OpenCL protocol. ☆37 · Updated 4 months ago
- Autoscale LLM inference (vLLM, SGLang, LMDeploy) on Kubernetes (and others). ☆275 · Updated last year
- Review/check GGUF files and estimate their memory usage and maximum tokens per second. ☆208 · Updated last month
- A simpler alternative to Kubernetes for scaling the number of GPUs attached to a Docker container and resizing volume capacity. ☆80 · Updated last year
- Open-source text embedding models with an OpenAI-compatible API. ☆160 · Updated last year
- ☆112 · Updated last year
- llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deploy… ☆86 · Updated last year
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆72 · Updated last year
- Using CRDs to manage GPU resources in Kubernetes. ☆209 · Updated 2 years ago
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends. ☆165 · Updated 2 months ago
- GPUd automates monitoring, diagnostics, and issue identification for GPUs. ☆438 · Updated this week
- An MLOps/LLMOps platform. ☆234 · Updated 9 months ago
- ☆17 · Updated 2 years ago
- OpenAI-compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM, and many others). ☆275 · Updated 2 years ago
- The OpenAIOS vGPU device plugin for Kubernetes originated from the OpenAIOS project to virtualize GPU device memory, in order to allow app… ☆578 · Updated last year
- MoonPalace (月宫) is an API debugging tool provided by Moonshot AI. ☆216 · Updated 9 months ago
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod. ☆127 · Updated 3 years ago
- A shim driver that allows in-Docker nvidia-smi to show the correct process list without modifying anything. ☆94 · Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs. ☆131 · Updated last year
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆248 · Updated last year
- LLM inference benchmark. ☆426 · Updated last year
- Pretrain, finetune, and serve LLMs on Intel platforms with Ray. ☆132 · Updated 3 weeks ago
- C++ implementation of Qwen-LM. ☆606 · Updated 10 months ago
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work! ☆260 · Updated last week
- Run DeepSeek-R1 GGUFs on KTransformers. ☆252 · Updated 7 months ago
- Triton Inference Server Web UI. ☆15 · Updated last year
- A user gateway that provides a serverless AIGC experience. ☆44 · Updated last year
- Device plugins for Volcano, e.g., GPU. ☆129 · Updated 6 months ago