nvwacloud / tensorlink
Unlock Unlimited Potential! Share Your GPU Power Across Your Local Network!
☆60Updated last month
Alternatives and similar repositories for tensorlink
Users that are interested in tensorlink are comparing it to the libraries listed below
- Implementation of remote CUDA/OpenCL protocol☆36Updated last month
- LM inference server implementation based on *.cpp.☆226Updated this week
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆177Updated last week
- Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others)☆269Updated last year
- pure go for rwkv☆19Updated last year
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆125Updated 3 years ago
- OpenAI compatible API for TensorRT LLM triton backend☆209Updated 10 months ago
- xllamacpp - a Python wrapper of llama.cpp☆44Updated last week
- A simple, High-Performance, Scalable ML/DL Models Repository based on OCI Artifacts☆33Updated last year
- Using CRDs to manage GPU resources in Kubernetes.☆201Updated 2 years ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.☆67Updated last year
- A small language model that supports Chinese-language scenarios: llama2.c-zh☆147Updated last year
- LLM Inference benchmark☆421Updated 11 months ago
- Python actor framework for heterogeneous computing.☆152Updated this week
- A shim driver that lets nvidia-smi inside Docker show the correct process list without modifying anything☆87Updated 3 months ago
- GPUd automates monitoring, diagnostics, and issue identification for GPUs☆377Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆131Updated last year
- Golang SDK for langgenius/dify.☆29Updated last year
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.☆129Updated 2 weeks ago
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…☆49Updated 11 months ago
- Device plugin for Volcano vGPU that supports hard resource isolation☆91Updated last week
- run DeepSeek-R1 GGUFs on KTransformers☆236Updated 3 months ago
- LLM inference in C/C++☆21Updated 3 months ago
- ☆53Updated 6 months ago
- Implementation of the RWKV language model in pure WebGPU/Rust.☆310Updated last week
- C++ implementation of Qwen-LM☆595Updated 6 months ago
- Go framework for DL model inference and API deployment☆49Updated 6 months ago
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!☆209Updated this week
- A diverse, simple, and secure all-in-one LLMOps platform☆105Updated 9 months ago
- Open Source Text Embedding Models with OpenAI Compatible API☆154Updated 11 months ago