nvwacloud / tensorlink
Unlock Unlimited Potential! Share Your GPU Power Across Your Local Network!
☆61 · Updated last month
Alternatives and similar repositories for tensorlink
Users interested in tensorlink are comparing it to the libraries listed below.
- LM inference server implementation based on *.cpp. ☆236 · Updated this week
- Self-hosted huggingface mirror service. ☆176 · Updated 2 months ago
- Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others) ☆270 · Updated last year
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second. ☆185 · Updated last week
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends. ☆134 · Updated last week
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆387 · Updated this week
- OpenAI compatible API for TensorRT LLM triton backend ☆209 · Updated 11 months ago
- ☆110 · Updated last year
- LLM Inference benchmark ☆422 · Updated 11 months ago
- Comparison of Language Model Inference Engines ☆219 · Updated 7 months ago
- MoonPalace (月宫) is an API debugging tool provided by Moonshot AI. ☆197 · Updated 6 months ago
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆243 · Updated last year
- Implementation of remote CUDA/OpenCL protocol ☆36 · Updated last month
- llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy… ☆85 · Updated last year
- A simpler alternative to Kubernetes for adjusting the number of GPUs attached to a Docker container and scaling volume capacity. ☆77 · Updated last year
- A shim driver that allows in-Docker nvidia-smi to show the correct process list without modifying anything. ☆88 · Updated 2 weeks ago
- Run DeepSeek-R1 GGUFs on KTransformers ☆242 · Updated 4 months ago
- 🔧 Repair JSON! A solution for JSON anomalies from LLMs. ☆268 · Updated last month
- Library for model distillation ☆146 · Updated 5 months ago
- An MLOps/LLMOps platform ☆230 · Updated 6 months ago
- ☆17 · Updated 2 years ago
- llama2.c-zh, a small language model supporting Chinese-language scenarios ☆147 · Updated last year
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work! ☆224 · Updated last week
- A diverse, simple, and secure all-in-one LLMOps platform ☆107 · Updated 9 months ago
- A Kubernetes plugin that dynamically adds or removes GPU resources for a running Pod ☆126 · Updated 3 years ago
- ☆428 · Updated last week
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec… ☆160 · Updated last week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆129 · Updated last week
- A high-performance inference system for large language models, designed for production environments. ☆455 · Updated this week
- LLM inference in C/C++ ☆94 · Updated this week