vproxy-tools / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆44 · Updated 8 months ago
Alternatives and similar repositories for ktransformers
Users who are interested in ktransformers compare it to the libraries listed below.
- run DeepSeek-R1 GGUFs on KTransformers ☆260 · Updated 10 months ago
- Efficient inference of large language models. ☆149 · Updated 4 months ago
- Janus-Series: Unified Multimodal Understanding and Generation Models (forked from deepseek-ai/Janus) ☆17 · Updated last year
- ☆19 · Updated 10 months ago
- Static suckless single batch CUDA-only qwen3-0.6B mini inference engine ☆542 · Updated 4 months ago
- Electronic Parrot / Toy Language Model ☆258 · Updated last week
- Small language model supporting Chinese-language scenarios: llama2.c-zh ☆150 · Updated last year
- CPU inference for the DeepSeek family of large language models in C++ ☆317 · Updated 3 months ago
- Ollama model registry mirror / accelerator that lets Ollama pull and download models from ModelScope faster. ☆111 · Updated 9 months ago
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability. ☆1,387 · Updated last week
- One-click deployment script for KTransformers ☆57 · Updated 9 months ago
- LM inference server implementation based on *.cpp. ☆294 · Updated 2 months ago
- MoonPalace is an API debugging tool provided by Moonshot AI. ☆219 · Updated last year
- An open, high-performance Optical Character Recognition (OCR) toolkit ☆306 · Updated 6 months ago
- ☆16 · Updated last year
- C++ implementation of Qwen-LM ☆616 · Updated last year
- ☆149 · Updated last year
- Developer-friendly inference code for RVC-Boss/GPT-SoVITS. ☆16 · Updated last year
- A Hugging Face mirror site. ☆326 · Updated last year
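- Chinese-language test question bank for large models (community edition) ☆94 · Updated 2 years ago
- Lightweight, high-performance Chinese word segmentation project ☆199 · Updated 2 years ago
- This sample shows how to deploy Qwen2 using OpenVINO ☆39 · Updated last year
- 360zhinao ☆291 · Updated 8 months ago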
- ☆114 · Updated last year
- LvLLM is a special NUMA extension of vllm that makes full use of CPU and memory resources, reduces GPU memory requirements, and features … ☆118 · Updated this week
- Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory ☆29 · Updated last year
- CodeShell model in C/C++ ☆106 · Updated last year
- ☆135 · Updated 11 months ago
- Repository of Chinese post-trained Phi3 models ☆324 · Updated last year
- All-in-one clash server and control panel ☆16 · Updated 8 months ago