vproxy-tools / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆43 Updated 5 months ago
Alternatives and similar repositories for ktransformers
Users who are interested in ktransformers are comparing it to the libraries listed below.
- Efficient inference of large language models. ☆149 Updated last week
- run DeepSeek-R1 GGUFs on KTransformers ☆252 Updated 7 months ago
- Janus-Series: Unified Multimodal Understanding and Generation Models, forked from deepseek-ai/Janus ☆17 Updated 8 months ago
- Static suckless single batch CUDA-only qwen3-0.6B mini inference engine ☆499 Updated last month
- Electronic Parrot / Toy Language Model ☆197 Updated 3 weeks ago
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability. ☆1,293 Updated this week
- CPU inference for the DeepSeek family of large language models in C++ ☆313 Updated last week
- JoySafety ☆292 Updated this week
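- A small language model that supports Chinese-language scenarios: llama2.c-zh ☆150 Updated last year
- ☆18 Updated 6 months ago
- A lightweight, high-performance Chinese word segmentation project ☆200 Updated 2 years ago
- One-click deployment script for KTransformers ☆51 Updated 5 months ago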
- ☆49 Updated last month
- Wanna breeze through some papers? ☆24 Updated last week
- ☆112 Updated last year
- a huggingface mirror site. ☆305 Updated last year
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec… ☆197 Updated 3 weeks ago
- LM inference server implementation based on *.cpp. ☆279 Updated last month
- ncnn android robust video matting ☆16 Updated last week
- Build your own agent CLI from 0 to 1. ☆70 Updated this week
- C++ implementation of Qwen-LM ☆606 Updated 10 months ago
- An AI agent to control drones from your CLI ☆134 Updated 2 months ago
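- Ollama model registry mirror / accelerator that lets Ollama pull and download models from ModelScope (魔搭) faster. ☆103 Updated 5 months ago
- Chinese test question bank for large language models (community edition) ☆89 Updated 2 years ago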
- A minimal, easy-to-read PyTorch reimplementation of the Qwen3 and Qwen2.5 VL with a fancy CLI ☆167 Updated last month
- MoonPalace (月宫) is an API debugging tool provided by Moonshot AI (月之暗面). ☆216 Updated 9 months ago
- ☆105 Updated 2 weeks ago
- 360zhinao ☆290 Updated 4 months ago
- Compile & run a single CUDA file on the cloud GPUs ☆14 Updated last year
- ☆135 Updated 7 months ago