vproxy-tools / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆44 Updated 8 months ago
Alternatives and similar repositories for ktransformers
Users who are interested in ktransformers are comparing it to the libraries listed below.
- Efficient inference of large language models. ☆150 Updated 3 months ago
- run DeepSeek-R1 GGUFs on KTransformers ☆259 Updated 10 months ago
- Electronic Parrot / Toy Language Model ☆252 Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models forked from deepseek-ai/Janus ☆17 Updated 11 months ago
- MoonPalace (月宫) is an API debugging tool provided by Moonshot AI (月之暗面). ☆220 Updated last year
- A repo for llm on ncnn ☆169 Updated last week
- CPU inference for the DeepSeek family of large language models in C++ ☆317 Updated 3 months ago
- Static suckless single batch CUDA-only qwen3-0.6B mini inference engine ☆538 Updated 4 months ago
- ☆19 Updated 9 months ago
- llama2.c-zh, a small language model that supports Chinese-language scenarios ☆150 Updated last year
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability. ☆1,374 Updated this week
- Files related to an AI technology sharing channel ☆91 Updated this week
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec… ☆214 Updated 2 months ago
- ☆136 Updated 10 months ago
- The inference code of RVC-Boss/GPT-SoVITS that can be developer-friendly. ☆16 Updated last year
- C++ implementation of Qwen-LM ☆614 Updated last year
- This sample shows how to deploy Qwen2 using OpenVINO ☆39 Updated last year
- One-click deployment script for KTransformers ☆55 Updated 8 months ago
- ☆149 Updated last year
- ☆94 Updated 6 months ago
- run chatglm3-6b in BM1684X ☆39 Updated last year
- LvLLM is a special NUMA extension of vllm that makes full use of CPU and memory resources, reduces GPU memory requirements, and features … ☆98 Updated this week
- LM inference server implementation based on *.cpp. ☆295 Updated last month
- a huggingface mirror site. ☆324 Updated last year
- Compile & run a single CUDA file on the cloud GPUs ☆14 Updated last year
- Mission intent compiler and autonomy supervisor for unmanned systems. ☆144 Updated 3 weeks ago
- tensor library ☆17 Updated last year
- ☆114 Updated last year
- TTS ☆49 Updated last year
- Wanna breeze through some papers? ☆65 Updated 2 months ago