vproxy-tools / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆43 Updated 3 months ago
Alternatives and similar repositories for ktransformers
Users who are interested in ktransformers are comparing it to the libraries listed below.
- Run DeepSeek-R1 GGUFs on KTransformers ☆249 Updated 5 months ago
- Efficient inference of large language models. ☆150 Updated last month
- Electronic Parrot / Toy Language Model ☆193 Updated last week
- Janus-Series: Unified Multimodal Understanding and Generation Models forked from deepseek-ai/Janus ☆17 Updated 6 months ago
- ☆17 Updated 4 months ago
- This sample shows how to deploy Qwen2 using OpenVINO ☆39 Updated 10 months ago
- CPU inference for the DeepSeek family of large language models in C++ ☆308 Updated 2 months ago
- A small language model supporting Chinese-language scenarios: llama2.c-zh ☆149 Updated last year
- ☆47 Updated this week
- A lightweight, high-performance Chinese word segmentation project ☆200 Updated 2 years ago
- An AI agent to control drones from your CLI ☆123 Updated this week
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec… ☆171 Updated last week
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability. ☆1,232 Updated this week
- MoonPalace (月宫) is an API debugging tool provided by Moonshot AI (月之暗面). ☆203 Updated 7 months ago
- Run chatglm3-6b on BM1684X ☆40 Updated last year
- ☆135 Updated 5 months ago
- Chinese test question bank for large models (community edition) ☆87 Updated 2 years ago
- The world's best prompts (prompts with a total valuation of over 30 billion): overseas user x1xh obtained the complete official system prompts and internal tools of v0, Manus, Cursor, Same.dev, and Lovable. ☆156 Updated 3 months ago
- AI legal team ☆41 Updated 7 months ago
- C++ implementation of Qwen-LM ☆607 Updated 8 months ago
- One-click deployment script for KTransformers ☆49 Updated 3 months ago
- ☆149 Updated last year
- LM inference server implementation based on *.cpp. ☆250 Updated this week
- Ollama model registry mirror / accelerator that lets Ollama pull / download models faster from ModelScope (魔搭). ☆98 Updated 3 months ago
- ☆91 Updated last month
- FastAPI PaddleSpeech audio recording to text ☆51 Updated last year
- TTS ☆49 Updated last year
- Open-source e-book "Machine Learning Engineering" (《机器学习工程》); contributions to improve the Chinese edition are welcome ☆72 Updated last year
- ☆104 Updated last week
- Compile & run a single CUDA file on the cloud GPUs ☆14 Updated 11 months ago