ubergarm / r1-ktransformers-guide
Run DeepSeek-R1 GGUFs on KTransformers
☆212Updated 3 weeks ago
Alternatives and similar repositories for r1-ktransformers-guide:
Users who are interested in r1-ktransformers-guide are comparing it to the libraries listed below:
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.☆1,053Updated this week
- LM inference server implementation based on *.cpp.☆154Updated this week
- Community maintained hardware plugin for vLLM on Ascend☆393Updated this week
- ☆310Updated 3 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆240Updated 3 weeks ago
- gpt_server is an open-source framework for production-grade deployment of LLMs or embedding models.☆161Updated this week
- Mixture-of-Experts (MoE) Language Model☆185Updated 6 months ago
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking☆676Updated this week
- GLM Series Edge Models☆131Updated last month
- CPU inference for the DeepSeek family of large language models in pure C++☆282Updated last month
- ☆227Updated 3 months ago
- Build & Optimize your RAG.☆588Updated last week
- A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations☆31Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆36Updated 2 months ago
- ☆218Updated last month
- ☆124Updated last month
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆135Updated 2 weeks ago
- Convert files into Markdown to help RAG pipelines or LLMs understand them; based on markitdown and MinerU, which provide high-quality PDF parsing.☆74Updated this week
- LLM Inference benchmark☆405Updated 8 months ago
- A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬☆57Updated 2 weeks ago
- Add a 🚀 streaming web server to GraphRAG, compatible with the OpenAI SDK, with support for accessible entity links 🔗, suggested questions, and local embedding models, plus numerous bug fixes.☆242Updated 2 months ago
- ☆314Updated 9 months ago
- InfiniRetri, a tool that enhances the ability of Transformer-based LLMs (Large Language Models) to handle long contexts.☆79Updated this week
- Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for connecting to langchain to load a local knowledge base for retrieval-augmented generation (RAG).☆540Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆132Updated 3 months ago
- ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents☆412Updated last week
- Manage GPU clusters for running AI models☆2,279Updated this week
- Interpretation, extension, and reproduction of the DeepSeek series of work.☆614Updated this week
- ☆556Updated last week
- Research on accelerating GOT-OCR for production deployment, not limited to any language.☆59Updated 5 months ago