maaaxinfinity / ktrun
One-click deployment script for KTransformers
☆57Updated 9 months ago
Alternatives and similar repositories for ktrun
Users who are interested in ktrun are comparing it to the repositories listed below
- run DeepSeek-R1 GGUFs on KTransformers☆261Updated 11 months ago
- vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆370Updated last month
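- A vLLM-based tool (with a graphical interface) for deploying large models in a hybrid VRAM-and-DRAM mode. The hybrid mode is somewhat slower, but it makes very large models deployable on ordinary home computers.☆91Updated 9 months ago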
- ☆173Updated 10 months ago
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.☆1,467Updated this week
- LvLLM is a special NUMA extension of vllm that makes full use of CPU and memory resources, reduces GPU memory requirements, and features …☆146Updated this week
- LM inference server implementation based on *.cpp.☆295Updated 2 months ago
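- LLM concurrency performance testing tool, with support for automated stress testing and performance report generation.☆213Updated 2 months ago
- Uses the pipelines feature of open-webui to call ragflow agents from within open-webui, enabling knowledge-base-driven intelligent conversation with a polished interface.☆158Updated 3 months ago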
- ☆44Updated 9 months ago
- FORK of VLLM for AMD MI25/50/60. A high-throughput and memory-efficient inference and serving engine for LLMs☆65Updated 9 months ago
- Community maintained hardware plugin for vLLM on Ascend☆1,651Updated this week
- torch_musa is an open source repository based on PyTorch, which can make full use of the super computing power of MooreThreads graphics c…☆475Updated this week
- Adds a 🚀 streaming web service to GraphRAG: compatible with the OpenAI SDK, supports accessible entity links 🔗 and suggested questions, works with local embedding models, and fixes numerous issues.☆263Updated 10 months ago
- A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations☆44Updated 9 months ago
- Scripting tool for downloading Dify plugin package from Dify Marketplace and Github and repackaging [true] offline package.☆608Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs☆76Updated last year
- Chinese-language test question bank for large models (community edition)☆95Updated 2 years ago
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆238Updated last month
- Collection of Dify plugin source code☆325Updated 8 months ago
- 🌈 MEGREZ | 🍒 Make Extendable GPU Resource EASY☆124Updated 2 months ago
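- gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, TTS, text-to-image, image editing, and text-to-video models.☆244Updated last week
- An API release built on the funasr version of SenseVoice; integrates seamlessly with oneapi☆92Updated last year
- Merged edition of GraphRAG-Ollama-UI and GraphRAG4OpenWebUI (a gradio web UI for configuring and generating the RAG index, plus a fastapi service providing the RAG API)☆105Updated last year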
- Run generative AI models in sophgo BM1684X/BM1688☆266Updated 3 weeks ago
- A code executor for Dify that is compatible with the official sandbox API calls and dependency installation.☆376Updated 9 months ago
- An IDE built for Agent Swarms, supporting Kimi-2.5, GLM-4.7, and others (it also works with models that, unlike K2.5, have not been trained with reinforcement learning)☆684Updated this week
- ktransformers v0.3 docker build and run☆13Updated 11 months ago
- AI virtual companion, Linux version☆123Updated 2 weeks ago
- Ragflow-Plus is a derivative build of Ragflow that makes it simpler and more practical☆1,219Updated last month