maaaxinfinity / ktrun
One-click deployment script for KTransformers
☆57Updated 9 months ago
Alternatives and similar repositories for ktrun
Users that are interested in ktrun are comparing it to the libraries listed below
- run DeepSeek-R1 GGUFs on KTransformers☆260Updated 10 months ago
- vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆365Updated last month
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.☆1,387Updated last week
- ☆173Updated 10 months ago
- A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations☆44Updated 8 months ago
- LM inference server implementation based on *.cpp.☆294Updated 2 months ago
- LLM concurrency performance testing tool with support for automated stress testing and performance report generation.☆211Updated last month
- ☆44Updated 8 months ago
- LvLLM is a special NUMA extension of vllm that makes full use of CPU and memory resources, reduces GPU memory requirements, and features …☆118Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆76Updated last year
- Community maintained hardware plugin for vLLM on Ascend☆1,597Updated this week
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆236Updated 3 weeks ago
- triton for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆40Updated last month
- Uses open-webui's pipelines to call a ragflow agent from within open-webui, enabling knowledge-base-driven intelligent chat with a polished interface.☆158Updated 3 months ago
- Ollama Desktop is a desktop application built on the Ollama engine: a GUI tool for running and managing Ollama models on macOS, Windows, and Linux.☆180Updated 6 months ago
- torch_musa is an open source repository based on PyTorch, which can make full use of the super computing power of MooreThreads graphics c…☆469Updated last week
- AI virtual companion, Linux version☆123Updated last month
- Adds a 🚀 streaming web server to GraphRAG: compatible with the OpenAI SDK, supports accessible entity links 🔗 and suggested questions, works with local embedding models, and fixes numerous issues.☆263Updated 10 months ago
- Low-bit LLM inference on CPU/NPU with lookup table☆915Updated 7 months ago
- A tool for creating pre-training datasets for language models, supporting one-click batch processing for both text and image datasets. …☆43Updated last year
- A pure C++ cross-platform LLM acceleration library, callable from Python; supports the chatglm-6B, llama, baichuan, and moss base models on x86 / ARM☆12Updated last week
- Ragflow-Plus is a derivative build of Ragflow that aims to be simpler and more practical☆1,201Updated last month
- LAYRA—an enterprise-ready, out-of-the-box solution—unlocks next-generation intelligent systems powered by visual RAG and limitless visual…☆897Updated 3 months ago
- Performance-Optimized AI Inference on Your GPUs. Unlock it by selecting and tuning the optimal inference engine for your model.☆4,448Updated this week
- pretrain a wiki llm using transformers☆61Updated last year
- ☆70Updated last week
- ☆341Updated 3 months ago
- gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, TTS, text-to-image, image editing, and text-to-video models.☆244Updated 2 weeks ago
- The main repository for building Pascal-compatible versions of ML applications and libraries.☆163Updated 5 months ago
- Get up and running with Llama 3, Mistral, Gemma 2, and other large language models.☆24Updated 7 months ago