kvcache-ai / ktransformers
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
☆16,458Updated this week
Alternatives and similar repositories for ktransformers
Users interested in ktransformers are comparing it to the libraries listed below.
- FlashMLA: Efficient Multi-head Latent Attention Kernels☆12,456Updated this week
- DeepEP: an efficient expert-parallel communication library☆8,967Updated this week
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆6,162Updated this week
- Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-p…☆9,029Updated this week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs.☆7,576Updated this week
- SGLang is a high-performance serving framework for large language models and multimodal models.☆23,439Updated this week
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation☆7,964Updated 8 months ago
- Performance-optimized AI inference on your GPUs. Unlock superior throughput by selecting and tuning engines like vLLM or SGLang.☆4,489Updated this week
- A simple screen parsing tool towards pure vision based GUI agent☆24,344Updated 4 months ago
- Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.☆7,563Updated 2 months ago
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.☆26,530Updated last month
- 🚀🚀 Train a 26M-parameter GPT from scratch in just 2h!☆38,695Updated last week
- Toolkit for linearizing PDFs for LLM datasets/training☆16,860Updated this week
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.☆4,701Updated this week
- ☆4,612Updated last week
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (…☆12,594Updated this week
- Integrate the DeepSeek API into popular software☆35,349Updated 4 months ago
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.☆53,776Updated last week
- A modular graph-based Retrieval-Augmented Generation (RAG) system☆30,705Updated last week
- RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to creat…☆72,999Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.☆13,234Updated this week
- Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, a…☆4,640Updated this week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.☆51,625Updated this week
- No fortress, purely open ground. OpenManus is Coming.☆54,333Updated last month
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,073Updated 11 months ago
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model☆4,995Updated last year
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.☆2,919Updated 3 weeks ago
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.☆9,792Updated 4 months ago
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation☆19,034Updated this week
- ⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with …☆3,538Updated this week