kvcache-ai / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆14,423 Updated last week
Alternatives and similar repositories for ktransformers
Users who are interested in ktransformers are comparing it to the libraries listed below.
- FlashMLA: Efficient MLA decoding kernels ☆11,623 Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆50,358 Updated this week
- SGLang is a fast serving framework for large language models and vision language models. ☆15,276 Updated this week
- Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud. ☆22,102 Updated last week
- DeepEP: an efficient expert-parallel communication library ☆8,194 Updated this week
- 🚀🚀 Train a 26M-parameter GPT from scratch in just 2h! ☆22,230 Updated last month
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM. ☆40,815 Updated this week
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆52,785 Updated this week
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc. ☆9,608 Updated this week
- Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you ne… ☆8,074 Updated this week
- RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. ☆56,603 Updated this week
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation ☆7,835 Updated last month
- Toolkit for linearizing PDFs for LLM datasets/training ☆13,006 Updated this week
- 🍒 Cherry Studio is a desktop client that supports multiple LLM providers. ☆28,775 Updated this week
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity… ☆11,130 Updated last week
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆11,094 Updated last month
- Fully open reproduction of DeepSeek-R1 ☆24,859 Updated this week
- Integrate the DeepSeek API into popular software ☆32,932 Updated last month
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-… ☆8,168 Updated this week
- A simple screen parsing tool towards pure vision based GUI agent ☆22,487 Updated 2 months ago
- GLM-4 series: Open Multilingual Multimodal Chat LMs ☆6,630 Updated last week
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone ☆19,688 Updated this week
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation ☆17,151 Updated this week
- No fortress, purely open ground. OpenManus is Coming. ☆47,108 Updated last week
- The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. ☆18,538 Updated last week
- A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON. ☆35,508 Updated this week
- A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language. ☆14,765 Updated this week
- Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python. ☆6,312 Updated this week
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆14,672 Updated last week
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆5,468 Updated this week