kvcache-ai / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆13,611Updated this week
Alternatives and similar repositories for ktransformers:
Users who are interested in ktransformers are comparing it to the repositories listed below:
- FlashMLA: Efficient MLA decoding kernels☆11,448Updated last month
- DeepEP: an efficient expert-parallel communication library☆7,446Updated last week
- RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.☆49,576Updated this week
- Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.☆16,726Updated last month
- 🍒 Cherry Studio is a desktop client that supports multiple LLM providers.☆24,366Updated this week
- Integrate the DeepSeek API into popular software☆31,802Updated last week
- Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you ne…☆7,584Updated this week
- Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.☆5,527Updated this week
- A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models.☆5,058Updated 2 months ago
- 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation☆15,697Updated this week
- A simple screen parsing tool for building a pure vision-based GUI agent☆21,635Updated 3 weeks ago
- SGLang is a fast serving framework for large language models and vision language models.☆13,368Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models☆17,140Updated 2 months ago
- Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥☆37,364Updated this week
- No fortress, purely open ground. OpenManus is Coming.☆43,678Updated this week
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆5,241Updated this week
- Toolkit for linearizing PDFs for LLM datasets/training☆11,187Updated this week
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.☆6,564Updated last week
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone☆19,251Updated last month
- The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.☆17,910Updated 3 weeks ago
- ☆4,193Updated last month
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.☆36,404Updated this week
- DeepSeek Coder: Let the Code Write Itself☆21,343Updated 11 months ago
- Sharing some handy Dify DSL workflows, suitable for both personal use and learning.☆6,274Updated this week
- Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your p…☆41,142Updated this week
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation☆7,680Updated last week
- GLM-4 series: Open Multilingual Multimodal Chat LMs☆6,318Updated this week
- SOTA Open Source TTS☆20,753Updated last week
- A powerful tool for creating fine-tuning datasets for LLMs☆5,654Updated this week
- The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.☆3,392Updated this week