kvcache-ai / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆10,824 Updated this week
Alternatives and similar repositories for ktransformers:
Users who are interested in ktransformers compare it to the repositories listed below.
- The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. ☆16,936 Updated 2 weeks ago
- Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥 ☆31,152 Updated this week
- Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud. ☆15,511 Updated this week
- The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets. ☆2,706 Updated this week
- SGLang is a fast serving framework for large language models and vision language models. ☆10,325 Updated this week
- Janus-Series: Unified Multimodal Understanding and Generation Models ☆16,105 Updated 2 weeks ago
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone ☆18,531 Updated this week
- 🚀🚀 🌏 Train a small 26M-parameter GPT entirely from scratch in just 2h! ☆11,363 Updated this week
- Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension. ☆5,852 Updated 3 weeks ago
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model ☆4,769 Updated 4 months ago
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆7,745 Updated this week
- A simple screen parsing tool towards pure vision based GUI agent ☆13,139 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆38,776 Updated this week
- DeepSeek Coder: Let the Code Write Itself ☆19,932 Updated 9 months ago
- Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 15… ☆5,674 Updated this week
- Manage GPU clusters for running AI models ☆1,700 Updated this week
- ☆17,858 Updated this week
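- "Open-Source LLM Usage Guide": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment ☆12,758 Updated last week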
- ☆2,197 Updated this week
- GLM-4-Voice | End-to-end Chinese-English spoken dialogue model ☆2,669 Updated 2 months ago
- ☆7,223 Updated last month
- 🍒 Cherry Studio is a desktop client that supports multiple LLM providers, including deepseek-r1. ☆14,618 Updated this week
- Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud. ☆4,510 Updated last week
- Let your Claude be able to think ☆14,347 Updated 3 weeks ago