deepseek-ai / DeepSeek-LLM

DeepSeek LLM: Let there be answers

☆6,457

Alternatives and similar repositories for DeepSeek-LLM

Users who are interested in DeepSeek-LLM are comparing it to the repositories listed below.

deepseek-ai / DeepSeek-Coder-V2
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
☆5,913 · Updated 9 months ago
deepseek-ai / DeepSeek-Math
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
☆2,807 · Updated last year
deepseek-ai / DeepSeek-Coder
DeepSeek Coder: Let the Code Write Itself
☆21,866 · Updated last year
deepseek-ai / DeepSeek-VL2
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
☆4,947 · Updated 4 months ago
deepseek-ai / DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
☆3,913 · Updated last year
deepseek-ai / DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
☆4,920 · Updated 9 months ago
deepseek-ai / Janus
Janus-Series: Unified Multimodal Understanding and Generation Models
☆17,447 · Updated 5 months ago
deepseek-ai / awesome-deepseek-coder
A curated list of open-source projects related to DeepSeek Coder
☆710 · Updated last year
MoonshotAI / Kimi-k1.5
☆3,392 · Updated 4 months ago
QwenLM / Qwen3
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
☆22,507 · Updated 2 weeks ago
deepseek-ai / DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
☆1,747 · Updated last year
deepseek-ai / DeepSeek-V3
☆98,143 · Updated 2 weeks ago
deepseek-ai / awesome-deepseek-integration
Integrate the DeepSeek API into popular softwares
☆33,114 · Updated 2 months ago
deepseek-ai / ESFT
Expert Specialized Fine-Tuning
☆652 · Updated last month
deepseek-ai / DeepSeek-R1
☆90,485 · Updated 2 weeks ago
QwenLM / Qwen2.5-Coder
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.
☆5,085 · Updated 3 weeks ago
deepseek-ai / DeepEP
DeepEP: an efficient expert-parallel communication library
☆8,265 · Updated this week
huggingface / open-r1
Fully open reproduction of DeepSeek-R1
☆25,011 · Updated this week
QwenLM / Qwen2.5-VL
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆11,442 · Updated last month
QwenLM / Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
☆18,667 · Updated 3 weeks ago
deepseek-ai / FlashMLA
FlashMLA: Efficient MLA decoding kernels
☆11,642 · Updated 2 months ago
deepseek-ai / DeepGEMM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
☆5,517 · Updated last week
MiniMax-AI / MiniMax-01
The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention
☆3,016 · Updated last week
deepseek-ai / open-infra-index
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
☆7,860 · Updated last month
QwenLM / Qwen-Agent
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
☆9,985 · Updated 3 weeks ago
Tencent-Hunyuan / Tencent-Hunyuan-Large
☆1,559 · Updated 7 months ago
meta-llama / llama-stack
Composable building blocks to build Llama Apps
☆7,907 · Updated this week
simplescaling / s1
s1: Simple test-time scaling
☆6,487 · Updated 2 weeks ago
kvcache-ai / ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
☆14,574 · Updated this week
Jiayi-Pan / TinyZero
Minimal reproduction of DeepSeek R1-Zero
☆11,997 · Updated 2 months ago