inclusionAI/linghe

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/inclusionAI/linghe)

inclusionAI / linghe

A high-performance kernel library for LLM training

☆86

Alternatives and similar repositories for linghe

Users that are interested in linghe are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

inclusionAI / flood
View on GitHub
☆15Feb 26, 2026Updated 5 months ago
inclusionAI / Ring
View on GitHub
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling.
☆109Aug 5, 2025Updated 11 months ago
inclusionAI / cuLA
View on GitHub
CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.
☆535Jul 23, 2026Updated last week
inclusionAI / Ling
View on GitHub
Ling is a MoE LLM provided and open-sourced by InclusionAI.
☆262May 14, 2025Updated last year
inclusionAI / Ling-V2
View on GitHub
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI.
☆273Oct 4, 2025Updated 9 months ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
vllm-project / tml-fa4
View on GitHub
FA4-based Relative Attention Kernel developed by TML and Colfax
☆17Jul 17, 2026Updated last week
inclusionAI / Ring-V2
View on GitHub
Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.
☆98Oct 23, 2025Updated 9 months ago
inclusionAI / humming
View on GitHub
☆171Updated this week
Harry-Chen / fp4_sm120
View on GitHub
Make FP4 on 5090 Great Again
☆17Jul 20, 2026Updated last week
Dao-AILab / sonic-moe
View on GitHub
Accelerating MoE with IO and Tile-aware Optimizations
☆733Jul 4, 2026Updated 3 weeks ago
yanring / Megatron-MoE-ModelZoo
View on GitHub
Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.
☆201May 29, 2026Updated 2 months ago
zhehangdu / Newton-Muon
View on GitHub
The Newton-Muon optimizer
☆30Jun 5, 2026Updated last month
inclusionAI / Ling-V2.5
View on GitHub
☆28Feb 28, 2026Updated 5 months ago
NVIDIA / nccl-extensions
View on GitHub
Communication patterns for AI, built on top of NCCL device and host APIs
☆21Updated this week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
samsja / muon_fsdp_2
View on GitHub
Muon fsdp 2
☆64Aug 8, 2025Updated 11 months ago
leepoly / sm-profiler
View on GitHub
☆84Feb 5, 2026Updated 5 months ago
inclusionAI / asystem-amem
View on GitHub
A NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library.
☆113Dec 17, 2025Updated 7 months ago
ysy-phoenix / evalhub
View on GitHub
All-in-one benchmarking platform for evaluating LLM.
☆15Nov 12, 2025Updated 8 months ago
slime-n / slime-n
View on GitHub
A Multi-Policy, Multi-Agent RL Training Framework
☆32Jun 16, 2026Updated last month
Dots-Infra / UltraEP
View on GitHub
Production-ready expert load balancing library
☆80Jul 17, 2026Updated last week
graphcore-research / out-of-the-box-fp8-training
View on GitHub
Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.
☆46Jul 17, 2024Updated 2 years ago
cyhdmjzzy / DeepEP-Code-Analysis
View on GitHub
☆26Feb 27, 2026Updated 5 months ago
leloykun / adaptive-muon
View on GitHub
A single-line modification to any (dualizer-based) optimizer that allows the optimizer to adapt to the scale of the gradients as they cha…
☆19Jan 11, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
KuangjuX / NVSHMEM-Tutorial
View on GitHub
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
☆195Feb 11, 2026Updated 5 months ago
richardodliu / OpenCodeEval
View on GitHub
☆52Mar 9, 2026Updated 4 months ago
alipay / PainlessInferenceAcceleration
View on GitHub
Accelerate inference without tears
☆371Jan 23, 2026Updated 6 months ago
MoonshotAI / FlashKDA
View on GitHub
FlashKDA: high-performance Kimi Delta Attention kernels
☆982Updated this week
sgl-project / sgl-flash-attn
View on GitHub
Fast and memory-efficient exact attention
☆22Jun 26, 2026Updated last month
sablin39 / tilelang-cuda-skills
View on GitHub
Skills for writing tilelang and debugging with CUDA toolkits.
☆133May 20, 2026Updated 2 months ago
NVIDIA-NeMo / Emerging-Optimizers
View on GitHub
☆240Updated this week
HydraQYH / expert_specialization_moe
View on GitHub
Expert Specialization MoE Solution based on CUTLASS
☆27Apr 14, 2026Updated 3 months ago
inclusionAI / Ring-V2.5
View on GitHub
☆45Feb 28, 2026Updated 5 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Triang-jyed-driung / i8muon
View on GitHub
Muon in Int8 Precision Made Possible
☆20Jun 18, 2026Updated last month
NVIDIA / CompileIQ
View on GitHub
An Optimizer for Nvidia Compilers.
☆112Jul 3, 2026Updated 3 weeks ago
perplexityai / pplx-kernels
View on GitHub
Perplexity GPU Kernels
☆595Nov 7, 2025Updated 8 months ago
tgale96 / grouped_gemm
View on GitHub
PyTorch bindings for CUTLASS grouped GEMM.
☆154May 29, 2025Updated last year
Bruce-Lee-LY / cuda_auto_tune
View on GitHub
NCU-driven iterative optimization workflow for CUDA/CUTLASS/Triton/CuTe DSL kernels.
☆24Apr 10, 2026Updated 3 months ago
Dao-AILab / fast-hadamard-transform
View on GitHub
Fast Hadamard transform in CUDA, with a PyTorch interface
☆343Mar 10, 2026Updated 4 months ago
XinbangZhang / DATA-NAS
View on GitHub
Codes for DATA: Differentiable ArchiTecture Approximation.
☆11Jul 22, 2021Updated 5 years ago