SGLang kernel library for NPU
☆115 · Apr 9, 2026 · Updated last week
Alternatives and similar repositories for sgl-kernel-npu
Users who are interested in sgl-kernel-npu are comparing it to the libraries listed below.
- Triton adapter for Ascend. Mirror of https://gitcode.com/ascend/triton-ascend ☆119 · Updated this week
- ☆20 · Jun 13, 2025 · Updated 10 months ago
- MultiArchKernelBench: A Multi-Platform Benchmark for Kernel Generation ☆52 · Mar 25, 2026 · Updated 3 weeks ago
- ☆122 · Sep 22, 2025 · Updated 6 months ago
- LeetGPU Solutions ☆113 · Oct 9, 2025 · Updated 6 months ago
- ☆74 · Updated this week
- ☆17 · Mar 26, 2025 · Updated last year
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆27 · Dec 17, 2024 · Updated last year
- Ascend TileLang adapter ☆241 · Updated this week
- FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs. ☆59 · Feb 6, 2026 · Updated 2 months ago
- Community maintained hardware plugin for vLLM on Ascend ☆1,900 · Updated this week
- mllm-npu: training multimodal large language models on Ascend NPUs ☆94 · Aug 29, 2024 · Updated last year
- ☆14 · Nov 3, 2025 · Updated 5 months ago
- A PyTorch native platform for training generative AI models ☆16 · Nov 18, 2025 · Updated 4 months ago
- Big Data and Machine Intelligence, Spring 2021. ☆12 · Jul 2, 2021 · Updated 4 years ago
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co… ☆13 · Jan 16, 2026 · Updated 3 months ago
- LMCache on Ascend ☆61 · Apr 9, 2026 · Updated last week
- Fully open reproduction of DeepSeek-R1 ☆11 · Mar 24, 2025 · Updated last year
- DLBlas: clean and efficient kernels ☆36 · Apr 7, 2026 · Updated last week
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception" ☆30 · Mar 25, 2026 · Updated 3 weeks ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi… ☆23 · Oct 1, 2025 · Updated 6 months ago
- ☆22 · Dec 18, 2024 · Updated last year
- A Triton-only attention backend for vLLM ☆25 · Mar 17, 2026 · Updated 3 weeks ago
- Efficient and easy multi-instance LLM serving ☆543 · Mar 12, 2026 · Updated last month
- Cataloging released Triton kernels. ☆301 · Sep 9, 2025 · Updated 7 months ago
- ☆12 · Oct 19, 2014 · Updated 11 years ago
- [Ebook] From Zero to a Million Stores: A Hands-On System Design Journey by an Ordinary Person Without a Computer Science Degree ☆26 · Nov 11, 2025 · Updated 5 months ago
- A Triton JIT runtime and ffi provider in C++ ☆32 · Updated this week
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and … ☆37 · Aug 29, 2025 · Updated 7 months ago
- ☆15 · Nov 19, 2018 · Updated 7 years ago
- ☆11 · Nov 13, 2020 · Updated 5 years ago
- A benchmark framework for LLM serving performance, based on API call ☆14 · Apr 15, 2024 · Updated 2 years ago
- Persistent dense gemm for Hopper in `CuTeDSL` ☆15 · Aug 9, 2025 · Updated 8 months ago
- Simodense: a RISC-V softcore for custom SIMD instructions ☆17 · Feb 16, 2026 · Updated 2 months ago
- Enterprise event extraction ☆13 · May 20, 2021 · Updated 4 years ago
- WaferLLM: Large Language Model Inference at Wafer Scale ☆97 · Apr 4, 2026 · Updated last week
- MultiPaxos and Disk Paxos in TLA+ and PlusCal ☆13 · Jan 23, 2023 · Updated 3 years ago
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025) ☆15 · Jan 6, 2026 · Updated 3 months ago
- sgl-mindspore ☆17 · Mar 23, 2026 · Updated 3 weeks ago