A domain-specific language (DSL) based on Triton but providing higher-level abstractions.
☆41Feb 4, 2026Updated 3 weeks ago
Alternatives and similar repositories for ninetoothed
Users who are interested in ninetoothed are comparing it to the libraries listed below
- ☆39Updated this week
- ☆18Updated this week
- ☆126Jan 22, 2026Updated last month
- ☆50Updated this week
- Operator library☆17Jul 9, 2025Updated 7 months ago
- ☆18Mar 4, 2025Updated 11 months ago
- A CUDA runtime environment based on the CUDA Driver API☆15Jul 30, 2025Updated 7 months ago
- 🚀 LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases.☆13Jul 12, 2025Updated 7 months ago
- handle gguf files☆12Aug 14, 2025Updated 6 months ago
- Automated bottleneck detection and solution orchestration☆19Updated this week
- ☆13Jan 7, 2025Updated last year
- A layered, decoupled deep learning inference engine☆79Feb 17, 2025Updated last year
- ☆289Feb 4, 2026Updated 3 weeks ago
- Framework to reduce autotune overhead to zero for well known deployments.☆96Sep 19, 2025Updated 5 months ago
- Operator library (Rust)☆14Jul 24, 2025Updated 7 months ago
- Training camp project for the training track☆26Jan 28, 2026Updated last month
- A Triton-only attention backend for vLLM☆24Feb 11, 2026Updated 2 weeks ago
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels☆46Jan 30, 2026Updated last month
- ☆30Jan 9, 2026Updated last month
- A Triton JIT runtime and ffi provider in C++☆31Updated this week
- A lightweight triton-based General Matrix Multiplication (GEMM) library.☆47Updated this week
- Hypervisor written in Rust for the RISC-V 1.0 hypervisor extension☆16Oct 21, 2024Updated last year
- The Fundot programming language.☆14Dec 31, 2021Updated 4 years ago
- ☆32Jul 2, 2025Updated 7 months ago
- Ship correct and fast LLM kernels to PyTorch☆142Jan 14, 2026Updated last month
- A high-performance batching router that optimises max throughput for text inference workloads☆16Sep 6, 2023Updated 2 years ago
- ☆44Updated this week
- ☆20Sep 28, 2024Updated last year
- Assignment and project system for the CUDA track of the InfiniTensor Large Model and AI Systems Training Camp☆32Feb 6, 2026Updated 3 weeks ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆106Jun 28, 2025Updated 8 months ago
- Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.☆84Updated this week
- A curated list of awesome papers about utilizing large language models for ranking.☆31Oct 30, 2024Updated last year
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation☆32Nov 16, 2024Updated last year
- OSDI 2023 Welder, deep learning compiler☆32Nov 24, 2023Updated 2 years ago
- InfiniStore: an elastic serverless cloud storage system (VLDB'23)☆24May 5, 2023Updated 2 years ago
- ☆71Mar 26, 2025Updated 11 months ago
- ☆32Jul 17, 2024Updated last year
- Ahead of Time (AOT) Triton Math Library☆92Updated this week
- ☆53Updated this week