Free resource for the book AI Compiler Development Guide
☆50Dec 22, 2022Updated 3 years ago
Alternatives and similar repositories for AI_compiler_development_guide
Users that are interested in AI_compiler_development_guide are comparing it to the libraries listed below.
- Start AI Compiler☆51Feb 26, 2026Updated 3 months ago
- A Rust implementation of AUTOSAR's Scalable service-Oriented MiddlewarE over IP (SOME/IP).☆13Dec 17, 2025Updated 5 months ago
- An MLIR-based toy DL compiler for TVM Relay.☆62Oct 16, 2022Updated 3 years ago
- A simple general-purpose programming language☆99Feb 2, 2026Updated 4 months ago
- Don't scroll on your phone after work in the evening; learn something instead. Series 1: the CUDA compute framework CUFX (Cuda Framework eXtended).☆17Dec 15, 2024Updated last year
- DELTA-pytorch: DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation☆12Apr 16, 2024Updated 2 years ago
- ☆11Nov 25, 2020Updated 5 years ago
- ☆12Apr 30, 2024Updated 2 years ago
- IMPORTANT NOTICE: This implementation is long outdated. Whole-Function Vectorization is an algorithm that transforms a scalar function in…☆23May 16, 2012Updated 14 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo☆17Mar 13, 2023Updated 3 years ago
- My study note for mlsys☆14Nov 4, 2024Updated last year
- Tencent Distribution of TVM☆16Apr 7, 2023Updated 3 years ago
- tutorial for writing custom pytorch cpp+cuda kernel, applied on volume rendering (NeRF)☆29Dec 12, 2023Updated 2 years ago
- Hands-On Practical MLIR Tutorial☆791Oct 20, 2023Updated 2 years ago
- play gemm with tvm☆91Jul 22, 2023Updated 2 years ago
- ☆12Oct 29, 2020Updated 5 years ago
- ☆59May 28, 2026Updated 2 weeks ago
- a simple x86/arm jit framework for c☆37Mar 2, 2026Updated 3 months ago
- An FP8 flash attention implementation for the Ada architecture, built with the cutlass repository☆81Aug 12, 2024Updated last year
- Open-source code for the paper "击败SOTA反混淆方法" (Defeating SOTA Deobfuscation Methods)☆16Sep 10, 2022Updated 3 years ago
- Code for ICML 2020 paper: Do RNN and LSTM have Long Memory?☆17Jan 6, 2021Updated 5 years ago
- Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings☆21Sep 1, 2025Updated 9 months ago
- A Fuzzy Decision Tree implementation for Python.☆22Feb 21, 2025Updated last year
- Embedded Universal DSL: a good DSL for us, by us☆75Updated this week
- A tiny Debugger : - )☆10Jan 24, 2021Updated 5 years ago
- ☆19Apr 28, 2021Updated 5 years ago
- ☆19Sep 17, 2021Updated 4 years ago
- GEMM☆10Aug 26, 2023Updated 2 years ago
- row-major matmul optimization☆730May 14, 2026Updated last month
- ☆36Dec 12, 2021Updated 4 years ago
- A JADX plugin for interactive code analysis using Large Language Models (LLMs). Provides dynamic code analysis, security assessment, malw…☆27Dec 14, 2024Updated last year
- The tutorial of HuggingFace transformers.☆15Aug 14, 2022Updated 3 years ago
- A template for using Rust inside Unity as a native plug-in.☆17Aug 17, 2022Updated 3 years ago
- Quantized Attention on GPU☆44Nov 22, 2024Updated last year
- ☆11May 16, 2026Updated 3 weeks ago
- A Winograd Minimal Filter Implementation in CUDA☆30Aug 25, 2021Updated 4 years ago
- Use https://ctags.io instead (This was fork of http://ctags.sourceforge.net/)☆25Jul 25, 2015Updated 10 years ago
- ☆14Jan 10, 2024Updated 2 years ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.☆47Jun 11, 2025Updated last year