LeetGPU Solutions
☆111Oct 9, 2025Updated 4 months ago
Alternatives and similar repositories for LeetGPU
Users that are interested in LeetGPU are comparing it to the libraries listed below
- SGLang kernel library for NPU☆101Updated this week
- A list of articles outside of the official MLIR docs that I've found useful for learning MLIR☆11Aug 16, 2023Updated 2 years ago
- Dynamic Telegram Trading Bot☆16Feb 21, 2025Updated last year
- Porting the Linux Kernel to NEMU!☆22Jun 1, 2025Updated 9 months ago
- Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.☆18Dec 1, 2023Updated 2 years ago
- ☆39Dec 14, 2025Updated 2 months ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆32Dec 21, 2024Updated last year
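- GPGPU-Sim code annotated in Chinese: the latest version of the GPGPU-Sim simulator code, with Chinese comments to help Chinese-speaking users better understand and use the simulator.☆28Dec 18, 2024Updated last year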
- OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (ICCAD 2024)☆29Oct 20, 2024Updated last year
- Canvas: End-to-End Kernel Architecture Search in Neural Networks☆27Nov 18, 2024Updated last year
- cuJSON: A Highly Parallel JSON Parser for GPUs☆40Dec 12, 2025Updated 2 months ago
- Triton adapter for Ascend. Mirror of https://gitcode.com/ascend/triton-ascend☆113Updated this week
- ☆111Updated this week
- ☆42Nov 1, 2025Updated 4 months ago
- ☆29Dec 20, 2025Updated 2 months ago
- The Next-gen Language & Compiler Powering Efficient Hardware Design☆36Jan 16, 2025Updated last year
- DeeperGEMM: crazy optimized version☆74May 5, 2025Updated 10 months ago
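- A good project for campus recruiting (autumn and spring rounds) and internships: build from scratch an LLM inference framework that supports LLama2/3 and Qwen2.5.☆507Oct 28, 2025Updated 4 months ago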
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels☆196Updated this week
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS☆34Feb 10, 2025Updated last year
- Fast and memory-efficient exact kmeans☆140Feb 18, 2026Updated 2 weeks ago
- SUSTech CS202 (Computer Organization) Project, with CPU hardware implemented in Chisel (Scala) and software cross-compiled from Rust.☆34Jun 16, 2023Updated 2 years ago
- ☆10Oct 24, 2021Updated 4 years ago
- 🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.☆251Feb 13, 2026Updated 3 weeks ago
- Luthier, a GPU binary instrumentation tool for AMD GPUs☆27Feb 21, 2026Updated last week
- ☆43Mar 31, 2025Updated 11 months ago
- Collection of kernels written in Triton language☆178Jan 27, 2026Updated last month
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer☆165Feb 11, 2026Updated 3 weeks ago
- Framework to reduce autotune overhead to zero for well known deployments.☆97Sep 19, 2025Updated 5 months ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆106Jun 28, 2025Updated 8 months ago
- ☄☄ Comet Password Book, a WeChat mini program based on Taro☆11Aug 18, 2021Updated 4 years ago
- ☆12May 20, 2019Updated 6 years ago
- WaferLLM: Large Language Model Inference at Wafer Scale☆90Jan 7, 2026Updated last month
- Hodge podge random stuff☆10Jan 20, 2017Updated 9 years ago
- Community maintained hardware plugin for vLLM on AWS Neuron☆23Feb 26, 2026Updated last week
- amdgpu example code in hip/asm☆55Updated this week
- ☆10Jul 23, 2021Updated 4 years ago
- 30 Days of Airflow☆10Aug 13, 2019Updated 6 years ago
- Simulate short-reads datasets using probabilistic models☆11Jun 1, 2013Updated 12 years ago