LeetGPU Solutions
☆114Oct 9, 2025Updated 6 months ago
Alternatives and similar repositories for LeetGPU
Users that are interested in LeetGPU are comparing it to the libraries listed below.
- High-Performance Cross-Platform GPU-Based Physics Simulator, Based on LuisaCompute☆26Mar 31, 2026Updated 2 weeks ago
- A list of articles outside of the official MLIR docs that I've found useful for learning MLIR☆11Aug 16, 2023Updated 2 years ago
- ☆17Mar 26, 2025Updated last year
- This repository presents the source code for the paper "MILLION: Mastering Long-Context LLM Inference Via Outlier-Immunized KV Product Qu…☆23Apr 2, 2025Updated last year
- ☆40Dec 14, 2025Updated 4 months ago
- SGLang kernel library for NPU☆125Updated this week
- Triton adapter for Ascend. Mirror of https://gitcode.com/ascend/triton-ascend☆119Updated this week
- 🎉My Collections of CUDA Kernels~☆11Jun 25, 2024Updated last year
- Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.☆18Dec 1, 2023Updated 2 years ago
- A docker image for One Student One Chip's debug exam☆10Sep 22, 2023Updated 2 years ago
- ☆11Jun 11, 2023Updated 2 years ago
- ☆10Jun 10, 2023Updated 2 years ago
- Compiler plugin for performance analysis of HIP applications☆13Apr 7, 2025Updated last year
- ☆29Dec 20, 2025Updated 3 months ago
- ⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA and CuTe API, Achieve Peak⚡️ Performance.☆150May 10, 2025Updated 11 months ago
- A PyTorch native platform for training generative AI models☆16Nov 18, 2025Updated 5 months ago
- Porting the Linux kernel to NEMU!☆22Jun 1, 2025Updated 10 months ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆31Dec 21, 2024Updated last year
- A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.☆93Updated this week
- The GaussianSplatting Implementation based on LuisaCompute☆18Apr 11, 2026Updated last week
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels☆200Updated this week
- Cataloging released Triton kernels.☆302Sep 9, 2025Updated 7 months ago
- Storb is a distributed storage subnet on the Bittensor network☆13Jul 28, 2025Updated 8 months ago
- ☆18Feb 10, 2024Updated 2 years ago
- Shared Middle-Layer for Triton Compilation☆330Dec 5, 2025Updated 4 months ago
- 🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.☆260Feb 13, 2026Updated 2 months ago
- ☆88Mar 21, 2026Updated 3 weeks ago
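- GPGPU-Sim code with Chinese annotations: contains the latest GPGPU-Sim simulator source, annotated in Chinese to help Chinese-speaking users better understand and use the simulator.☆26Dec 18, 2024Updated last year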
- Efficient kernel for RMS normalization with fused operations, includes both forward and backward passes, compatibility with PyTorch.☆13Jun 5, 2024Updated last year
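- A good project for campus recruiting (autumn and spring hiring rounds) and internships: build, hands-on and from scratch, a large-model inference framework that supports LLama2/3 and Qwen2.5.☆527Oct 28, 2025Updated 5 months ago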
- ☆20Jun 13, 2025Updated 10 months ago
- RISCV C and Triton AI-Benchmark☆24Jan 28, 2026Updated 2 months ago
- FlyDSL is the Python front‑end of the project: Flexible LaYout DSL.☆155Updated this week
- CVPR'24: MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints☆16Jul 4, 2024Updated last year
- Improvement for Modular Camera based Tactile Sensor, with integrated circuit, optimized illumination, and biomimetic markers.☆16Feb 14, 2024Updated 2 years ago
- ☆51May 19, 2025Updated 11 months ago
- The Next-gen Language & Compiler Powering Efficient Hardware Design☆36Jan 16, 2025Updated last year
- A Triton-only attention backend for vLLM☆25Mar 17, 2026Updated last month
- DeeperGEMM: crazy optimized version☆86May 5, 2025Updated 11 months ago