Samples of good AI generated CUDA kernels
☆103 · May 30, 2025 · Updated 10 months ago
Alternatives and similar repositories for good-kernels
Users that are interested in good-kernels are comparing it to the libraries listed below.
- LLM as World Models using Bayesian inference ☆17 · May 27, 2025 · Updated 10 months ago
- ☆21 · May 13, 2022 · Updated 3 years ago
- TORCH_TRACE parser for PT2 ☆84 · Updated this week
- TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators ☆121 · Jun 14, 2025 · Updated 9 months ago
- So, I trained a Llama a 130M architecture I coded from ground up to build a small instruct model from scratch. Trained on FineWeb dataset… ☆17 · Mar 26, 2025 · Updated last year
- A collection of GPU experiments and benchmarks for my personal understanding and research. ☆28 · Mar 18, 2026 · Updated 3 weeks ago
- Official Repo of CudaForge ☆76 · Dec 2, 2025 · Updated 4 months ago
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs) ☆916 · Mar 24, 2026 · Updated 2 weeks ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism" ☆15 · Apr 30, 2025 · Updated 11 months ago
- Code for our paper "Decomposing The Dark Matter of Sparse Autoencoders" ☆23 · Feb 6, 2025 · Updated last year
- ☆29 · Updated this week
- Development containers for triton and triton-cpu ☆27 · Updated this week
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers) ☆31 · Mar 20, 2026 · Updated 3 weeks ago
- TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels ☆198 · Updated this week
- SDXL GPU cluster scripts ☆16 · Oct 28, 2023 · Updated 2 years ago
- ☆69 · Jun 16, 2021 · Updated 4 years ago
- ☆50 · Jan 28, 2025 · Updated last year
- Multichannel Looper/Feedback System for Riffusion ☆14 · May 6, 2023 · Updated 2 years ago
- This repo contains the benchmarks for Enzyme on GPU's ☆11 · Feb 22, 2026 · Updated last month
- An insanely secure password manager. ☆17 · Mar 10, 2026 · Updated last month
- Kernel Fusion and Runtime Compilation Based on NNVM ☆73 · Nov 21, 2016 · Updated 9 years ago
- Training AI for Super Smash Bros. Melee ☆33 · Mar 27, 2025 · Updated last year
- HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration ☆15 · Sep 14, 2020 · Updated 5 years ago
- Tile primitives for speedy kernels ☆3,304 · Mar 28, 2026 · Updated 2 weeks ago
- High-Performance FP32 GEMM on CUDA devices ☆118 · Jan 21, 2025 · Updated last year
- Allows two LLMs to communicate and run code in the terminal ☆28 · Dec 8, 2024 · Updated last year
- Python utility to convert PyTorch model weights from '.bin' to '.safetensors' format. ☆18 · Sep 19, 2025 · Updated 6 months ago
- vLLM Daily Summarization of Merged PRs ☆49 · Updated this week
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆54 · Mar 21, 2025 · Updated last year
- Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models ☆26 · Sep 14, 2025 · Updated 6 months ago
- ☆106 · Mar 6, 2026 · Updated last month
- An Xposed/LSPosed module for disabling the annoying biometrics timeout ☆20 · Aug 24, 2025 · Updated 7 months ago
- ☆55 · Aug 9, 2024 · Updated last year
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆17 · Mar 13, 2023 · Updated 3 years ago
- ☆15 · Nov 22, 2025 · Updated 4 months ago
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion" ☆14 · Mar 28, 2024 · Updated 2 years ago
- ☆65 · Jul 14, 2025 · Updated 8 months ago
- Modification of SOMPY repo with robust K-means clustering (bootstrapped SSE elbow method) ☆13 · Apr 6, 2019 · Updated 7 years ago
- ☆12 · Oct 19, 2014 · Updated 11 years ago