Review automated kernel generation in the era of LLMs
☆165Apr 7, 2026Updated last week
Alternatives and similar repositories for awesome-LLM-driven-kernel-generation
Users that are interested in awesome-LLM-driven-kernel-generation are comparing it to the libraries listed below.
- FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang…☆238Updated this week
- Building the Virtuous Cycle for AI-driven LLM Systems☆217Updated this week
- Multi-target compiler for Sum-Product Networks, based on MLIR and LLVM.☆25Nov 29, 2024Updated last year
- Repository for AI model benchmarking on TT-Buda☆16Feb 9, 2026Updated 2 months ago
- Buda Compiler Backend for Tenstorrent devices☆31Apr 2, 2025Updated last year
- Code repository for the research paper "A Systematic Look at Ciphertext Side Channels on AMD SEV-SNP"☆13May 17, 2022Updated 3 years ago
- SICP Online Judge, consisting of a server, a React web interface, and a modified Ok client.☆12Dec 5, 2022Updated 3 years ago
- ☆19Jun 3, 2023Updated 2 years ago
- FlagGems is an operator library for large language models implemented in the Triton Language.☆953Updated this week
- This is a Git synchronization of https://wiki.newae.com☆10Feb 21, 2018Updated 8 years ago
- Flash Attention in raw CUDA C beating PyTorch☆38May 14, 2024Updated last year
- Digital electronics slides (PPT) from Bilibili☆11Feb 19, 2024Updated 2 years ago
- An attempt to migrate Karpathy's llm.c to safe Rust.☆13Jun 4, 2024Updated last year
- Implementation of the Reusable Enclaves paper☆14Sep 25, 2023Updated 2 years ago
- Nanjing University ICS2019 PA lab; lab manual: https://nju-projectn.github.io/ics-pa-gitbook/ics2019/☆10Aug 22, 2020Updated 5 years ago
- MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models☆28Apr 2, 2026Updated 2 weeks ago
- The MiniAgents visualization tool for simulacra.☆18Apr 18, 2024Updated last year
- Nanjing University, Department of Computer Science and Technology, 2019 Fundamentals of Computer Systems PA☆14Sep 18, 2020Updated 5 years ago
- Autonomous GPU Kernel Generation & Optimization via Deep Agents☆362Apr 9, 2026Updated last week
- [IJCAI 2024] CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning☆25Feb 1, 2024Updated 2 years ago
- Code for "AtTGen: Attribute Tree Generation for Real-World Attribute Joint Extraction", ACL 2023☆13May 19, 2023Updated 2 years ago
- FlagScale is a large model toolkit based on open-sourced projects.☆500Updated this week
- Official completion of “Training on the Benchmark Is Not All You Need”.☆40Dec 31, 2024Updated last year
- An NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library.☆105Dec 17, 2025Updated 3 months ago
- qwen-nsa☆87Oct 14, 2025Updated 6 months ago
- Software implementation of Gaussian sampling algorithms in C++ using NTL library.☆14Nov 6, 2015Updated 10 years ago
- ☆13May 12, 2025Updated 11 months ago
- KAIST Educational Virtualization☆16Mar 6, 2026Updated last month
- Cavs: An Efficient Runtime System for Dynamic Neural Networks☆15Sep 18, 2020Updated 5 years ago
- Using Malicious #VC Interrupts to Break AMD SEV-SNP (IEEE S&P 2024)☆26Apr 22, 2024Updated last year
- ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents, NeurIPS 2025☆35Nov 15, 2025Updated 5 months ago
- Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis.☆13Mar 20, 2025Updated last year
- ☆13Jul 7, 2017Updated 8 years ago
- diffusers with search engine☆12Jan 13, 2026Updated 3 months ago
- CUDA keyring packaging for Debian☆14Apr 14, 2023Updated 3 years ago
- ☆52Apr 30, 2025Updated 11 months ago
- Tile-Based Runtime for Ultra-Low-Latency LLM Inference☆704Mar 8, 2026Updated last month
- ☆14Mar 8, 2025Updated last year
- ☆11Jan 7, 2019Updated 7 years ago