flashinfer-ai/mlsys26-agent-baseline

☆33

Alternatives and similar repositories for mlsys26-agent-baseline

Users who are interested in mlsys26-agent-baseline are comparing it to the repositories listed below.

flashinfer-ai / flashinfer-bench-starter-kit
FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels
☆178 · Apr 26, 2026 · Updated 3 months ago
romitjain / kachua-mlsys
[MLSys 26] 🥇 Solution for Gated Delta Net Track of MLSys 26 Flash infer competition
☆35 · May 22, 2026 · Updated 2 months ago
RLsys-Foundation / TritonForge
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation…
☆146 · Nov 10, 2025 · Updated 8 months ago
SiriusNEO / StarAgent
Lightweight agent multiplexer, all in one Web dashboard
☆54 · Updated this week
meta-pytorch / KernelAgent
Autonomous GPU Kernel Generation & Optimization via Deep Agents
☆490 · Jul 15, 2026 · Updated last week
flashinfer-ai / flashinfer-bench
Building the Virtuous Cycle for AI-driven LLM Systems
☆261 · May 1, 2026 · Updated 2 months ago
caoshiyi / K-Search
Automated High-Performance GPU Kernel Generation
☆120 · Jun 1, 2026 · Updated last month
AIS-SNU / GraNNDis_Artifact
[PACT'24] GraNNDis. A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and min…
☆10 · Aug 13, 2024 · Updated last year
vllm-project / tml-fa4
FA4-based Relative Attention Kernel developed by TML and Colfax
☆17 · Jul 17, 2026 · Updated last week
hongsunjang / pipe-bd
[DATE 2023] Pipe-BD: Pipelined Parallel Blockwise Distillation
☆12 · Jul 13, 2023 · Updated 3 years ago
HabanaAI / Megatron-DeepSpeed
Intel Gaudi's Megatron DeepSpeed Large Language Models for training
☆18 · Dec 19, 2024 · Updated last year
Dogacel / auto-gpu-kernel
Winner 🏆 (Agent-only) MLSys 2026 - FlashInfer AI Kernel Generation Contest for the DeepSeek Sparse Attention (DSA) track with an average…
☆148 · Jun 10, 2026 · Updated last month
tile-ai / AttentionEngine
☆52 · May 19, 2025 · Updated last year
HydraQYH / expert_specialization_moe
Expert Specialization MoE Solution based on CUTLASS
☆27 · Apr 14, 2026 · Updated 3 months ago
tile-ai / TileFoundry
☆54 · Updated this week
YJMSTR / flash-linear-attention
FLA but cuTile
☆27 · Apr 17, 2026 · Updated 3 months ago
microsoft / RetrievalAttention
[VLDB 26, NeurIPS 25] Scalable long-context LLM decoding that leverages sparsity—by treating the KV cache as a vector storage system.
☆149 · Feb 22, 2026 · Updated 5 months ago
yonsei-hpcp / pid-join
☆12 · May 8, 2025 · Updated last year
alan-hpc / cuda_op_benchmark
An easily extensible framework for understanding and optimizing CUDA operators, intended for learning purposes only
☆18 · Jun 13, 2024 · Updated 2 years ago
KuangjuX / cuda-evolve-oss
Autonomous GPU kernel optimization system driven by AI agents.
☆31 · Mar 29, 2026 · Updated 3 months ago
tile-ai / tilescale
Tile-based language built for AI computation across all scales
☆176 · Jun 16, 2026 · Updated last month
TongmingLAIC / AKO4ALL
Agentic Kernel Optimization for All — automated GPU kernel optimization for any kernel, any hardware, any language
☆328 · May 31, 2026 · Updated last month
UT-InfraAI / cuco
An agent for CUDA compute-communication kernel co-design
☆35 · May 7, 2026 · Updated 2 months ago
iamkanghyunchoi / falqon
Official repository of paper [FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic, NeurIPS 2025]
☆21 · Dec 2, 2025 · Updated 7 months ago
hongsunjang / HILOS
[ASPLOS'26] HILOS: A Cost-Effective Near-Storage Processing Solution for Offline Inference of Long-Context LLMs
☆20 · Jan 18, 2026 · Updated 6 months ago
hao-ai-lab / DistCA
Efficient Long-context Language Model Training by Core Attention Disaggregation
☆106 · Apr 7, 2026 · Updated 3 months ago
igamenovoer / houmao
A framework and CLI toolkit for orchestrating teams of loosely-coupled AI agents.
☆18 · Updated this week
seb-v / amd_challenge_solutions
☆19 · Jun 6, 2025 · Updated last year
AIS-SNU / PathWeaver
A High-Throughput Multi-GPU System for Graph-Based Approximate Nearest Neighbor Search
☆21 · Jul 22, 2025 · Updated last year
zhuzilin / flash-attention-with-sink
☆37 · Aug 7, 2025 · Updated 11 months ago
cornell-zhang / llm-datatypes
Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
☆27 · Jun 25, 2024 · Updated 2 years ago
NVlabs / SOLAR
Speed of Light Analysis for ML Model Runtime
☆108 · Jun 10, 2026 · Updated last month
togethercomputer / ParallelKernelBench
☆44 · Jul 1, 2026 · Updated 3 weeks ago
Dao-AILab / AI-workflow
☆71 · Mar 24, 2026 · Updated 4 months ago
HanGuo97 / hilt
☆40 · Dec 14, 2025 · Updated 7 months ago
BonnieW05 / KernelBenchX
KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels
☆34 · Jun 1, 2026 · Updated last month
TongmingLAIC / AKO4X
Agentic Kernel Optimization — advanced & eXtensible: a closed-loop, campaign-based multi-agent system for optimizing GPU kernels (benchma…
☆61 · May 31, 2026 · Updated last month
aikitoria / nanotrace
Low overhead tracing library and trace visualizer for pipelined CUDA kernels
☆137 · Jul 17, 2026 · Updated last week
uservan / speculative_thinking
☆34 · Oct 13, 2025 · Updated 9 months ago