CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
☆294Nov 3, 2025Updated 4 months ago
Alternatives and similar repositories for CUDA-L1
Users who are interested in CUDA-L1 are comparing it to the repositories listed below.
- ☆91Nov 22, 2025Updated 3 months ago
- Automated bottleneck detection and solution orchestration☆19Feb 24, 2026Updated 3 weeks ago
- EuroSys '24: "Trinity: A Fast Compressed Multi-attribute Data Store"☆19Mar 8, 2025Updated last year
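- Negative online public opinion about the 2025 candidates for academician of the Academy of Sciences and the Academy of Engineering☆39Sep 6, 2025Updated 6 months ago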
- AI integration plugin for Unreal Engine 5 using Large Language Models☆24May 11, 2025Updated 10 months ago
- ☆11Mar 11, 2025Updated last year
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning☆484Jan 8, 2026Updated 2 months ago
- Paper-reading notes for Berkeley OS prelim exam.☆14Aug 28, 2024Updated last year
- TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models☆17Jul 22, 2024Updated last year
- A Triton-only attention backend for vLLM☆24Feb 11, 2026Updated last month
- ☆139Aug 18, 2025Updated 7 months ago
- A lightweight, general-purpose framework for evaluating GPU kernel correctness and performance.☆45Updated this week
- Source code for paper Are Human-generated Demonstrations Necessary for In-context Learning☆12Jan 21, 2024Updated 2 years ago
- Source code for Activated LoRA☆24Nov 22, 2025Updated 3 months ago
- Autonomous GPU Kernel Generation & Optimization via Deep Agents☆309Mar 10, 2026Updated last week
- This repository provides a tutorial that discusses running a sample publisher and subscriber using multiple transports of point_cloud_trans…☆10Updated this week
- CodeEvolve is an open-source evolutionary coding agent for algorithm discovery and optimization.☆65Updated this week
- Automating analysis from trace files☆63Mar 13, 2026Updated last week
- Adapter and benchmark hub for solid-state LiDAR across LIO/LVIO/SLAM, with robust handling for small-FoV short-range and degenerate scena…☆28Feb 8, 2026Updated last month
- DFloat11 [NeurIPS '25]: Lossless Compression of LLMs and DiTs for Efficient GPU Inference☆615Nov 24, 2025Updated 3 months ago
- Official Repository for Task-Circuit Quantization☆24Jun 1, 2025Updated 9 months ago
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)☆869Mar 9, 2026Updated last week
- tbb, gpu things for robotics☆13Sep 30, 2024Updated last year
- ☆91Oct 30, 2025Updated 4 months ago
- Code for data-aware compression of DeepSeek models☆71Dec 11, 2025Updated 3 months ago
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms"☆27Oct 23, 2025Updated 4 months ago
- learn TensorRT from scratch🥰☆17Sep 29, 2024Updated last year
- ☆64Jul 14, 2025Updated 8 months ago
- NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits (ICML'25)☆43Jul 9, 2025Updated 8 months ago
- The official code release for Q#: Provably Optimal Distributional RL for LLM Post-Training☆18Mar 4, 2025Updated last year
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache☆140Aug 13, 2025Updated 7 months ago
- IntrinsiX: High-Quality PBR Generation using Image Priors☆52Dec 8, 2025Updated 3 months ago
- Seamless Voice Interactions with LLMs☆12Oct 28, 2023Updated 2 years ago
- ☆23Aug 26, 2024Updated last year
- ☆20Mar 25, 2025Updated 11 months ago
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD☆33Jul 1, 2025Updated 8 months ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling.☆361Jun 23, 2025Updated 8 months ago
- 🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex…☆41Oct 20, 2025Updated 5 months ago
- ☆47Sep 3, 2025Updated 6 months ago