CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
☆297 Nov 3, 2025 Updated 5 months ago
Alternatives and similar repositories for CUDA-L1
Users that are interested in CUDA-L1 are comparing it to the libraries listed below.
- ☆94 Nov 22, 2025 Updated 5 months ago
- Automated bottleneck detection and solution orchestration ☆20 Feb 24, 2026 Updated 2 months ago
- Official Repo of CudaForge ☆76 Dec 2, 2025 Updated 4 months ago
- EuroSys '24: "Trinity: A Fast Compressed Multi-attribute Data Store" ☆18 Mar 8, 2025 Updated last year
- AI integration plugin for Unreal Engine 5 using Large Language Models ☆26 May 11, 2025 Updated 11 months ago
- ☆27 Mar 4, 2025 Updated last year
- ☆11 Mar 11, 2025 Updated last year
- TensorRT Acceleration for PyTorch Native Eager Mode Quantization Models ☆17 Jul 22, 2024 Updated last year
- ☆19 Feb 25, 2024 Updated 2 years ago
- A Triton-only attention backend for vLLM ☆25 Mar 17, 2026 Updated last month
- Source code for paper Are Human-generated Demonstrations Necessary for In-context Learning ☆12 Jan 21, 2024 Updated 2 years ago
- This repository provides a tutorial that discusses running a sample publisher and subscriber using multiple transports of point_cloud_trans… ☆11 Mar 17, 2026 Updated last month
- Source code for Activated LoRA ☆25 Nov 22, 2025 Updated 5 months ago
- Autonomous GPU Kernel Generation & Optimization via Deep Agents ☆384 Updated this week
- Automating analysis from trace files ☆66 Updated this week
- Adapter and benchmark hub for solid-state LiDAR across LIO/LVIO/SLAM, with robust handling for small-FoV short-range and degenerate scena… ☆29 Feb 8, 2026 Updated 2 months ago
- Official Repository for Task-Circuit Quantization ☆25 Jun 1, 2025 Updated 10 months ago
- DFloat11 [NeurIPS '25]: Lossless Compression of LLMs and DiTs for Efficient GPU Inference ☆622 Nov 24, 2025 Updated 5 months ago
- ☆94 Oct 30, 2025 Updated 6 months ago
- ☆65 Jul 14, 2025 Updated 9 months ago
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms" ☆29 Oct 23, 2025 Updated 6 months ago
- TurboQuant for GGML: 4.57x KV Cache Compression with 72K+ Context for Llama-3.3-70B on Consumer GPUs. … ☆42 Mar 28, 2026 Updated last month
- A benchmark of real-world DL kernel problems ☆181 Apr 15, 2026 Updated 2 weeks ago
- IntrinsiX: High-Quality PBR Generation using Image Priors ☆58 Apr 11, 2026 Updated 2 weeks ago
- Seamless Voice Interactions with LLMs ☆12 Oct 28, 2023 Updated 2 years ago
- ☆23 Aug 26, 2024 Updated last year
- ☆20 Mar 25, 2025 Updated last year
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD ☆33 Jul 1, 2025 Updated 9 months ago
- 🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex… ☆40 Oct 20, 2025 Updated 6 months ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆361 Jun 23, 2025 Updated 10 months ago
- ☆51 Sep 3, 2025 Updated 7 months ago
- The official code release for Q#: Provably Optimal Distributional RL for LLM Post-Training ☆19 Mar 4, 2025 Updated last year
- [ECCV2024] Immunizing text-to-image Models against Malicious Adaptation ☆18 Jan 17, 2025 Updated last year
- Domain-specific framework for performance analysis of parallel programs ☆25 Mar 23, 2026 Updated last month
- An extension to the GaLore paper, to perform Natural Gradient Descent in low rank subspace ☆18 Oct 21, 2024 Updated last year
- Code and data for paper "(How) do Language Models Track State?" ☆22 Mar 31, 2025 Updated last year
- PyTorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training ☆39 Jun 20, 2025 Updated 10 months ago
- Mirage Persistent Kernel: Compiling LLMs into a MegaKernel ☆2,218 Apr 19, 2026 Updated last week
- A thorough survey of SLAM in dynamic environments: covering topics of front-end odometry / loop-closure / mapping, single-session / long-… ☆16 May 25, 2024 Updated last year