Artifacts of EVT ASPLOS'24
☆30 Mar 6, 2024 Updated 2 years ago
Alternatives and similar repositories for EVT_AE
Users that are interested in EVT_AE are comparing it to the libraries listed below.
- ☆10 May 12, 2022 Updated 4 years ago
- ☆121 May 16, 2025 Updated last year
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom'2022] ☆19 Aug 4, 2022 Updated 3 years ago
- ☆29 Feb 26, 2023 Updated 3 years ago
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch ☆41 Mar 27, 2025 Updated last year
- ☆17 Jan 24, 2024 Updated 2 years ago
- Cute layout visualization ☆39 Jan 18, 2026 Updated 4 months ago
- ☆267 Jul 11, 2024 Updated last year
- ☆105 May 31, 2025 Updated 11 months ago
- Run OpenCL program on MOBILE GPU (Qualcomm & ARM) ! ☆18 Jun 27, 2018 Updated 7 years ago
- FlexAttention w/ FlashAttention3 Support ☆27 Oct 5, 2024 Updated last year
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆42 Nov 16, 2021 Updated 4 years ago
- OpenDNN: An Open-source, cuDNN-like Deep Learning Primitive Library ☆29 Dec 9, 2019 Updated 6 years ago
- parser script to process pytorch autograd profiler result, convert json file to excel. ☆15 Oct 8, 2019 Updated 6 years ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust. ☆13 Nov 23, 2024 Updated last year
- A list of tutorials, paper, talks, and open-source projects for emerging compiler and architecture ☆526 Jan 15, 2025 Updated last year
- Fast GPU based tensor core reductions ☆13 Jan 13, 2023 Updated 3 years ago
- CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels ☆33 Mar 15, 2021 Updated 5 years ago
- Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication ☆15 Mar 25, 2024 Updated 2 years ago
- Shared Middle-Layer for Triton Compilation ☆333 Dec 5, 2025 Updated 5 months ago
- ☆26 Dec 5, 2022 Updated 3 years ago
- Enable everyone to develop, optimize and deploy AI models natively on everyone's devices. ☆10 Aug 19, 2023 Updated 2 years ago
- development repository for the open earth compiler ☆82 Feb 19, 2021 Updated 5 years ago
- Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Y… ☆47 May 22, 2024 Updated 2 years ago
- Cleanlab Vizzy: illustrating the core ideas behind the Cleanlab algorithm ☆16 Apr 19, 2023 Updated 3 years ago
- Artifact for OSDI'21 GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs. ☆71 Mar 2, 2023 Updated 3 years ago
- ☆24 May 9, 2025 Updated last year
- ☆18 Aug 9, 2025 Updated 9 months ago
- A low-cost, high-performance deep learning training framework that enables efficient 100B-scale model fine-tuning on a commodity server w… ☆23 Mar 21, 2025 Updated last year
- [ICML 2025] Adaptive Self-improvement LLM Agentic System for ML Library Development ☆17 Jan 6, 2026 Updated 4 months ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure. ☆19 May 12, 2024 Updated 2 years ago
- ☆47 Jun 18, 2024 Updated last year
- Benchmark tests supporting the TiledCUDA library. ☆19 Nov 19, 2024 Updated last year
- UniSparse: An Intermediate Language for General Sparse Format Customization (OOPSLA'24) ☆33 Nov 12, 2024 Updated last year
- Mallacc: Accelerating Memory Allocation ☆13 Jan 2, 2018 Updated 8 years ago
- Official code repository for the papers "Anti-Symmetric DGN: a stable architecture for Deep Graph Networks" accepted at ICLR 2023; "Non-D… ☆15 Jan 2, 2025 Updated last year
- LLVM Call Graph ☆28 Mar 5, 2021 Updated 5 years ago
- FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Data on GPUs ☆14 Sep 26, 2023 Updated 2 years ago
- ☆121 May 19, 2025 Updated last year