GustavoStahl / CASS
CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark
☆27Updated 2 months ago
Alternatives and similar repositories for CASS
Users who are interested in CASS are comparing it to the repositories listed below.
- Artifact evaluation of PLDI'24 paper "Allo: A Programming Model for Composable Accelerator Design"☆28Updated last year
- HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs☆35Updated 8 months ago
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)☆53Updated last year
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch☆38Updated 5 months ago
- Tile-based language built for AI computation across all scales☆48Updated this week
- WaferLLM: Large Language Model Inference at Wafer Scale☆49Updated last month
- Artifacts of EVT ASPLOS'24☆26Updated last year
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows☆61Updated last year
- ☆27Updated 2 months ago
- ☆19Updated 11 months ago
- ☆17Updated 5 months ago
- ☆28Updated 5 months ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆27Updated 8 months ago
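- GPGPU-Sim code with Chinese annotations: contains the latest GPGPU-Sim simulator code, annotated in Chinese to help Chinese-speaking users better understand and use the simulator.☆23Updated 8 months ago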
- ☆55Updated 3 months ago
- PTX-EMU is a simple emulator for CUDA programs.☆34Updated 4 months ago
- Asynchronous semantics for architectural simulation and synthesis.☆48Updated 3 weeks ago
- ☆105Updated this week
- H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference☆53Updated 4 months ago
- Domain-Specific Architecture Generator 2☆21Updated 2 years ago
- Artifact for "DX100: A Programmable Data Access Accelerator for Indirection (ISCA 2025)" paper☆13Updated 4 months ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators☆115Updated 2 years ago
- Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning"☆26Updated 2 years ago
- Artifact for paper "PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference", ASPLOS 2025☆87Updated 4 months ago
- EQueue Dialect☆41Updated 3 years ago
- ☆181Updated last year
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs☆57Updated 5 months ago
- GPGPU-SIM usage guide☆14Updated 2 years ago
- ☆14Updated 3 years ago
- ArchExplorer: Microarchitecture Exploration Via Bottleneck Analysis☆34Updated last year