xshaun / sc22-ae
☆14Updated 2 months ago
Alternatives and similar repositories for sc22-ae
Users who are interested in sc22-ae are comparing it to the libraries listed below.
- ☆35Updated 3 months ago
- ☆14Updated 4 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo☆17Updated 2 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Updated 6 years ago
- An Attention Superoptimizer☆22Updated 11 months ago
- ☆10Updated 2 years ago
- RPCNIC: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator [HPCA2025]☆13Updated last year
- ☆22Updated 10 months ago
- ☆13Updated 2 years ago
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)☆56Updated last year
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.☆58Updated last year
- ☆21Updated 3 years ago
- ☆25Updated 2 years ago
- ETHZ Heterogeneous Accelerated Compute Cluster.☆38Updated 3 months ago
- Code released to accompany the ISCA paper: "T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware"☆28Updated 3 years ago
- ☆31Updated 3 years ago
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]☆25Updated last year
- ☆35Updated 6 months ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.☆14Updated last year
- Exploring CXL on QEMU Emulation☆32Updated 10 months ago
- ☆15Updated last year
- Official repository for "IPDPS' 24 QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices".☆20Updated last year
- Artifact of ASPLOS'23 paper entitled: GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference☆19Updated 2 years ago
- ☆25Updated 3 years ago
- ☆15Updated 3 years ago
- ☆40Updated 3 years ago
- Linux source code for ISCA 2020 paper "Enhancing and Exploiting Contiguity for Fast Memory Virtualization"☆20Updated 5 years ago
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS☆33Updated 11 months ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.☆19Updated last year
- ☆19Updated 4 years ago