ucamrl / xrlflow
☆12 Updated last year
Related projects:
- ☆20 Updated last year
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24) ☆35 Updated 3 months ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆58 Updated last year
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆34 Updated 6 months ago
- Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs ☆21 Updated 2 months ago
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs. ☆44 Updated 11 months ago
- PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization ☆18 Updated 6 months ago
- Sparse kernels for GNNs based on TVM ☆14 Updated 3 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆18 Updated last year
- one-shot-tuner ☆8 Updated last year
- ☆12 Updated 2 years ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations" ☆39 Updated 2 years ago
- Distributed Multi-GPU GNN Framework ☆35 Updated 4 years ago
- Repo for the IISWC 2018 submission ☆9 Updated 2 years ago
- A simulation framework for modeling efficiency of Graph Neural Network Dataflows ☆17 Updated last year
- ☆15 Updated last week
- ☆9 Updated 6 months ago
- Artifact of ASPLOS'23 paper entitled: GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference ☆16 Updated last year
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆13 Updated 5 years ago
- ☆14 Updated 2 years ago
- ☆17 Updated last year
- ☆71 Updated 3 years ago
- ☆19 Updated last year
- ☆15 Updated 2 years ago
- ☆25 Updated 2 years ago
- Code base for OOPSLA'24 paper: UniSparse: An Intermediate Language for General Sparse Format Customization ☆28 Updated 3 months ago
- PIM-ML is a benchmark for training machine learning algorithms on the UPMEM architecture, which is the first publicly-available real-worl… ☆15 Updated last year
- ☆30 Updated 3 months ago
- ☆14 Updated 4 months ago
- ICCAD'23 Best Paper Award candidate: Robust GNN-based Representation Learning for HLS ☆9 Updated 3 months ago