Horizontal Fusion
☆24 · Jan 7, 2022 · Updated 4 years ago
Alternatives and similar repositories for HFuse
Users who are interested in HFuse are comparing it to the libraries listed below.
- ☆18 · Mar 4, 2025 · Updated last year
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆15 · Jun 21, 2019 · Updated 6 years ago
- GPU Performance Advisor ☆66 · Jul 25, 2022 · Updated 3 years ago
- An efficient concurrent graph processing system ☆46 · Oct 27, 2021 · Updated 4 years ago
- An Optimizing Compiler for Recommendation Model Inference ☆26 · Jun 5, 2025 · Updated 9 months ago
- Evaluating different memory managers for dynamic GPU memory ☆26 · Dec 16, 2020 · Updated 5 years ago
- Build your own S3-Select in 400 lines of Rust! ☆14 · Mar 23, 2025 · Updated 11 months ago
- Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels ☆14 · Aug 26, 2015 · Updated 10 years ago
- Unleash the performance potential of your Parquet files. ☆44 · Feb 24, 2026 · Updated last week
- ☆14 · Apr 24, 2024 · Updated last year
- A pattern-based algorithmic autotuner for graph processing on GPUs. ☆32 · Jun 25, 2025 · Updated 8 months ago
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (FPGA 2025 Best Paper Nominee) ☆59 · Feb 24, 2026 · Updated last week
- A tool for examining GPU scheduling behavior. ☆95 · Aug 17, 2024 · Updated last year
- ☆33 · Sep 9, 2020 · Updated 5 years ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆59 · Oct 3, 2022 · Updated 3 years ago
- Spack package repository maintained by Student Cluster Competition Team @ Sun Yat-sen University. ☆16 · Aug 20, 2025 · Updated 6 months ago
- GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving ☆20 · Jul 30, 2025 · Updated 7 months ago
- ☆31 · Updated this week
- [HPCA 2022] GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆39 · Mar 30, 2022 · Updated 3 years ago
- ☆48 · Jul 13, 2024 · Updated last year
- ANT-ACE: Advanced Compiler Ecosystem for Fully Homomorphic Encryption and Domain Specific Computing ☆56 · Updated this week
- ☆19 · Nov 21, 2022 · Updated 3 years ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆121 · Oct 26, 2022 · Updated 3 years ago
- Rebuild YatSenOS On RISC-V 64. ☆22 · Jan 6, 2022 · Updated 4 years ago
- ☆20 · Sep 28, 2024 · Updated last year
- ☆17 · Jan 24, 2024 · Updated 2 years ago
- ☆81 · Nov 16, 2020 · Updated 5 years ago
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆41 · Mar 17, 2024 · Updated last year
- ☆48 · Jan 30, 2026 · Updated last month
- ☆26 · Oct 6, 2023 · Updated 2 years ago
- ☆50 · Jun 27, 2019 · Updated 6 years ago
- Opara is a lightweight and resource-aware DNN Operator parallel scheduling framework to accelerate the execution of DNN inference on GPUs… ☆23 · Dec 19, 2024 · Updated last year
- ngAP's artifact for ASPLOS'24 ☆25 · Jul 29, 2025 · Updated 7 months ago
- ☆26 · Dec 22, 2024 · Updated last year
- Source code of the SC '23 paper: "DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multipli… ☆28 · Jun 18, 2024 · Updated last year
- UniSparse: An Intermediate Language for General Sparse Format Customization (OOPSLA'24) ☆33 · Nov 12, 2024 · Updated last year
- Source code of the simulator used in the Mosaic paper from MICRO 2017: "Mosaic: A GPU Memory Manager with Application-Transparent Support… ☆50 · Aug 21, 2018 · Updated 7 years ago
- ☆28 · Aug 14, 2024 · Updated last year
- Source code for the paper "Profile Guided Optimization without Profiles: A Machine Learning Approach" ☆26 · Dec 30, 2021 · Updated 4 years ago