BoyuanFeng / APNN-TC
☆19Aug 26, 2021Updated 4 years ago
Alternatives and similar repositories for APNN-TC
Users who are interested in APNN-TC are comparing it to the libraries listed below.
- ☆32Aug 24, 2022Updated 3 years ago
- Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations"☆40Nov 16, 2021Updated 4 years ago
- ☆50Jun 27, 2019Updated 6 years ago
- Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels☆14Aug 26, 2015Updated 10 years ago
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.☆53Oct 16, 2023Updated 2 years ago
- ☆13Jan 23, 2021Updated 5 years ago
- Experiments evaluating preemption on the NVIDIA Pascal architecture☆17Nov 10, 2016Updated 9 years ago
- POC implementation of "Accelerating HE Operations Using Key Decomposition"[KLSS23]☆18Jun 11, 2025Updated 8 months ago
- ☆23Jan 7, 2022Updated 4 years ago
- ☆22Feb 18, 2025Updated 11 months ago
- ngAP's artifact for ASPLOS'24☆25Jul 29, 2025Updated 6 months ago
- High performance RDMA-based distributed feature collection component for training GNN model on EXTREMELY large graph☆56Jul 3, 2022Updated 3 years ago
- Implementation of the Winograd algorithm.☆24Nov 6, 2018Updated 7 years ago
- ☆26Aug 19, 2022Updated 3 years ago
- Assembler for NVIDIA Volta and Turing GPUs☆238Jan 13, 2022Updated 4 years ago
- this is the release repository of superneurons☆54Feb 13, 2021Updated 5 years ago
- A library of GPU kernels for sparse matrix operations.☆283Nov 24, 2020Updated 5 years ago
- ☆112Jul 3, 2021Updated 4 years ago
- Prefetching and efficient data path for memory disaggregation☆71Jul 16, 2020Updated 5 years ago
- Collections of model quantization algorithms. Any issues, please contact Peng Chen (blueardour@gmail.com)☆73Oct 7, 2021Updated 4 years ago
- Efficient Sparse-Winograd Convolutional Neural Networks (ICLR 2018)☆193May 7, 2019Updated 6 years ago
- ☆33Sep 9, 2020Updated 5 years ago
- Advanced data flow management for distributed Python applications☆36Jan 27, 2026Updated 2 weeks ago
- HPA2021 solution (3rd place)☆10Oct 13, 2021Updated 4 years ago
- Face Verification Example with Flower / Federated Learning☆12Apr 3, 2023Updated 2 years ago
- A Python implementation of the Hopfield network used to solve the traveling salesman problem☆10Apr 11, 2019Updated 6 years ago
- Convolutional Channel-wise Competitive Learning for the Forward-Forward Algorithm. AAAI 2024☆11Jun 27, 2024Updated last year
- The SEAL-CPU backend is a Reference backend engine for HEBench which is a shared library that implements the required functions specified…☆11Mar 3, 2023Updated 2 years ago
- FPGA and GPU acceleration of LeNet5☆35Jul 9, 2019Updated 6 years ago
- TLB Benchmarks☆35Sep 11, 2017Updated 8 years ago
- Fine-grained GPU sharing primitives☆148Jul 28, 2025Updated 6 months ago
- Dorylus: Affordable, Scalable, and Accurate GNN Training☆76May 31, 2021Updated 4 years ago
- pytorch fixed point training tool/framework☆34Oct 14, 2020Updated 5 years ago
- Unified Sparse Library Wrapper Based on cuSPARSE☆12May 24, 2022Updated 3 years ago
- [HPCA 2022] GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design☆39Mar 30, 2022Updated 3 years ago
- RTL implementation of TFlite FPGA accelerator and RISC-V controller. 3D Object Detection based on LiDAR Point Clouds.☆15Mar 12, 2023Updated 2 years ago
- ☆40Feb 28, 2020Updated 5 years ago
- Implementation of the TFHE homomorphic encryption scheme.☆12May 14, 2021Updated 4 years ago
- Datacenter simulation toolkit for the OpenDC project☆10Aug 24, 2020Updated 5 years ago