☆10 · Apr 24, 2023 · Updated 2 years ago
Alternatives and similar repositories for Unified-Convolution-Framework
Users that are interested in Unified-Convolution-Framework are comparing it to the libraries listed below
- FLOPS counter for all your GPU benchmarking needs ☆13 · Aug 8, 2024 · Updated last year
- GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU ☆24 · Mar 27, 2025 · Updated 11 months ago
- Experimental GPU language with meta-programming ☆27 · Sep 6, 2024 · Updated last year
- A Python based programming system for heterogeneous computing ☆25 · Apr 29, 2025 · Updated 10 months ago
- Eloquent is a Typora theme designed for writing technical books. It's minimal and easy to use! ☆13 · Sep 25, 2021 · Updated 4 years ago
- ☆32 · Aug 24, 2022 · Updated 3 years ago
- Kite: Architecture Simulator for RISC-V Instruction Set ☆20 · Jan 2, 2025 · Updated last year
- Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction. ☆13 · Nov 3, 2023 · Updated 2 years ago
- ☆46 · Jun 19, 2024 · Updated last year
- ☆22 · Aug 14, 2024 · Updated last year
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆92 · Nov 23, 2022 · Updated 3 years ago
- ☆13 · Dec 23, 2025 · Updated 2 months ago
- Huawei collective communication performance test ☆15 · May 27, 2024 · Updated last year
- ☆13 · Nov 25, 2019 · Updated 6 years ago
- A direct convolution library targeting ARM multi-core CPUs. ☆12 · Nov 27, 2024 · Updated last year
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs. ☆55 · Oct 16, 2023 · Updated 2 years ago
- Implementation of various equivariant models in JAX ☆12 · Apr 12, 2024 · Updated last year
- Torch-native C++/CUDA library to accelerate tensor-product layers in MLIPs ☆55 · Nov 26, 2025 · Updated 3 months ago
- Minimal repository to demonstrate fast LoRA inference with Flux family of models. ☆28 · Jul 23, 2025 · Updated 7 months ago
- ☆19 · Sep 10, 2025 · Updated 6 months ago
- PaiNN in jax ☆11 · Jan 14, 2025 · Updated last year
- Yet another Polyhedra Compiler for DeepLearning ☆19 · Apr 14, 2023 · Updated 2 years ago
- Automatic Bootstrapping Management Compiler for FHE ☆25 · Mar 26, 2025 · Updated 11 months ago
- Artifact repository for paper Automatic Generation of High-Performance Quantized Machine Learning Kernels ☆17 · Oct 13, 2020 · Updated 5 years ago
- ☆50 · Jun 27, 2019 · Updated 6 years ago
- Contributions to Playwright for .NET 🎭🧪 ☆12 · Nov 20, 2023 · Updated 2 years ago
- devector and batch_deque containers for C++. See more at: http://erenon.hu/double_ended ☆15 · Oct 7, 2017 · Updated 8 years ago
- GPU implementation of Winograd convolution ☆10 · Oct 23, 2017 · Updated 8 years ago
- Show effects of over-subscription and ways to fix that ☆16 · Aug 15, 2024 · Updated last year
- ☆14 · Nov 29, 2011 · Updated 14 years ago
- The code repository of DGCNN on FPGA: Acceleration of The Point Cloud Classifier Using FPGAs ☆17 · Mar 6, 2023 · Updated 3 years ago
- Fast GPU based tensor core reductions ☆13 · Jan 13, 2023 · Updated 3 years ago
- Benchmarking scripts for Gaia ☆14 · Apr 10, 2025 · Updated 11 months ago
- Performant kernels for symmetric tensors ☆16 · Aug 22, 2024 · Updated last year
- ☆13 · Jun 2, 2024 · Updated last year
- Tensor Parallelism with JAX + Shard Map ☆11 · Sep 29, 2023 · Updated 2 years ago
- A terminal-based citation generator ☆14 · Jan 21, 2023 · Updated 3 years ago
- Source code of the IPDPS '21 paper: "TileSpMV: A Tiled Algorithm for Sparse Matrix-Vector Multiplication on GPUs" by Yuyao Niu, Zhengyang… ☆12 · Aug 12, 2022 · Updated 3 years ago
- E(n) Equivariant GNN in jax ☆14 · Aug 31, 2023 · Updated 2 years ago