Fast sparse deep learning on CPUs
☆56 · Sep 28, 2022 · Updated 3 years ago
Alternatives and similar repositories for sparsednn
Users who are interested in sparsednn compare it with the libraries listed below.
- Official implementation of the NeurIPS 2020 paper "Sparse Weight Activation Training". ☆29 · Jul 23, 2021 · Updated 4 years ago
- CAKE Library for constant-bandwidth matrix multiplication on CPUs ☆14 · Apr 6, 2024 · Updated last year
- TiledKernel is a code generation library based on macro kernels and a memory hierarchy graph data structure. ☆19 · May 12, 2024 · Updated last year
- ☆11 · Apr 3, 2023 · Updated 2 years ago
- Implementation of the ACL Findings paper "OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack" ☆10 · May 24, 2021 · Updated 4 years ago
- ☆27 · Oct 25, 2021 · Updated 4 years ago
- Multiple 1-stencil implementations using NVIDIA CUDA. ☆13 · Dec 2, 2017 · Updated 8 years ago
- ☆24 · May 9, 2025 · Updated 10 months ago
- Muon fsdp 2 ☆55 · Aug 8, 2025 · Updated 7 months ago
- General Purpose Graphics Processing Unit (GPGPU) IP Core ☆11 · Jul 4, 2014 · Updated 11 years ago
- Aligntune: A Modular Toolkit for Post-Training Alignment of LLMs ☆35 · Feb 26, 2026 · Updated last week
- Jittor code for Line Drawings for Face Portraits from Photos using Global and Local Structure based GANs (TPAMI) ☆14 · Apr 19, 2021 · Updated 4 years ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆107 · Jun 28, 2025 · Updated 8 months ago
- ☆13 · Jul 5, 2023 · Updated 2 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆38 · Dec 10, 2015 · Updated 10 years ago
- ☆15 · Dec 16, 2021 · Updated 4 years ago
- MLPruning, PyTorch, NLP, BERT, Structured Pruning ☆20 · Jun 29, 2021 · Updated 4 years ago
- ☆16 · May 11, 2022 · Updated 3 years ago
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com ☆38 · Dec 1, 2023 · Updated 2 years ago
- Quantization-aware training package for NCNN on PyTorch ☆68 · Jul 27, 2021 · Updated 4 years ago
- ☆19 · Aug 26, 2021 · Updated 4 years ago
- Sparse-dense matrix-matrix multiplication on GPUs ☆14 · Oct 15, 2018 · Updated 7 years ago
- Educational C++ code used at SIGGRAPH 2016 Course "Physically Based Sound for Computer Animation and Virtual Environments" ☆17 · May 26, 2018 · Updated 7 years ago
- [DEPRECATED] Use https://github.com/akemimadoka/Cafe instead ☆37 · Nov 18, 2018 · Updated 7 years ago
- ☆38 · Jun 27, 2025 · Updated 8 months ago
- Fundamental Sources for Water Wave Animation ☆20 · Dec 8, 2022 · Updated 3 years ago
- This repository contains the results and code for the MLPerf™ Inference v2.1 benchmark. ☆18 · Jul 24, 2025 · Updated 7 months ago
- Simple dependency injection framework for Python ☆21 · May 15, 2024 · Updated last year
- Slides on how to give a technical talk ☆22 · Feb 8, 2021 · Updated 5 years ago
- End-to-end steps for adding custom ops in PyTorch. ☆24 · Aug 20, 2020 · Updated 5 years ago
- Manually implemented quantization-aware training ☆23 · Oct 12, 2022 · Updated 3 years ago
- Playing with GEMM in TVM ☆92 · Jul 22, 2023 · Updated 2 years ago
- ☆166 · Jul 22, 2024 · Updated last year
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration ☆199 · Apr 27, 2022 · Updated 3 years ago
- Evaluating different memory managers for dynamic GPU memory ☆26 · Dec 16, 2020 · Updated 5 years ago
- Fast matrix multiplication for few-bit integer matrices on CPUs. ☆28 · Mar 19, 2019 · Updated 6 years ago
- CUDA project for a university course ☆26 · Oct 26, 2020 · Updated 5 years ago
- Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning" ☆25 · Feb 24, 2023 · Updated 3 years ago
- XuanTie vendor extension Instruction Set spec ☆44 · May 30, 2025 · Updated 9 months ago