☆243 · Nov 9, 2022 · Updated 3 years ago
Alternatives and similar repositories for NM-sparsity
Users that are interested in NM-sparsity are comparing it to the libraries listed below.
- ☆19 · Dec 10, 2021 · Updated 4 years ago
- Code for ICML 2021 submission ☆35 · Mar 24, 2021 · Updated 5 years ago
- Pytorch implementation of our paper accepted by NeurIPS 2022 -- Learning Best Combination for Efficient N:M Sparsity ☆22 · Jan 13, 2023 · Updated 3 years ago
- ☆10 · Feb 1, 2022 · Updated 4 years ago
- ☆30 · Nov 23, 2023 · Updated 2 years ago
- Pytorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training" ☆13 · Jun 7, 2023 · Updated 2 years ago
- Efficient 2:4 sparse training algorithms and implementations ☆59 · Dec 8, 2024 · Updated last year
- Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot". ☆878 · Aug 20, 2024 · Updated last year
- Code for "Accelerating Transformer Pre-training with 2:4 Sparsity" ☆27 · Dec 8, 2024 · Updated last year
- Code for "Fast Sparse ConvNets" CVPR2020 submissions ☆12 · Nov 20, 2019 · Updated 6 years ago
- Spartan is an algorithm for training sparse neural network models. This repository accompanies the paper "Spartan Differentiable Sparsity… ☆25 · Oct 31, 2022 · Updated 3 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆144 · Mar 31, 2023 · Updated 3 years ago
- Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020 ☆21 · Nov 15, 2020 · Updated 5 years ago
- To appear in the 11th International Conference on Learning Representations (ICLR 2023). ☆18 · Feb 24, 2023 · Updated 3 years ago
- ☆28 · Oct 21, 2020 · Updated 5 years ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 18, 2026 · Updated last month
- This repo contains the code for studying the interplay between quantization and sparsity methods ☆26 · Feb 26, 2025 · Updated last year
- Implementation of NM sparsity recipe presented in the paper "Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers". ☆11 · Feb 5, 2024 · Updated 2 years ago
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆460 · May 15, 2023 · Updated 2 years ago
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs ☆64 · Mar 25, 2025 · Updated last year
- An implementation of <Group Fisher Pruning for Practical Network Compression> based on pytorch and mmcv ☆18 · Nov 21, 2021 · Updated 4 years ago
- Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity" ☆81 · Jul 7, 2025 · Updated 9 months ago
- Domain-Specific Architecture Generator 2 ☆23 · Oct 2, 2022 · Updated 3 years ago
- An open-sourced PyTorch library for developing energy efficient multiplication-less models and applications. ☆14 · Feb 3, 2025 · Updated last year
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆130 · Jul 11, 2023 · Updated 2 years ago
- Post-training sparsity-aware quantization ☆34 · Feb 26, 2023 · Updated 3 years ago
- A simple cycle-accurate DaDianNao simulator ☆13 · Mar 27, 2019 · Updated 7 years ago
- Code accompanying the NeurIPS 2020 paper: WoodFisher (Singh & Alistarh, 2020) ☆53 · Mar 8, 2021 · Updated 5 years ago
- Reorder-based post-training quantization for large language model ☆199 · May 17, 2023 · Updated 2 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆1,637 · Jul 12, 2024 · Updated last year
- A simple and effective LLM pruning approach. ☆862 · Aug 9, 2024 · Updated last year
- Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022. ☆138 · Apr 28, 2022 · Updated 3 years ago
- ☆120 · Nov 17, 2023 · Updated 2 years ago
- Pruning Filter in Filter(NeurIPS2020) ☆148 · Mar 3, 2024 · Updated 2 years ago
- ☆157 · Jun 22, 2023 · Updated 2 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆92 · Nov 23, 2022 · Updated 3 years ago
- To appear in the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021). ☆31 · Aug 18, 2021 · Updated 4 years ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Apr 21, 2025 · Updated last year
- An efficient GPU support for LLM inference with x-bit quantization (e.g. FP6,FP5). ☆278 · Jul 16, 2025 · Updated 9 months ago