Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
☆388Jun 2, 2025Updated 11 months ago
Alternatives and similar repositories for sparsezoo
Users that are interested in sparsezoo are comparing it to the libraries listed below.
- Top-level directory for documentation and general content☆120Jun 2, 2025Updated 11 months ago
- ML model optimization product to accelerate inference.☆325Jun 2, 2025Updated 11 months ago
- Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models☆2,143Jun 2, 2025Updated 11 months ago
- Sparsity-aware deep learning inference runtime for CPUs☆3,161Jun 2, 2025Updated 11 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆266Dec 4, 2025Updated 5 months ago
- McPAT modeling framework☆13Oct 18, 2014Updated 11 years ago
- Pytorch distributed backend extension with compression support☆17Mar 24, 2025Updated last year
- Code for "Training Adversarially Robust Sparse Networks via Bayesian Connectivity Sampling" [ICML 2021]☆10Mar 14, 2022Updated 4 years ago
- A model compression and acceleration toolbox based on pytorch.☆331Jan 12, 2024Updated 2 years ago
- Refine high-quality datasets and visual AI models☆10,717May 23, 2026Updated last week
- [TCAD 2021] Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA☆17Jul 7, 2022Updated 3 years ago
- The official PyTorch implementation of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024 paper Hyp²Nav:…☆18Oct 29, 2024Updated last year
- PyTorch implementation for the APoT quantization (ICLR 2020)☆287Dec 11, 2024Updated last year
- Run zero-shot prediction models on your data☆37Dec 19, 2024Updated last year
- A framework that helps developers apply structured pruning to TensorFlow neural network models☆28Nov 7, 2024Updated last year
- ☆10Jul 27, 2020Updated 5 years ago
- Benchmark PyTorch Custom Operators☆14Jul 6, 2023Updated 2 years ago
- Repo for the Naive Bayesian Meetup Group☆11Nov 12, 2021Updated 4 years ago
- Code accompanying the NeurIPS 2020 paper: WoodFisher (Singh & Alistarh, 2020)☆53Mar 8, 2021Updated 5 years ago
- Quantize PyTorch models; supports post-training quantization and quantization-aware training methods☆14Jun 15, 2023Updated 2 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.☆92Nov 23, 2022Updated 3 years ago
- NCNN deployment of the lightweight FastestDet network algorithm☆18Jul 7, 2022Updated 3 years ago
- Smol but mighty language model☆65Apr 4, 2023Updated 3 years ago
- Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".☆883Aug 20, 2024Updated last year
- A curated list of plugins that you can add to your FiftyOne install!☆140May 14, 2026Updated 2 weeks ago
- ☆24Apr 20, 2024Updated 2 years ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".☆280Nov 3, 2023Updated 2 years ago
- ☆18Feb 7, 2024Updated 2 years ago
- Tools for simple inference testing using TensorRT, CUDA and OpenVINO CPU/GPU and CPU providers. Simple Inference Test for ONNX.☆24Sep 7, 2025Updated 8 months ago
- YOLOv5 pruning (SFP Pruning, Network Slimming)☆19Oct 5, 2021Updated 4 years ago
- YOLOv5 in PyTorch > ONNX > CoreML > TFLite☆19Jun 4, 2025Updated 11 months ago
- Sample that converts the MobileSAM encoder/decoder to ONNX and runs inference☆12Apr 11, 2024Updated 2 years ago
- [NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer☆30Dec 6, 2023Updated 2 years ago
- MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as full paper at FPT'23)☆22Apr 17, 2024Updated 2 years ago
- Official implementation of "Searching for Winograd-aware Quantized Networks" (MLSys'20)☆27Oct 3, 2023Updated 2 years ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆39Mar 11, 2024Updated 2 years ago
- ☆10Aug 4, 2020Updated 5 years ago
- ☆97May 10, 2026Updated 2 weeks ago
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Jul 21, 2023Updated 2 years ago