neuralmagic / sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
☆377 · Updated 6 months ago
Alternatives and similar repositories for sparsezoo:
Users interested in sparsezoo are comparing it to the libraries listed below.
- ML model optimization product to accelerate inference. ☆322 · Updated 9 months ago
- Top-level directory for documentation and general content. ☆120 · Updated 2 months ago
- Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models. ☆2,096 · Updated 5 months ago
- Sparsity-aware deep learning inference runtime for CPUs. ☆3,088 · Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs. ☆257 · Updated 3 months ago
- An open-source efficient deep learning framework/compiler, written in Python. ☆672 · Updated last week
- Prune a model while fine-tuning or training. ☆398 · Updated 2 years ago
- Neural Network Compression Framework for enhanced OpenVINO™ inference. ☆968 · Updated this week
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆235 · Updated last year
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. ☆1,019 · Updated 9 months ago
- A research library for PyTorch-based neural network pruning, compression, and more. ☆160 · Updated 2 years ago
- End-to-end training of sparse deep neural networks with little-to-no performance loss. ☆317 · Updated 2 years ago
- Accelerate PyTorch models with ONNX Runtime. ☆357 · Updated 4 months ago
- Library for 8-bit optimizers and quantization routines. ☆717 · Updated 2 years ago
- Recipes are a standard, well-supported set of blueprints for machine learning engineers to rapidly train models using the latest research… ☆302 · Updated this week
- A library for researching neural network compression and acceleration methods. ☆139 · Updated 5 months ago
- Fast sparse deep learning on CPUs. ☆52 · Updated 2 years ago
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆312 · Updated this week
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models. ☆1,320 · Updated 6 months ago
- A library to analyze PyTorch traces. ☆325 · Updated this week
- Curated list of awesome material on optimization techniques to make artificial intelligence faster and more efficient 🚀 ☆113 · Updated last year
- Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory. ☆431 · Updated 5 months ago
- A PyTorch quantization backend for Optimum. ☆870 · Updated 2 weeks ago
- FasterAI: Prune and distill your models with fastai and PyTorch. ☆246 · Updated last week
- ☆197 · Updated 3 years ago
- Actively maintained ONNX Optimizer. ☆662 · Updated this week
- PyTorch library to facilitate development and standardized evaluation of neural network pruning methods. ☆427 · Updated last year
- A GPU performance profiling tool for PyTorch models. ☆500 · Updated 3 years ago
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime. ☆352 · Updated this week
- Common utilities for ONNX converters. ☆257 · Updated last month