neuralmagic / sparsezoo
Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
☆382 · Updated 9 months ago
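The "highly sparse" models this repository hosts are typically produced by magnitude pruning. As a minimal, library-agnostic sketch of that idea (this is illustrative only and does not use the sparsezoo API — the function name and shapes are made up for the example):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
pruned = magnitude_prune(w, sparsity=0.9)
print(f"achieved sparsity: {np.mean(pruned == 0):.2f}")
```

Real sparsification recipes apply this kind of mask gradually over training epochs rather than in one shot, but the thresholding step is the same.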
Alternatives and similar repositories for sparsezoo:
Users interested in sparsezoo are comparing it to the libraries listed below.
- ML model optimization product to accelerate inference. ☆326 · Updated last year
- Top-level directory for documentation and general content. ☆120 · Updated 4 months ago
- Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models. ☆2,124 · Updated 8 months ago
- Sparsity-aware deep learning inference runtime for CPUs. ☆3,133 · Updated 9 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs. ☆262 · Updated 6 months ago
- An open-source efficient deep learning framework/compiler, written in Python. ☆698 · Updated last month
- Blazing fast training of 🤗 Transformers on Graphcore IPUs. ☆85 · Updated last year
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. ☆1,040 · Updated last year
- Fast sparse deep learning on CPUs. ☆53 · Updated 2 years ago
- Library for 8-bit optimizers and quantization routines. ☆716 · Updated 2 years ago
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆235 · Updated last year
- Prune a model while finetuning or training. ☆402 · Updated 2 years ago
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th… ☆386 · Updated this week
- Implementation of a Transformer, but completely in Triton. ☆263 · Updated 3 years ago
- Accelerate PyTorch models with ONNX Runtime. ☆359 · Updated last month
- A research library for PyTorch-based neural network pruning, compression, and more. ☆160 · Updated 2 years ago
- An implementation of the transformer architecture as an Nvidia CUDA kernel. ☆179 · Updated last year
- PyTorch interface for the IPU. ☆179 · Updated last year
- Training material for IPU users: tutorials, feature examples, simple applications. ☆86 · Updated 2 years ago
- This repository contains the experimental PyTorch native float8 training UX. ☆223 · Updated 8 months ago
- Neural Network Compression Framework for enhanced OpenVINO™ inference. ☆998 · Updated this week
- ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python. ☆337 · Updated this week
- Fast low-bit matmul kernels in Triton. ☆288 · Updated this week
- onnxruntime-extensions: A specialized pre- and post-processing library for ONNX Runtime. ☆373 · Updated last week
- A repository for log-time feedforward networks. ☆221 · Updated last year
- Fast Block Sparse Matrices for PyTorch. ☆545 · Updated 4 years ago
- Lite Inference Toolkit (LIT) for PyTorch. ☆161 · Updated 3 years ago
- A code generator from ONNX to PyTorch code. ☆136 · Updated 2 years ago
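Several of the entries above (the 8-bit optimizer library, DiffQ, the Model Compression Toolkit) revolve around quantization. As a minimal, library-agnostic sketch of the common core — symmetric per-tensor int8 weight quantization — assuming nothing about any particular library's API:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: one scale for the whole tensor."""
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(size=(32, 32)).astype(np.float32)
q, scale = quantize_int8(w)
# Rounding bounds the reconstruction error by half a quantization step.
err = float(np.max(np.abs(w - dequantize_int8(q, scale))))
print(f"max abs error: {err:.4f} (half-step bound: {scale / 2:.4f})")
```

Production toolkits refine this with per-channel scales, calibration on real activations, and quantization-aware training, but the scale-round-clip structure is the same.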