neuralmagic / docs
Top-level directory for documentation and general content
☆121 Updated 3 months ago
Alternatives and similar repositories for docs:
Users who are interested in docs compare it to the libraries listed below:
- Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes ☆382 Updated 8 months ago
- ML model optimization product to accelerate inference. ☆326 Updated 11 months ago
- Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models ☆2,118 Updated 7 months ago
- Sparsity-aware deep learning inference runtime for CPUs ☆3,117 Updated 8 months ago
- A research library for PyTorch-based neural network pruning, compression, and more. ☆160 Updated 2 years ago
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. ☆1,037 Updated 11 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆262 Updated 5 months ago
- Recipes are a standard, well supported set of blueprints for machine learning engineers to rapidly train models using the latest research… ☆310 Updated this week
- End-to-end training of sparse deep neural networks with little-to-no performance loss. ☆319 Updated 2 years ago
- A library for researching neural network compression and acceleration methods. ☆141 Updated 6 months ago
- SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX R… ☆2,355 Updated this week
- Implementation of a Transformer, but completely in Triton ☆260 Updated 2 years ago
- An open-source efficient deep learning framework/compiler, written in Python. ☆691 Updated 3 weeks ago
- Fast sparse deep learning on CPUs ☆52 Updated 2 years ago
- A library to analyze PyTorch traces. ☆348 Updated last week
- PyTorch library to facilitate development and standardized evaluation of neural network pruning methods. ☆428 Updated last year
- Prune a model while finetuning or training. ☆400 Updated 2 years ago
- A GPU performance profiling tool for PyTorch models ☆505 Updated 3 years ago
- Library for 8-bit optimizers and quantization routines. ☆717 Updated 2 years ago
- Code for Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021) ☆485 Updated last year
- A collection of metrics to profile a single deep learning model or compare two different deep learning models ☆26 Updated last year
- Block-sparse primitives for PyTorch ☆154 Updated 3 years ago
- PyTorch interface for the IPU ☆177 Updated last year
- A CPU+GPU profiling library that provides access to timeline traces and hardware performance counters. ☆780 Updated this week
- TF2 implementation of knowledge distillation using the "function matching" hypothesis from https://arxiv.org/abs/2106.05237. ☆87 Updated 3 years ago
- Neural Architecture Search (NAS) papers with code ☆157 Updated 3 years ago
- [ICLR 2020] Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks ☆137 Updated 4 years ago
- Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research. ☆474 Updated 4 months ago
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆235 Updated last year
- ☆202 Updated 2 years ago