ML model optimization product to accelerate inference.
☆326 Jun 2, 2025 Updated 9 months ago
Alternatives and similar repositories for sparsify
Users who are interested in sparsify are comparing it to the libraries listed below.
- Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes ☆387 Jun 2, 2025 Updated 9 months ago
- Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models ☆2,144 Jun 2, 2025 Updated 9 months ago
- Top-level directory for documentation and general content ☆120 Jun 2, 2025 Updated 9 months ago
- Sparsity-aware deep learning inference runtime for CPUs ☆3,163 Jun 2, 2025 Updated 9 months ago
- ONNX model visualizer ☆88 Jun 28, 2023 Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 Dec 4, 2025 Updated 3 months ago
- A set of simple tools for splitting, merging, OP deletion, size compression, rewriting attributes and constants, OP generation, change op… ☆303 Apr 22, 2024 Updated last year
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron` ☆12 Jun 22, 2022 Updated 3 years ago
- ☆13 Feb 10, 2026 Updated 3 weeks ago
- Simple script to re-rank images using OpenAI's CLIP https://github.com/openai/CLIP. ☆15 May 3, 2021 Updated 4 years ago
- Simple tool to change the INPUT and OUTPUT shape of ONNX. ☆15 Apr 1, 2025 Updated 11 months ago
- SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, … ☆2,592 Updated this week
- ☆25 Sep 19, 2025 Updated 5 months ago
- Tutorial on how to convert machine learned models into ONNX ☆15 Mar 11, 2023 Updated 2 years ago
- Compare Savant and PyTorch performance ☆13 Feb 9, 2024 Updated 2 years ago
- A collection of optimizers, some arcane others well known, for Flax. ☆29 Aug 6, 2021 Updated 4 years ago
- Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains ☆1,731 Oct 8, 2023 Updated 2 years ago
- Simple tool to combine (merge) ONNX models. Simple Network Combine Tool for ONNX. ☆18 Oct 8, 2025 Updated 4 months ago
- Using Rust with Python ☆18 Aug 19, 2023 Updated 2 years ago
- A model compression and acceleration toolbox based on pytorch. ☆333 Jan 12, 2024 Updated 2 years ago
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 ☆1,688 Oct 23, 2024 Updated last year
- Open Source Compiler Framework using ONNX as Frontend and IR ☆33 Aug 17, 2022 Updated 3 years ago
- A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB,… ☆17 Feb 24, 2026 Updated last week
- Multi-class probabilistic classification using inductive and cross Venn–Abers predictors ☆50 Jun 22, 2022 Updated 3 years ago
- AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ☆4,706 Feb 27, 2026 Updated last week
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model… ☆37 Oct 18, 2023 Updated 2 years ago
- PyTorch Lightning + Hydra + Timm: A very user-friendly template for rapid and reproducible MLOps with best practices. ⚡🔥⚡ ☆17 Mar 3, 2023 Updated 3 years ago
- Implementations of growing and pruning in neural networks ☆22 Jul 26, 2023 Updated 2 years ago
- ☆21 Mar 18, 2021 Updated 4 years ago
- A general and accurate MACs / FLOPs profiler for PyTorch models ☆635 Jul 29, 2025 Updated 7 months ago
- Code for generating the JuICe dataset. ☆37 Oct 27, 2021 Updated 4 years ago
- AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. ☆2,565 Updated this week
- Accessible large language models via k-bit quantization for PyTorch. ☆8,019 Updated this week
- Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot". ☆872 Aug 20, 2024 Updated last year
- Implementation of the paper "You Only Learn One Representation: Unified Network for Multiple Tasks" (https://arxiv.org/abs/2105.04206) ☆2,009 Nov 3, 2024 Updated last year
- Efficient few-shot learning with Sentence Transformers ☆2,688 Dec 11, 2025 Updated 2 months ago
- My journey during 10 weeks of building FiftyOne plugins ☆22 Nov 12, 2023 Updated 2 years ago
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆10,356 Feb 20, 2026 Updated 2 weeks ago
- ONNX Optimizer ☆798 Updated this week