apple / ml-batchquant
☆23 Updated 3 years ago
Alternatives and similar repositories for ml-batchquant
Users who are interested in ml-batchquant are comparing it to the repositories listed below
- This repository contains the official implementation for the ECCV'22 paper, "SPIN: An Empirical Evaluation on Sharing Parameters of Isotr… ☆20 Updated 2 years ago
- Export utility for unconstrained channel pruned models ☆72 Updated 2 years ago
- ☆19 Updated 4 years ago
- Self-Conditioning Pre-Trained Language Models, ICML 2022 ☆33 Updated 3 years ago
- A light-weight implementation of ICCV2023 paper "Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Rei… ☆83 Updated 2 years ago
- ☆88 Updated last year
- ☆42 Updated 3 years ago
- Repository accompanying the Interspeech 2022 publication titled "Space-Efficient Representation of Entity-centric Query Language Models" … ☆13 Updated 3 years ago
- ☆23 Updated 3 years ago
- Research publication code for "Least Squares Binary Quantization of Neural Networks" ☆82 Updated 2 years ago
- DUET: 2D Structured and Approximately Equivariant Representations, ICML 2023 ☆18 Updated 2 years ago
- Flexible simulator for mixed precision and format simulation of LLMs and vision transformers. ☆51 Updated 2 years ago
- A framework that helps developers apply structured pruning to neural networks in TensorFlow models ☆28 Updated last year
- Utility to test the performance of CoreML models. ☆70 Updated 5 years ago
- A block oriented training approach for inference time optimization. ☆33 Updated last year
- Evaluation Code repository for the paper "ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers". (2023… ☆13 Updated last year
- ptq4vm official repository ☆22 Updated 7 months ago
- Efficient in-memory representation for ONNX, in Python ☆32 Updated last week
- Prototype routines for GPU quantization written using PyTorch. ☆21 Updated 3 months ago
- AI Edge Quantizer: flexible post training quantization for LiteRT models. ☆78 Updated this week
- Repository for CPU Kernel Generation for LLM Inference ☆27 Updated 2 years ago
- ☆45 Updated last year
- ACL 2023 ☆39 Updated 2 years ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆110 Updated last year
- PB-LLM: Partially Binarized Large Language Models ☆156 Updated 2 years ago
- torchvision-based transforms that provide access to parameterization ☆15 Updated 9 months ago
- Neural Architecture Search for Neural Network Libraries ☆60 Updated last year
- Dynamic Neural Architecture Search Toolkit ☆31 Updated 11 months ago
- Blazing fast training of 🤗 Transformers on Graphcore IPUs ☆85 Updated last year
- Official PyTorch implementation of "Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming" (ICML'23) ☆13 Updated last year