adelmanm / approx
Code for the paper "Faster Neural Network Training with Approximate Tensor Operations"
☆10 · Updated 3 years ago
Alternatives and similar repositories for approx
Users interested in approx are comparing it to the repositories listed below.
- Block Sparse movement pruning ☆79 · Updated 4 years ago
- MLPruning, PyTorch, NLP, BERT, Structured Pruning ☆20 · Updated 3 years ago
- Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021). ☆58 · Updated 3 years ago
- Code release to reproduce ASHA experiments from "Random Search and Reproducibility for NAS." ☆22 · Updated 5 years ago
- [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion ☆40 · Updated 4 years ago
- [NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya… ☆140 · Updated 3 years ago
- PyTorch library for factorized L0-based pruning. ☆45 · Updated last year
- PyTorch implementation of HashedNets ☆36 · Updated 2 years ago
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha…) ☆68 · Updated 3 years ago
- ICLR 2021 ☆48 · Updated 4 years ago
- Parameter Efficient Transfer Learning with Diff Pruning ☆73 · Updated 4 years ago
- Official implementation of the NeurIPS 2020 paper "Sparse Weight Activation Training". ☆27 · Updated 3 years ago
- This package implements THOR: Transformer with Stochastic Experts. ☆63 · Updated 3 years ago
- [ICML 2021 Oral] "CATE: Computation-aware Neural Architecture Encoding with Transformers" by Shen Yan, Kaiqiang Song, Fei Liu, Mi Zhang ☆19 · Updated 3 years ago
- [ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen… ☆28 · Updated last year
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, … ☆18 · Updated 3 years ago
- ☆41 · Updated 3 years ago
- [NeurIPS 2021] "Stronger NAS with Weaker Predictors", Junru Wu, Xiyang Dai, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Ye Yu, Zhangyang W… ☆27 · Updated 2 years ago
- Best practices for testing advanced Mixtral, DeepSeek, and Qwen series MoE models using Megatron Core MoE. ☆17 · Updated this week
- ☆22 · Updated 4 years ago
- Codebase for the paper "A Gradient Flow Framework for Analyzing Network Pruning" ☆21 · Updated 4 years ago
- Code for "Sanity-Checking Pruning Methods: Random Tickets Can Win the Jackpot" ☆42 · Updated 4 years ago
- [NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks ☆60 · Updated 2 years ago
- Code for ICML 2021 submission ☆34 · Updated 4 years ago
- ☆33 · Updated 4 years ago
- Code accompanying the NeurIPS 2020 paper: WoodFisher (Singh & Alistarh, 2020) ☆52 · Updated 4 years ago
- Single-shot neural network pruning before training the model, based on connection sensitivity ☆11 · Updated 5 years ago
- [ICLR 2022] "Sparsity Winning Twice: Better Robust Generalization from More Efficient Training" by Tianlong Chen*, Zhenyu Zhang*, Pengjun… ☆39 · Updated 3 years ago
- ☆22 · Updated 2 years ago
- Survey of Model Compression ☆23 · Updated 3 years ago