A collection of research papers on efficient training of DNNs
☆69 · Jul 6, 2022 · Updated 3 years ago
Alternatives and similar repositories for Awesome-Efficient-Training
Users that are interested in Awesome-Efficient-Training are comparing it to the libraries listed below.
- [ICLR 2021 Spotlight] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yinin… ☆31 · Mar 2, 2024 · Updated 2 years ago
- ☆11 · Aug 2, 2024 · Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆25 · Jun 6, 2024 · Updated last year
- Implementation of "NITI: Training Integer Neural Networks Using Integer-only Arithmetic" on arxiv ☆91 · Jul 26, 2022 · Updated 3 years ago
- Implementation of Hyena Hierarchy in JAX ☆10 · Apr 30, 2023 · Updated 3 years ago
- Training with Block Minifloat number representation ☆18 · May 2, 2021 · Updated 5 years ago
- A novel FPGA-based intent recognition system utilizing deep recurrent neural networks ☆26 · Aug 25, 2021 · Updated 4 years ago
- Xmixers: A collection of SOTA efficient token/channel mixers ☆28 · Sep 4, 2025 · Updated 8 months ago
- ☆13 · Jul 3, 2025 · Updated 10 months ago
- Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design ☆161 · Dec 18, 2020 · Updated 5 years ago
- Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm. ☆10 · Jan 24, 2019 · Updated 7 years ago
- Low Precision Arithmetic Simulation in PyTorch ☆289 · May 20, 2024 · Updated 2 years ago
- This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers". ☆120 · Dec 12, 2021 · Updated 4 years ago
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆462 · May 15, 2023 · Updated 3 years ago
- super-resolution; post-training quantization; model compression ☆14 · Nov 10, 2023 · Updated 2 years ago
- Combining SOAP and MUON ☆22 · Feb 11, 2025 · Updated last year
- Benchmark PyTorch Custom Operators ☆14 · Jul 6, 2023 · Updated 2 years ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training ☆198 · Dec 22, 2022 · Updated 3 years ago
- torch_quantizer is an out-of-the-box quantization tool for PyTorch models on CUDA backend, specially optimized for Diffusion Models. ☆25 · Mar 29, 2024 · Updated 2 years ago
- ☆17 · Dec 19, 2024 · Updated last year
- Generate an FPGA design for a TWN ☆11 · Nov 4, 2019 · Updated 6 years ago
- Paper list for acceleration of transformers ☆14 · Jul 1, 2023 · Updated 2 years ago
- This repository implements the paper "Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations" ☆20 · Aug 30, 2021 · Updated 4 years ago
- Code example for the ICLR 2018 oral paper ☆150 · May 31, 2018 · Updated 7 years ago
- Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions ☆208 · Jun 25, 2020 · Updated 5 years ago
- ☆11 · Apr 3, 2023 · Updated 3 years ago
- A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are co… ☆2,383 · May 11, 2026 · Updated 2 weeks ago
- ☆54 · May 20, 2024 · Updated 2 years ago
- ☆14 · Nov 7, 2025 · Updated 6 months ago
- ☆65 · Oct 17, 2023 · Updated 2 years ago
- ☆19 · Dec 4, 2025 · Updated 5 months ago
- ☆224 · Feb 21, 2023 · Updated 3 years ago
- Activation-aware Singular Value Decomposition for Compressing Large Language Models ☆92 · Oct 22, 2024 · Updated last year
- [NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network ☆74 · Nov 16, 2020 · Updated 5 years ago
- Code for the paper "Faster Neural Network Training with Approximate Tensor Operations" ☆10 · Oct 23, 2021 · Updated 4 years ago
- The (open-source part of) code to reproduce "BPPSA: Scaling Back-propagation by Parallel Scan Algorithm". ☆13 · Jun 7, 2021 · Updated 4 years ago
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆131 · Jun 27, 2023 · Updated 2 years ago
- ☆15 · Oct 26, 2022 · Updated 3 years ago
- [ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture… ☆25 · Oct 1, 2022 · Updated 3 years ago