A collection of research papers on efficient training of DNNs
☆69Jul 6, 2022Updated 3 years ago
Alternatives and similar repositories for Awesome-Efficient-Training
Users that are interested in Awesome-Efficient-Training are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ICLR 2021 Spotlight] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yinin…☆31Mar 2, 2024Updated 2 years ago
- ☆11Aug 2, 2024Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)☆24Jun 6, 2024Updated last year
- Implementation of "NITI: Training Integer Neural Networks Using Integer-only Arithmetic" on arxiv☆90Jul 26, 2022Updated 3 years ago
- Implementation of Hyena Hierarchy in JAX☆10Apr 30, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- ☆11Oct 11, 2023Updated 2 years ago
- Training with Block Minifloat number representation☆18May 2, 2021Updated 4 years ago
- Approximate layers - TensorFlow extension☆27Apr 14, 2025Updated 11 months ago
- Xmixers: A collection of SOTA efficient token/channel mixers☆28Sep 4, 2025Updated 6 months ago
- ☆13Jul 3, 2025Updated 8 months ago
- Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design☆161Dec 18, 2020Updated 5 years ago
- Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm.☆10Jan 24, 2019Updated 7 years ago
- Low Precision Arithmetic Simulation in PyTorch☆291May 20, 2024Updated last year
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.☆454May 15, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Code and Results for Master Thesis Project on Fixed-point Quantization of Convolutional Neural Networks for Quantized Inference on Embedd…☆12Feb 7, 2021Updated 5 years ago
- Combining SOAP and MUON☆19Feb 11, 2025Updated last year
- Benchmark PyTorch Custom Operators☆14Jul 6, 2023Updated 2 years ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training☆199Dec 22, 2022Updated 3 years ago
- torch_quantizer is a out-of-box quantization tool for PyTorch models on CUDA backend, specially optimized for Diffusion Models.☆25Mar 29, 2024Updated 2 years ago
- Generate an FPGA design for a TWN☆11Nov 4, 2019Updated 6 years ago
- Paper list for accleration of transformers☆14Jul 1, 2023Updated 2 years ago
- PyTorch Static Quantization Example☆41Apr 29, 2021Updated 4 years ago
- This repository implements the paper "Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations"☆20Aug 30, 2021Updated 4 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment☆29May 15, 2024Updated last year
- Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions☆207Jun 25, 2020Updated 5 years ago
- ☆11Apr 3, 2023Updated 2 years ago
- A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are co…☆2,334Jan 29, 2026Updated 2 months ago
- ☆54May 20, 2024Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated last year
- ☆14Nov 7, 2025Updated 4 months ago
- ☆64Oct 17, 2023Updated 2 years ago
- ☆19Dec 4, 2025Updated 3 months ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- ☆223Feb 21, 2023Updated 3 years ago
- Reading seminar in Harvard Cloud Networking and Systems Group☆16Aug 29, 2022Updated 3 years ago
- Activation-aware Singular Value Decomposition for Compressing Large Language Models☆91Oct 22, 2024Updated last year
- [NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network☆74Nov 16, 2020Updated 5 years ago
- Code for the paper "Faster Neural Network Training with Approximate Tensor Operations"☆10Oct 23, 2021Updated 4 years ago
- ColTraIn HBFP Training Emulator☆16Feb 16, 2023Updated 3 years ago
- The (open-source part of) code to reproduce "BPPSA: Scaling Back-propagation by Parallel Scan Algorithm".☆13Jun 7, 2021Updated 4 years ago