A collection of research papers on efficient training of DNNs
☆69Jul 6, 2022Updated 3 years ago
Alternatives and similar repositories for Awesome-Efficient-Training
Users that are interested in Awesome-Efficient-Training are comparing it to the libraries listed below.
- [ICLR 2021 Spotlight] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yinin…☆31Mar 2, 2024Updated 2 years ago
- ☆11Aug 2, 2024Updated last year
- Implementation of "NITI: Training Integer Neural Networks Using Integer-only Arithmetic" on arxiv☆91Jul 26, 2022Updated 3 years ago
- Implementation of Hyena Hierarchy in JAX☆10Apr 30, 2023Updated 2 years ago
- ☆14Apr 8, 2025Updated last year
- ☆11Oct 11, 2023Updated 2 years ago
- Training with Block Minifloat number representation☆18May 2, 2021Updated 4 years ago
- A novel FPGA-based intent recognition system utilizing deep recurrent neural networks☆27Aug 25, 2021Updated 4 years ago
- Xmixers: A collection of SOTA efficient token/channel mixers☆28Sep 4, 2025Updated 7 months ago
- ☆13Jul 3, 2025Updated 9 months ago
- Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design☆161Dec 18, 2020Updated 5 years ago
- Low Precision Arithmetic Simulation in PyTorch☆290May 20, 2024Updated last year
- This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers".☆120Dec 12, 2021Updated 4 years ago
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.☆460May 15, 2023Updated 2 years ago
- super-resolution; post-training quantization; model compression☆14Nov 10, 2023Updated 2 years ago
- Code and Results for Master Thesis Project on Fixed-point Quantization of Convolutional Neural Networks for Quantized Inference on Embedd…☆12Feb 7, 2021Updated 5 years ago
- Combining SOAP and MUON☆20Feb 11, 2025Updated last year
- Benchmark PyTorch Custom Operators☆14Jul 6, 2023Updated 2 years ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training☆198Dec 22, 2022Updated 3 years ago
- torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for Diffusion Models.☆25Mar 29, 2024Updated 2 years ago
- ☆17Dec 19, 2024Updated last year
- Generate an FPGA design for a TWN☆11Nov 4, 2019Updated 6 years ago
- Paper list for acceleration of transformers☆14Jul 1, 2023Updated 2 years ago
- PyTorch Static Quantization Example☆41Apr 29, 2021Updated 4 years ago
- This repository implements the paper "Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations"☆20Aug 30, 2021Updated 4 years ago
- Code example for the ICLR 2018 oral paper☆150May 31, 2018Updated 7 years ago
- Create High-dynamic-range image☆10Jan 8, 2018Updated 8 years ago
- Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment☆29May 15, 2024Updated last year
- Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions☆207Jun 25, 2020Updated 5 years ago
- ☆11Apr 3, 2023Updated 3 years ago
- A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are co…☆2,350Updated this week
- ☆54May 20, 2024Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated 2 years ago
- ☆64Oct 17, 2023Updated 2 years ago
- ☆19Dec 4, 2025Updated 4 months ago
- Reading seminar in Harvard Cloud Networking and Systems Group☆16Aug 29, 2022Updated 3 years ago
- Activation-aware Singular Value Decomposition for Compressing Large Language Models☆91Oct 22, 2024Updated last year
- [NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network☆74Nov 16, 2020Updated 5 years ago
- ColTraIn HBFP Training Emulator☆16Feb 16, 2023Updated 3 years ago