A collection of research papers on efficient training of DNNs
☆69Jul 6, 2022Updated 3 years ago
Alternatives and similar repositories for Awesome-Efficient-Training
Users that are interested in Awesome-Efficient-Training are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ICLR 2021 Spotlight] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yinin…☆31Mar 2, 2024Updated 2 years ago
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)☆25Jun 6, 2024Updated 2 years ago
- Implementation of "NITI: Training Integer Neural Networks Using Integer-only Arithmetic" on arxiv☆92Jul 26, 2022Updated 3 years ago
- Implementation of Hyena Hierarchy in JAX☆10Apr 30, 2023Updated 3 years ago
- ☆14Apr 8, 2025Updated last year
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- ☆11Oct 11, 2023Updated 2 years ago
- Training with Block Minifloat number representation☆18May 2, 2021Updated 5 years ago
- Approximate layers - TensorFlow extension☆27Apr 14, 2025Updated last year
- A novel FPGA-based intent recognition systemutilizing deep recurrent neural networks☆26Aug 25, 2021Updated 4 years ago
- Xmixers: A collection of SOTA efficient token/channel mixers☆28Sep 4, 2025Updated 9 months ago
- ☆13Jul 3, 2025Updated 11 months ago
- Neural Network Quantization & Low-Bit Fixed Point Training For Hardware-Friendly Algorithm Design☆161Dec 18, 2020Updated 5 years ago
- Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm.☆10Jan 24, 2019Updated 7 years ago
- Low Precision Arithmetic Simulation in PyTorch☆289May 20, 2024Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers".☆120Dec 12, 2021Updated 4 years ago
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.☆461May 15, 2023Updated 3 years ago
- super-resolution; post-training quantization; model compression☆14Nov 10, 2023Updated 2 years ago
- Combining SOAP and MUON☆22Feb 11, 2025Updated last year
- Benchmark PyTorch Custom Operators☆14Jul 6, 2023Updated 2 years ago
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training☆199Dec 22, 2022Updated 3 years ago
- torch_quantizer is a out-of-box quantization tool for PyTorch models on CUDA backend, specially optimized for Diffusion Models.☆25Mar 29, 2024Updated 2 years ago
- ☆17Dec 19, 2024Updated last year
- Generate an FPGA design for a TWN☆11Nov 4, 2019Updated 6 years ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Paper list for accleration of transformers☆14Jul 1, 2023Updated 2 years ago
- PyTorch Static Quantization Example☆41Apr 29, 2021Updated 5 years ago
- This repository implements the paper "Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations"☆20Aug 30, 2021Updated 4 years ago
- Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment☆29May 15, 2024Updated 2 years ago
- Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions☆210Jun 25, 2020Updated 5 years ago
- ☆11Apr 3, 2023Updated 3 years ago
- A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are co…☆2,395May 11, 2026Updated last month
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated 2 years ago
- ☆14Nov 7, 2025Updated 7 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- ☆65Oct 17, 2023Updated 2 years ago
- ☆19Dec 4, 2025Updated 6 months ago
- ☆225Feb 21, 2023Updated 3 years ago
- Reading seminar in Harvard Cloud Networking and Systems Group☆16Aug 29, 2022Updated 3 years ago
- Activation-aware Singular Value Decomposition for Compressing Large Language Models☆91Oct 22, 2024Updated last year
- ColTraIn HBFP Training Emulator☆15Feb 16, 2023Updated 3 years ago
- Code for the paper "Faster Neural Network Training with Approximate Tensor Operations"☆10Oct 23, 2021Updated 4 years ago