RangeKing / tuning_playbook_zh-CN
A playbook for systematically maximizing the performance of deep learning models.
☆25Updated 10 months ago
Alternatives and similar repositories for tuning_playbook_zh-CN:
Users who are interested in tuning_playbook_zh-CN are comparing it to the repositories listed below
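- PyTorch model training code for single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of the training speed and GPU memory usage of the different methods☆98Updated last year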
- ☆38Updated 2 months ago
- The official repo for CVPR2023 highlight paper "Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization".☆82Updated last year
- Yet another PyTorch Trainer and some core components for deep learning.☆217Updated 11 months ago
- A Tight-fisted Optimizer☆47Updated 2 years ago
- A more lightweight PyTorch experiment management library!☆65Updated 2 years ago
- The implementation of the mixup and manifold mixup methods with standard models (PreActRes, WideRes, Dense) in Cifar10, Cifar100 and SVHN datas…☆45Updated 3 years ago
- deep learning template code☆65Updated 10 months ago
- State Space Models☆69Updated 11 months ago
- Lion and Adam optimization comparison☆61Updated 2 years ago
- Code release for "Flowformer: Linearizing Transformers with Conservation Flows" (ICML 2022), https://arxiv.org/pdf/2202.06258.pdf☆320Updated 9 months ago
- ☆23Updated 2 years ago
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation"☆20Updated 4 months ago
- Keras implementation of Finite Scalar Quantization☆71Updated last year
- Implementation of Denoising Diffusion Probabilistic Model in MindSpore☆35Updated 2 years ago
- Awesome Colab Projects Collection☆26Updated last year
- A template for rapid deployment of PyTorch models.☆65Updated 2 years ago
- Moved to https://github.com/NUS-HPC-AI-Lab/InfoBatch☆6Updated last year
- ☆52Updated last year
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o…☆43Updated 4 months ago
- [ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)☆121Updated last year
- ☆161Updated this week
- ☆189Updated last year
- A bag of tricks to speed up your deep learning process☆160Updated 11 months ago
- ☆48Updated last year
- 蜻蜓点论文 by Think不Clear: paper walkthrough videos uploaded to Bilibili, YouTube, and Xigua Video (synced to Douyin)☆243Updated last year
- differentiable top-k operator☆21Updated 3 months ago
- Implementation of "Attention Is Off By One" by Evan Miller☆190Updated last year
- A torch-based implementation of K-Means and K-Means++☆17Updated 4 years ago
- Codes For Sharing☆39Updated 4 years ago