A PyTorch implementation of the LSTM experiments in the paper "Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity"
☆47 Feb 7, 2020 Updated 6 years ago
Alternatives and similar repositories for why-clipping-accelerates
Users who are interested in why-clipping-accelerates are comparing it to the libraries listed below
- Official codebase for our paper "Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks" ☆12 Jun 30, 2021 Updated 4 years ago
- ☆69 Mar 4, 2020 Updated 6 years ago
- ☆12 Jul 7, 2021 Updated 4 years ago
- [ECCV18] Constraint-Aware Deep Neural Network Compression ☆12 Sep 11, 2018 Updated 7 years ago
- An empirical investigation of deep learning theory ☆16 Oct 3, 2019 Updated 6 years ago
- Implementation of the paper "Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory", Ron Amit and Ron Meir, ICML 2018 ☆18 Apr 13, 2021 Updated 4 years ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Jun 11, 2025 Updated 8 months ago
- batchboost is a variation on MixUp that, instead of mixing just two images, mixes many images together. ☆44 Jan 26, 2020 Updated 6 years ago
- Source code for our AAAI'20 paper: Adaptive Factorization Network: Learning Adaptive-Order Feature Interactions ☆38 Sep 22, 2020 Updated 5 years ago
- ☆14 May 7, 2019 Updated 6 years ago
- ☆20 Oct 3, 2019 Updated 6 years ago
- Code for paper "Adversarial Support Alignment" ☆23 Apr 22, 2022 Updated 3 years ago
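- A BERT-based implementation of pre-trained language models, split into two steps: pre-training and fine-tuning. It currently includes three models (BERT, Roberta, and ALbert), all of which support Whole Word Mask mode. ☆17 Feb 1, 2020 Updated 6 years ago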
- Code of "Visualizing and Understanding Object Detector" ☆20 Jun 24, 2021 Updated 4 years ago
- ☆16 Sep 4, 2018 Updated 7 years ago
- Deep Learning using Rectified Linear Units (ReLU) ☆23 Aug 2, 2024 Updated last year
- ☆25 Jun 11, 2023 Updated 2 years ago
- A Unified, Systematic Framework of Structured Weight Pruning for DNNs ☆22 Aug 3, 2018 Updated 7 years ago
- Code release to accompany paper "Geometry-Aware Gradient Algorithms for Neural Architecture Search." ☆25 Oct 7, 2020 Updated 5 years ago
- [CVPR2019] Using semantic information to help self-supervised monocular depth prediction ☆21 May 6, 2019 Updated 6 years ago
- Active Learning for Improved Semi Supervised Semantic Segmentation in Satellite Images ☆23 Mar 4, 2022 Updated 4 years ago
- [CVPR2019] NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction ☆56 Jan 7, 2020 Updated 6 years ago
- Official PyTorch implementation of "Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets" (ICLR 2023 notable top 25%) ☆26 Mar 18, 2024 Updated last year
- Cheap distillation for convolutional neural networks. ☆35 Oct 22, 2018 Updated 7 years ago
- Related material on Federated Learning ☆26 Apr 9, 2020 Updated 5 years ago
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation ☆27 Nov 7, 2019 Updated 6 years ago
- Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation" ☆23 Mar 4, 2020 Updated 6 years ago
- This is the official repo for the experiments in the paper "Bilevel Programming for Hyperparameter Optimization and Meta-Learning" ☆29 Jun 7, 2018 Updated 7 years ago
- SparseMax activation function implementation (ICML 2016) (PyTorch) ☆28 Nov 30, 2017 Updated 8 years ago
- A PyTorch implementation of various Online & Stochastic optimization algorithms for deep learning ☆27 Feb 4, 2022 Updated 4 years ago
- This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron … ☆33 Jun 14, 2023 Updated 2 years ago
- PyTorch implementation of our paper accepted by IEEE TNNLS, 2022 -- Distilling a Powerful Student Model via Online Knowledge Distillation ☆31 Nov 11, 2021 Updated 4 years ago
- ☆27 Feb 10, 2022 Updated 4 years ago
- Welcome to NeuCo-Bench, a benchmarking framework for evaluating compressed embeddings on downstream tasks. ☆25 Feb 24, 2026 Updated 2 weeks ago
- Code for BlockSwap (ICLR 2020). ☆33 Mar 25, 2021 Updated 4 years ago
- Implementation of the article "Adapting Auxiliary Losses Using Gradient Similarity" ☆33 Mar 1, 2019 Updated 7 years ago
- openapi of all third-party ☆10 Updated this week
- Code for reproducing work of ICML 2019 paper: Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Appli… ☆12 Jun 8, 2019 Updated 6 years ago
- Style Transfer by Rigid Alignment in Neural Net Feature Space ☆11 Jan 23, 2021 Updated 5 years ago