[NeurIPS 2023] The PyTorch Implementation of Scheduled (Stable) Weight Decay.
☆61Feb 3, 2024Updated 2 years ago
Alternatives and similar repositories for stable-weight-decay-regularization
Users that are interested in stable-weight-decay-regularization are comparing it to the libraries listed below.
- [ICML 2021] The official PyTorch Implementations of Positive-Negative Momentum Optimizers.☆28Aug 30, 2022Updated 3 years ago
- [Neural Computation, MIT Press] The PyTorch Implementation of Variable Optimizers/ Neural Variable Risk Minimization proposed in our Neur…☆33Aug 3, 2021Updated 4 years ago
- [ICML 2022, Oral] The PyTorch Implementation of Adaptive Inertia Methods. The algorithms are based on our paper: "Adaptive Inertia: Dise…☆151Feb 17, 2023Updated 3 years ago
- This is a list of peer-reviewed representative papers on deep learning dynamics (optimization dynamics of neural networks). The success o…☆298Apr 10, 2024Updated 2 years ago
- AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning. AAAI, 2023.☆30Sep 29, 2023Updated 2 years ago
- Benchmark tests supporting the TiledCUDA library.☆18Nov 19, 2024Updated last year
- Official Code for ICLR2022 Paper: Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap☆28Sep 28, 2025Updated 7 months ago
- notes for NJU courses☆18Oct 26, 2021Updated 4 years ago
- ☆20Jan 5, 2025Updated last year
- A 20M RWKV v6 can do nonogram☆13Oct 18, 2024Updated last year
- ☆14Dec 20, 2022Updated 3 years ago
- Unofficial Pytorch implementation of the paper Filter Response Normalization.☆19Dec 9, 2019Updated 6 years ago
- IROS☆18Aug 10, 2025Updated 8 months ago
- ☆13May 2, 2026Updated last week
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation☆19Nov 28, 2022Updated 3 years ago
- Unofficial Implementation of Null-text Inversion (https://arxiv.org/abs/2211.09794)☆12Nov 20, 2022Updated 3 years ago
- An AI Agent using MoonBit☆13Nov 29, 2024Updated last year
- An implementation of Anthropic's paper and essay on "A statistical approach to model evaluations"☆16Oct 6, 2025Updated 7 months ago
- Python implementation for paper: Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples☆11Jun 12, 2018Updated 7 years ago
- ☆12Jul 15, 2020Updated 5 years ago
- Unsupervised Domain Adaptation on Graphs☆15Apr 6, 2022Updated 4 years ago
- [Re] Can gradient clipping mitigate label noise? (ML Reproducibility Challenge 2020)☆14Sep 3, 2024Updated last year
- Code for CDC2022 submission☆11Sep 1, 2025Updated 8 months ago
- I used to drink coffee a lot. But oolong has become my favorite recently. Hope you like it too.🫖☆12Sep 13, 2021Updated 4 years ago
- This framework implements key experiments on the sparse double descent phenomenon (ICML 2022).☆15Dec 13, 2022Updated 3 years ago
- An object detection codebase based on MegEngine.☆28Dec 14, 2022Updated 3 years ago
- Code release for "Dropout Reduces Underfitting"☆316May 6, 2023Updated 3 years ago
- PyTorch code for our paper "Binarized Dual Residual Network for 3D Whole-body Human Mesh Recovery"☆15Dec 2, 2023Updated 2 years ago
- Crawl4DeepSeek = Crawl4AI + DeepSeek 🚀 Smart, efficient, and built for deep web exploration! 🌐🤖☆18Feb 9, 2025Updated last year
- SPATL: Salient Parameter Aggregation and Transfer Learning for Heterogeneous Federated Learning☆24Nov 17, 2022Updated 3 years ago
- Official code for the paper "FairerCLIP: Debiasing CLIP’s Zero-Shot Predictions using Functions in RKHSs".☆16Oct 14, 2025Updated 6 months ago
- Performs tasks together with GPT.☆13Apr 4, 2023Updated 3 years ago
- ☆20Mar 7, 2018Updated 8 years ago
- The implementation of "Learning Disentangled Semantic Representation for Domain Adaptation" (IJCAI 2019)☆20Aug 26, 2019Updated 6 years ago
- [ICML 2022 Spotlight] Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks☆11May 21, 2023Updated 2 years ago
- Demo for METSC: A microstructure estimation Transformer inspired by sparse representation for diffusion MRI (MedIA 2023).☆12Nov 13, 2023Updated 2 years ago
- Official Code for ICLR 2023 Paper: A Message Passing Perspective on Learning Dynamics of Contrastive Learning☆11Mar 9, 2023Updated 3 years ago
- Using the pvanet framework to train mobilenet-v2 for object detection; paper: https://arxiv.org/abs/1611.08588☆13Feb 13, 2019Updated 7 years ago