Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural network models (and their initializations) to make them easier to train.
☆80Jun 10, 2026Updated last week
Alternatives and similar repositories for dks
Users that are interested in dks are comparing it to the libraries listed below.
- ☆31Feb 11, 2021Updated 5 years ago
- Pytorch optimizers implementing Hilbert Constrained Gradient Descent☆19May 9, 2019Updated 7 years ago
- Regularization, Neural Network Training Dynamics☆14Jan 13, 2020Updated 6 years ago
- Minimax Optimization, Stackelberg Games, Generative Adversarial Networks☆19Feb 14, 2020Updated 6 years ago
- Second Order Optimization and Curvature Estimation with K-FAC in JAX.☆324Updated this week
- A simple Jax implementation of influence functions.☆21Apr 9, 2024Updated 2 years ago
- The ECMWF wave model ecWAM☆20Jun 9, 2026Updated last week
- ☆12Dec 7, 2017Updated 8 years ago
- FlexAttention w/ FlashAttention3 Support☆27Oct 5, 2024Updated last year
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch☆18Dec 21, 2022Updated 3 years ago
- Code for "Tracing Knowledge in Language Models Back to the Training Data"☆40Dec 27, 2022Updated 3 years ago
- Computing gradients and Hessians of feed-forward networks with GPU acceleration☆20Feb 14, 2024Updated 2 years ago
- PyTorch-SSO: Scalable Second-Order methods in PyTorch☆150Oct 1, 2023Updated 2 years ago
- Minimal Implementation of VCRec (2024) for collapse prevention.☆18Jan 28, 2025Updated last year
- Tiny Tutorial on https://arxiv.org/abs/1703.04730☆14Nov 19, 2019Updated 6 years ago
- JMP is a Mixed Precision library for JAX.☆214Jan 30, 2025Updated last year
- A lightweight library for tensorflow 2.0☆65Dec 3, 2019Updated 6 years ago
- Code for "Picking Winning Tickets Before Training by Preserving Gradient Flow" https://openreview.net/pdf?id=SkgsACVKPH☆105Feb 18, 2020Updated 6 years ago
- ☆166Dec 13, 2023Updated 2 years ago
- ☆33Jul 8, 2024Updated last year
- ☆34Sep 10, 2024Updated last year
- ☆19Updated this week
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion our EMNLP 2023 paper - Accelerating Toeplitz…☆14Oct 17, 2023Updated 2 years ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …☆23May 19, 2026Updated 3 weeks ago
- ☆13Apr 8, 2025Updated last year
- Implementation for the PHM paper at ICLR'21☆13Mar 1, 2023Updated 3 years ago
- ☆13Feb 24, 2020Updated 6 years ago
- [ICML 2023] Decentralized SGD and Average-direction SAM are Asymptotically Equivalent☆20Dec 4, 2023Updated 2 years ago
- Source-to-Source Debuggable Derivatives in Pure Python☆15Jan 23, 2024Updated 2 years ago
- Understanding Short-Horizon Bias in Stochastic Meta-Optimization☆37Mar 8, 2018Updated 8 years ago
- Example of applying CUDA graphs to LLaMA-v2☆11Aug 25, 2023Updated 2 years ago
- ☆13Apr 30, 2025Updated last year
- This repository is no longer maintained. Check☆81Apr 23, 2020Updated 6 years ago
- ☆24Feb 3, 2019Updated 7 years ago
- Actor Critic using Kronecker-Factored Trust Region☆19Jul 3, 2018Updated 7 years ago
- Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn.☆19Jan 12, 2026Updated 5 months ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention☆220Feb 13, 2023Updated 3 years ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation☆15Aug 1, 2024Updated last year
- Latent Space Smoothing for Individually Fair Representations (ECCV 2022)☆15Nov 4, 2022Updated 3 years ago