pseeth / autoclip
Adaptive Gradient Clipping
☆117 Updated 2 years ago
Related projects
Alternatives and complementary repositories for autoclip
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆124 Updated 2 years ago
- Implementation of Nyström Self-attention, from the paper Nyströmformer ☆122 Updated 9 months ago
- ☆164 Updated last year
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆115 Updated last year
- Implementation of Feedback Transformer in Pytorch ☆104 Updated 3 years ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆61 Updated 2 years ago
- PyTorch implementation of Sinusoidal Representation Networks (SIREN) ☆263 Updated last year
- Traditional Machine Learning Models for Large-Scale Datasets in PyTorch. ☆126 Updated last week
- Collection of PyTorch Lightning implementations of Generative Adversarial Network varieties presented in research papers. ☆167 Updated 2 years ago
- Code repository of the paper "CKConv: Continuous Kernel Convolution For Sequential Data" published at ICLR 2022. https://arxiv.org/abs/21… ☆117 Updated last year
- ☆22 Updated 2 weeks ago
- Inspired by "Neural Networks Fail to Learn Periodic Functions and How to Fix It" ☆58 Updated 5 months ago
- Drop-in replacement for any ResNet with a significantly reduced memory footprint and better representation capabilities ☆208 Updated 6 months ago
- Collection of the latest, greatest, deep learning optimizers (for Pytorch) - CNN, NLP suitable ☆211 Updated 3 years ago
- Seamless analysis of your PyTorch models (RAM usage, FLOPs, MACs, receptive field, etc.) ☆208 Updated this week
- Code repository of the paper "Wavelet Networks: Scale-Translation Equivariant Learning From Raw Time-Series, TMLR" https://arxiv.org/abs… ☆80 Updated 9 months ago
- custom cuda kernel for {2, 3}d relative attention with pytorch wrapper ☆43 Updated 4 years ago
- Code to accompany the paper "Hierarchical Quantized Autoencoders" ☆37 Updated last year
- PyTorch dataset extended with map, cache etc. (tensorflow.data like) ☆328 Updated 2 years ago
- ☆47 Updated 3 years ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload ☆124 Updated 2 years ago
- ☆72 Updated 3 years ago
- A minimal pytorch package implementing a gradient reversal layer. ☆155 Updated this week
- Online Normalization for Training Neural Networks (Companion Repository) ☆79 Updated 3 years ago
- Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch ☆38 Updated last year
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆207 Updated last year
- Implementation of Linformer for Pytorch ☆255 Updated 10 months ago
- Implementation of Fast Transformer in Pytorch ☆171 Updated 3 years ago
- Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning ☆155 Updated 8 months ago
- Jax/Flax implementation of Variational-DiffWave. ☆40 Updated 2 years ago