pseeth / autoclip
Adaptive Gradient Clipping
☆124 Updated 2 years ago
Alternatives and similar repositories for autoclip
Users who are interested in autoclip are comparing it to the libraries listed below:
- ☆164 Updated last year
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆124 Updated 2 years ago
- Code repository of the paper "CKConv: Continuous Kernel Convolution For Sequential Data" published at ICLR 2022. https://arxiv.org/abs/21… ☆119 Updated 2 years ago
- Implementation of Nyström Self-attention, from the paper Nyströmformer ☆124 Updated 11 months ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆61 Updated 2 years ago
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆115 Updated 2 years ago
- Drop-in replacement for any ResNet with a significantly reduced memory footprint and better representation capabilities ☆209 Updated 8 months ago
- ☆22 Updated 2 months ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆210 Updated last year
- Implementation of Feedback Transformer in PyTorch ☆105 Updated 3 years ago
- Code repository of the paper "Wavelet Networks: Scale-Translation Equivariant Learning From Raw Time-Series, TMLR" https://arxiv.org/abs…