RuslanKhalitov / ChordMixer
The official implementation of the ChordMixer architecture.
☆61 · Updated last year
Alternatives and similar repositories for ChordMixer:
Users who are interested in ChordMixer are comparing it to the libraries listed below:
- Official code for Long Expressive Memory (ICLR 2022, Spotlight) ☆69 · Updated 3 years ago
- Compression schema for gradients of activations in backward pass ☆44 · Updated last year
- Deep Learning Audio Course – AI Masters ☆29 · Updated this week
- FusionBrain Challenge 2.0: creating multimodal multitask model ☆16 · Updated 2 years ago
- Deep Generative Models course, 2021 ☆22 · Updated 3 years ago
- Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity" ☆26 · Updated last month
- ☆20 · Updated 8 months ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆62 · Updated 2 years ago
- ☆18 · Updated 3 months ago
- FID computation in Jax/Flax. ☆27 · Updated 8 months ago
- ☆26 · Updated 3 years ago
- ☆36 · Updated last year
- RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts). ☆32 · Updated 2 years ago
- Very simple and short implementation of gradient boosting in 18 lines of code ☆9 · Updated 4 years ago
- Framework for writing deep learning training loops. Lightweight, and retaining full freedom to design as you see fit. It handles checkpo… ☆108 · Updated last year
- Learning to Initialize Neural Networks for Stable and Efficient Training ☆138 · Updated 2 years ago
- Lightweight knowledge distillation pipeline ☆28 · Updated 3 years ago
- ☆71 · Updated 7 months ago
- GULAG: GUessing LAnGuages with neural networks ☆13 · Updated 2 years ago
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆14 · Updated 4 months ago
- ☆13 · Updated 4 months ago
- ☆31 · Updated 3 years ago
- Simple audio AE ☆12 · Updated 4 months ago
- AdaCat ☆49 · Updated 2 years ago
- Code accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient" ☆28 · Updated 4 years ago
- Code for the paper "PALBERT: Teaching ALBERT to Ponder", NeurIPS 2022 Spotlight ☆37 · Updated last year
- Layerwise Batch Entropy Regularization ☆22 · Updated 2 years ago
- Code for MSID, a Multi-Scale Intrinsic Distance for comparing generative models, studying neural networks, and more! ☆51 · Updated 5 years ago
- T5-based (Russian) text normalization ☆20 · Updated last year
- Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper ☆80 · Updated 3 years ago