ShivamRajSharma / Transformer-Architectures-From-Scratch
Implementation of transformer-based architectures in PyTorch.
☆54 Updated 4 years ago
Alternatives and similar repositories for Transformer-Architectures-From-Scratch:
Users who are interested in Transformer-Architectures-From-Scratch are comparing it to the repositories listed below.
- ☆49 Updated 2 years ago
- PyTorch implementation of Transformer-TTS for converting text into speech. ☆19 Updated 3 years ago
- my attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture ☆130 Updated 11 months ago
- A set of fundamental operations and deep learning models using JAX ☆13 Updated 4 years ago
- Making CNNs interpretable. ☆19 Updated 3 years ago
- Mixture of experts on convolutional neural network using Keras and Cifar10 ☆28 Updated 7 years ago
- Toy genetic algorithm in PyTorch ☆39 Updated this week
- PyTorch (Lightning) implementation of the Mamba model ☆26 Updated this week
- Unofficial PyTorch implementation of Google's FNet: Mixing Tokens with Fourier Transforms. With checkpoints. ☆73 Updated 2 years ago
- TensorFlow 2.x implementation of Vision-Transformer model ☆19 Updated 4 years ago
- Deep Learning Experiment Code. ☆19 Updated 8 months ago
- A complete implementation of the PyTorch neural network framework for GAN ☆24 Updated 3 years ago
- Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting. ☆76 Updated 4 years ago
- PyTorch Implementation of the sparse attention from the paper: "Generating Long Sequences with Sparse Transformers" ☆78 Updated 2 weeks ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆71 Updated last year
- PyTorch implementation of Teacher-Student Network (Knowledge Distillation). ☆26 Updated 3 years ago
- Kolmogorov-Arnold Networks with various basis functions like B-Splines, Fourier, Chebyshev, Wavelets, etc. ☆33 Updated 11 months ago
- Bayesian time series prediction ☆65 Updated 4 years ago
- several types of attention modules written in PyTorch for learning purposes ☆50 Updated 6 months ago
- Implementation of xLSTM in PyTorch from the paper: "xLSTM: Extended Long Short-Term Memory" ☆119 Updated 3 weeks ago
- This is the code that went into our practical dive using mamba as information extraction ☆54 Updated last year
- Graph neural network message passing reframed as a Transformer with local attention ☆68 Updated 2 years ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆52 Updated 3 weeks ago
- Shows how to do parameter ensembling using differential evolution. ☆10 Updated 3 years ago
- Efficient Deep Learning Survey Paper ☆33 Updated 2 years ago
- Unofficial Implementation of MLP-Mixer in TensorFlow ☆27 Updated 3 years ago
- FID computation in JAX/Flax. ☆27 Updated 9 months ago
- This repository contains a better implementation of Kolmogorov-Arnold networks ☆61 Updated 11 months ago
- Unofficial Implementation of Long-term Forecasting with TiDE: Time-series Dense Encoder ☆53 Updated last year
- Implementation of Agent Attention in PyTorch ☆89 Updated 9 months ago