ShivamRajSharma / Transformer-Architectures-From-Scratch
Implementation of transformer-based architectures in PyTorch.
☆52 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for Transformer-Architectures-From-Scratch
- PyTorch implementation of Teacher-Student Network (Knowledge Distillation). ☆23 · Updated 3 years ago
- ☆47 · Updated 2 years ago
- The objective is to implement a Siamese network for face recognition using PyTorch Lightning. ☆17 · Updated 3 years ago
- Code for the DDP tutorial ☆32 · Updated 2 years ago
- Several types of attention modules written in PyTorch for learning purposes ☆40 · Updated last month
- Playground for Transformers ☆42 · Updated 11 months ago
- My code for learning the attention mechanism ☆50 · Updated 4 years ago
- Keras 1D Depthwise Convolutional layer ☆10 · Updated 4 years ago
- Official PyTorch code for "APP: Anytime Progressive Pruning" (DyNN @ ICML, 2022; CLL @ ACML, 2022, SNN @ ICML, 2022 and SlowDNN 2023) ☆17 · Updated last year
- PyTorch implementation of the sparse attention from the paper "Generating Long Sequences with Sparse Transformers" ☆60 · Updated last week
- Recurrent neural networks: building a custom LSTM/GRU cell in PyTorch ☆28 · Updated 4 years ago
- MLPNAS code for Paperspace series on Neural Architecture Search ☆22 · Updated last year
- Implementation of Griffin from the paper "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆50 · Updated last week
- Efficient Deep Learning Survey Paper ☆32 · Updated last year
- Convolutional Neural Network implemented from scratch for MNIST and CIFAR-10 datasets. ☆55 · Updated 2 years ago
- Examples of using PyTorch hooks, as covered in my YouTube tutorial video. ☆32 · Updated last year
- ☆24 · Updated 2 years ago
- Implementation of Agent Attention in PyTorch ☆86 · Updated 4 months ago
- Solutions to various Kaggle image classification challenges ☆41 · Updated last month
- Exploring the Moving MNIST dataset with forecasting algorithms ☆31 · Updated last year
- Unofficial implementation of MLP-Mixer in TensorFlow ☆26 · Updated 3 years ago
- This repository hosts the code to port NumPy model weights of BiT-ResNets to TensorFlow SavedModel format. ☆14 · Updated 2 years ago
- Includes additional materials for the following keras.io blog post. ☆12 · Updated 3 years ago
- A set of fundamental operations and deep learning models using JAX ☆13 · Updated 3 years ago
- LoRA fine-tuned Stable Diffusion Deployment ☆31 · Updated last year
- sharpDARTS: Faster and More Accurate Differentiable Architecture Search ☆16 · Updated 3 years ago
- PyTorch reimplementation of the paper "HyperMixer: An MLP-based Green AI Alternative to Transformers" [arXiv 2022]. ☆17 · Updated 2 years ago
- A Keras implementation of a hybrid EfficientNet-Swin Transformer model. ☆33 · Updated last year
- Quantization of LLMs and benchmarking. ☆10 · Updated 7 months ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆66 · Updated last year