Rishit-dagli / GLU
An easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow.
☆20 · Updated 2 years ago
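For context, a Gated Linear Unit computes the element-wise product of a linear projection and a sigmoid gate, GLU(x) = (xW + b) ⊗ σ(xV + c). The sketch below shows that standard formulation as a standalone Keras layer; it is illustrative only and is not this repository's API (the `GLU` class name and `units` argument are assumptions for the example).

```python
# Minimal sketch of a Gated Linear Unit layer in TensorFlow/Keras.
# Assumes the standard formulation GLU(x) = (x W + b) * sigmoid(x V + c);
# not the API of the Rishit-dagli/GLU library itself.
import tensorflow as tf


class GLU(tf.keras.layers.Layer):
    """A linear projection modulated element-wise by a sigmoid gate."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.value = tf.keras.layers.Dense(units)                       # x W + b
        self.gate = tf.keras.layers.Dense(units, activation="sigmoid")  # sigmoid(x V + c)

    def call(self, inputs):
        # Element-wise gating of the value projection.
        return self.value(inputs) * self.gate(inputs)


# Usage: gate a batch of 64-dimensional feature vectors down to 32 units.
x = tf.random.normal((8, 64))
y = GLU(32)(x)
print(y.shape)  # (8, 32)
```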
Alternatives and similar repositories for GLU
Users interested in GLU are comparing it to the libraries listed below.
- several types of attention modules written in PyTorch for learning purposes ☆53 · Updated last month
- ☆13 · Updated last year
- Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory" ☆118 · Updated last week
- PyTorch implementation of moe, which stands for mixture of experts ☆52 · Updated 5 years ago
- An implementation of Conformer: Convolution-augmented Transformer for Speech Recognition, a Transformer Variant in TensorFlow/Keras ☆45 · Updated 4 years ago
- Outlining techniques for improving the training performance of your PyTorch model without compromising its accuracy ☆128 · Updated 2 years ago
- Flexible Python library providing building blocks (layers) for reproducible Transformers research (Tensorflow ✅, Pytorch 🔜, and Jax 🔜) ☆53 · Updated 2 years ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆56 · Updated 3 months ago
- my attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture ☆134 · Updated last year
- Pytorch implementation of the xLSTM model by Beck et al. (2024) ☆181 · Updated last year
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆213 · Updated 2 weeks ago
- Implementation of transformers based architecture in PyTorch. ☆55 · Updated 5 years ago
- tinybig for deep function learning ☆60 · Updated 8 months ago
- ☆58 · Updated last year
- Resources about xLSTM by Sepp Hochreiter ☆318 · Updated last year
- CASPR is a deep learning framework applying transformer architecture to learn and predict from tabular data at scale. ☆40 · Updated 3 years ago
- an implementation of FAdam (Fisher Adam) in PyTorch ☆50 · Updated 7 months ago
- Efficient Python library for Extended LSTM with exponential gating, memory mixing, and matrix memory for superior sequence modeling. ☆303 · Updated last year
- PyTorch implementation of Retentive Network: A Successor to Transformer for Large Language Models ☆14 · Updated 2 years ago
- Playground for Transformers ☆53 · Updated 2 years ago
- Cuda implementation of Extended Long Short Term Memory (xLSTM) with C++ and PyTorch ports ☆91 · Updated last year
- Build high-performance AI models with modular building blocks ☆577 · Updated this week
- just collections about Llama2 ☆44 · Updated last year
- ☆132 · Updated 2 years ago
- Demonstrates knowledge distillation for image-based models in Keras. ☆54 · Updated 4 years ago
- PyTorch and TensorFlow/Keras image models with automatic weight conversions and equal API/implementations - Vision Transformer (ViT), Res… ☆41 · Updated 2 years ago
- Vision Transformers for image classification, image segmentation, and object detection. ☆63 · Updated 3 months ago
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model … ☆86 · Updated last year
- We'll look into audio categorization using deep learning principles like Artificial Neural Networks (ANN), 1D Convolutional Neural Networ… ☆53 · Updated 3 years ago
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications ☆40 · Updated last year