Transformer implementation in PyTorch.
☆492 · Mar 7, 2019 · Updated 7 years ago
Alternatives and similar repositories for transformer-pytorch
Users that are interested in transformer-pytorch are comparing it to the libraries listed below.
- tunz's CUDA pytorch operator (MaskedSoftmax) · ☆75 · Mar 7, 2019 · Updated 7 years ago
- A PyTorch implementation of the Transformer model in "Attention is All You Need". · ☆9,690 · Apr 16, 2024 · Updated 2 years ago
- Fine-tuned KoGPT2 chatbot demo with translated PersonaChat (ongoing) · ☆13 · Apr 17, 2022 · Updated 4 years ago
- Transformer: PyTorch Implementation of "Attention Is All You Need" · ☆4,524 · Jul 15, 2025 · Updated 9 months ago
- Transformer seq2seq model, program that can build a language translator from parallel corpus · ☆1,427 · May 19, 2023 · Updated 2 years ago
- Implementation of unregularized, l1 regularized and l2 regularized linear regression using numpy and without sklearn · ☆12 · Oct 4, 2019 · Updated 6 years ago
- TensorFlow implementation of (Momentum) Stochastic Variance-Adapted Gradient. · ☆45 · May 11, 2018 · Updated 7 years ago
- A Pytorch Implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation" · ☆578 · Oct 1, 2020 · Updated 5 years ago
- An annotated implementation of the Transformer paper. · ☆7,193 · Apr 7, 2024 · Updated 2 years ago
- pytorch implementation of Attention is all you need · ☆240 · Jun 16, 2021 · Updated 4 years ago
- code for Question Condensing Networks for Answer Selection in Community Question Answering · ☆14 · Aug 26, 2018 · Updated 7 years ago
- Tutorial for pretraining Korean GPT-2 model · ☆67 · Jun 12, 2023 · Updated 2 years ago
- A PyTorch implementation of Transformer in "Attention is All You Need" · ☆106 · Dec 6, 2020 · Updated 5 years ago
- ☆13 · Jul 31, 2023 · Updated 2 years ago
- Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Py… · ☆25,071 · Updated this week
- Materials for "Natural Language Processing for Multilingual Task-Oriented Dialogue" Tutorial at ACL 2022 · ☆14 · May 21, 2022 · Updated 3 years ago
- ☆22 · Dec 31, 2019 · Updated 6 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and fastly. · ☆16 · Jun 17, 2022 · Updated 3 years ago
- ICCV23 "Householder Projector for Unsupervised Latent Semantics Discovery" · ☆17 · Jun 26, 2025 · Updated 9 months ago
- Source code for our paper "Pessimistic Decision-Making for Recommender Systems" published at ACM TORS, and RecSys 2021. · ☆11 · Dec 15, 2022 · Updated 3 years ago
- Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. · ☆17,183 · Jun 2, 2023 · Updated 2 years ago
- The Transformer in PyTorch · ☆13 · Aug 7, 2024 · Updated last year
- Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText. · ☆5,689 · Jan 20, 2024 · Updated 2 years ago
- Google AI 2018 BERT pytorch implementation · ☆6,530 · Sep 15, 2023 · Updated 2 years ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… · ☆159,742 · Updated this week
- ☆10 · Mar 28, 2022 · Updated 4 years ago
- ☆12,471 · Mar 3, 2026 · Updated last month
- PyTorch Implementation of "Non-Autoregressive Neural Machine Translation" · ☆271 · Feb 12, 2022 · Updated 4 years ago
- PlaNet: Learning Latent Dynamics for Planning from Pixels · ☆10 · Feb 13, 2020 · Updated 6 years ago
- ☆25 · Jan 2, 2024 · Updated 2 years ago
- ☆14 · Aug 3, 2021 · Updated 4 years ago
- A C++/CUDA toolkit for Transformer (NMT) Translator (Decoder) · ☆17 · Jan 7, 2019 · Updated 7 years ago
- Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention". · ☆44 · Oct 29, 2021 · Updated 4 years ago
- Posterior Control of Blackbox Generation · ☆23 · May 2, 2020 · Updated 5 years ago
- ☆13 · Jun 1, 2017 · Updated 8 years ago
- Fine-tune BERT to generate sentence embedding for cosine similarity · ☆69 · Aug 12, 2019 · Updated 6 years ago
- Simple Chit-Chat based on KoGPT2 · ☆183 · Jun 12, 2023 · Updated 2 years ago
- The PyTorch code for paper: An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss · ☆12 · Oct 7, 2019 · Updated 6 years ago
- The baseline system for the ICASSP2024 ICMC-ASR Challenge. · ☆55 · Dec 6, 2023 · Updated 2 years ago