soskek/attention_is_all_you_need

Transformer of "Attention Is All You Need" (Vaswani et al. 2017) by Chainer.

☆323

Alternatives and similar repositories for attention_is_all_you_need

Users who are interested in attention_is_all_you_need are comparing it to the libraries listed below.

soskek / convolutional_seq2seq
fairseq: Convolutional Sequence to Sequence Learning (Gehring et al. 2017) by Chainer
☆68 · Jun 15, 2017 · Updated 9 years ago
pfnet-research / chainer-graph-cnn
Chainer implementation of 'Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering' (https://arxiv.org/abs/1606.09…
☆68 · Dec 28, 2017 · Updated 8 years ago
mitmul / tfchain
Run a static part of the computational graph written in Chainer with Tensorflow
☆20 · Jan 10, 2017 · Updated 9 years ago
unnonouno / chainer-memnn
Now it is exported as an official example
☆13 · Jan 24, 2018 · Updated 8 years ago
soskek / chainer-openai-transformer-lm
A Chainer implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI
☆28 · Jun 20, 2018 · Updated 8 years ago
soskek / bert-chainer
Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
☆224 · Nov 9, 2019 · Updated 6 years ago
yasunorikudo / chainer-DenseNet
Densely Connected Convolutional Network implementation by Chainer
☆39 · Jul 15, 2017 · Updated 9 years ago
chainer / chainermn
ChainerMN: Scalable distributed deep learning with Chainer
☆206 · Apr 25, 2019 · Updated 7 years ago
paarthneekhara / byteNet-tensorflow
ByteNet for character-level language modelling
☆317 · Aug 23, 2017 · Updated 8 years ago
Kyubyong / transformer
A TensorFlow Implementation of the Transformer: Attention Is All You Need
☆4,472 · May 21, 2023 · Updated 3 years ago
jzilly / RecurrentHighwayNetworks
Recurrent Highway Networks - Implementations for Tensorflow, Torch7, Theano and Brainstorm
☆402 · Oct 9, 2019 · Updated 6 years ago
yasunorikudo / chainer-ResDrop
Deep Networks with Stochastic Depth implementation by Chainer
☆40 · Apr 11, 2016 · Updated 10 years ago
kimhc6028 / forward-thinking-pytorch
Pytorch implementation of "Forward Thinking: Building and Training Neural Networks One Layer at a Time"
☆65 · Jun 14, 2017 · Updated 9 years ago
neubig / lxmls-2017
Slides/code for the Lisbon machine learning school 2017
☆28 · Jul 27, 2017 · Updated 9 years ago
TatsuyaShirakawa / poincare-embedding
Poincaré Embedding (unofficial)
☆229 · May 7, 2019 · Updated 7 years ago
chainer / onnx-chainer
Add-on package for ONNX format support in Chainer
☆86 · Nov 6, 2019 · Updated 6 years ago
koreyou / gat-chainer
Unofficial Chainer implementation of Graph Attention Networks (GAT)
☆18 · Jan 24, 2019 · Updated 7 years ago
jiweil / Neural-Dialogue-Generation
☆833 · Jul 12, 2017 · Updated 9 years ago
jsuarez5341 / Recurrent-Highway-Hypernetworks-NIPS
Cleaned original source code from my NIPS publication
☆158 · Dec 4, 2017 · Updated 8 years ago
asappresearch / sru
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
☆2,107 · Jan 4, 2022 · Updated 4 years ago
odashi / chainer_examples
Example usages of Chainer for natural language processing.
☆117 · Nov 30, 2016 · Updated 9 years ago
pfnet-research / chainer-gan-lib
Chainer implementation of recent GAN variants
☆411 · Mar 24, 2023 · Updated 3 years ago
facebookresearch / fairseq-lua
Facebook AI Research Sequence-to-Sequence Toolkit
☆3,725 · Sep 17, 2021 · Updated 4 years ago
jiamings / cramer-gan
Tensorflow Implementation on "The Cramer Distance as a Solution to Biased Wasserstein Gradients" (https://arxiv.org/pdf/1705.10743.pdf)
☆123 · Dec 10, 2017 · Updated 8 years ago
neubig / nmt-tips
A tutorial about neural machine translation including tips on building practical systems
☆369 · Nov 16, 2016 · Updated 9 years ago
aonotas / neural-figures
Neural Networks Figures
☆52 · May 30, 2017 · Updated 9 years ago
leetenki / YOLOv2
A Chainer reimplementation of YOLOv2 (includes a Chainer loader for darknet and complete training code in Chainer)
☆340 · Sep 26, 2022 · Updated 3 years ago
nyu-dl / dl4mt-c2c
☆143 · Jul 16, 2017 · Updated 9 years ago
primitiv / primitiv
A Neural Network Toolkit.
☆176 · Dec 19, 2019 · Updated 6 years ago
neulab / dynet-benchmark
Benchmarks for DyNet
☆55 · Sep 22, 2025 · Updated 10 months ago
khanhptnk / seq2seq-chainer
☆11 · Mar 11, 2018 · Updated 8 years ago
jingweiz / pytorch-dnc
Neural Turing Machine (NTM) & Differentiable Neural Computer (DNC) with pytorch & visdom
☆279 · Feb 20, 2018 · Updated 8 years ago
chainer / chainerui
ChainerUI: User Interface for Chainer
☆171 · Jan 7, 2023 · Updated 3 years ago
mlpnlp / mlpnlp-nmt
This is a sample code of "LSTM encoder-decoder with attention mechanism" mainly for understanding a recently developed machine translatio…
☆44 · Mar 14, 2019 · Updated 7 years ago
soskek / variational_dropout_sparsifies_dnn
Variational Dropout Sparsifies Deep Neural Networks (Molchanov et al. 2017) by Chainer
☆18 · Jun 22, 2017 · Updated 9 years ago
mitmul / ofChainer
☆11 · Apr 11, 2019 · Updated 7 years ago
musyoku / chainer-nn
☆10 · Oct 16, 2017 · Updated 8 years ago
nyu-dl / dl4ir-searchQA
☆181 · Aug 17, 2018 · Updated 7 years ago
chainer / chainer
A flexible framework of neural networks for deep learning
☆5,922 · Aug 28, 2023 · Updated 2 years ago