Transformer-XL with checkpoint loader
☆67Jan 22, 2022Updated 4 years ago
Alternatives and similar repositories for keras-transformer-xl
Users that are interested in keras-transformer-xl are comparing it to the libraries listed below
- Adaptive embedding and softmax☆17Jan 22, 2022Updated 4 years ago
- Gradient accumulation for Keras☆35Jun 27, 2021Updated 4 years ago
- Lookahead mechanism for optimizers in Keras.☆50Jun 24, 2021Updated 4 years ago
- Ordered Neurons LSTM☆30Jan 22, 2022Updated 4 years ago
- Implementation of XLNet that can load pretrained checkpoints☆169Jan 22, 2022Updated 4 years ago
- Transformer implemented in Keras☆369Jan 22, 2022Updated 4 years ago
- Load GPT-2 checkpoint and generate texts☆127Jan 22, 2022Updated 4 years ago
- ☆11Sep 3, 2021Updated 4 years ago
- A wrapper layer for stacking layers horizontally☆228Jan 22, 2022Updated 4 years ago
- SNAIL Attention Block for Keras.☆17Mar 30, 2020Updated 5 years ago
- Learning rate multiplier☆46Jun 22, 2021Updated 4 years ago
- Keras library for building (Universal) Transformers, facilitating BERT and GPT models☆541May 30, 2020Updated 5 years ago
- Layer normalization implemented in Keras☆60Jan 22, 2022Updated 4 years ago
- Lookahead optimizer for Keras☆169Oct 14, 2019Updated 6 years ago
- Bidirectional Attention Flow for Machine Comprehension implemented in Keras 2☆63Nov 21, 2022Updated 3 years ago
- Chinese documentation for the Kashgari framework☆22Sep 11, 2020Updated 5 years ago
- Implementation of BERT that could load official pre-trained models for feature extraction and prediction☆2,425Jan 22, 2022Updated 4 years ago
- Attention mechanism for processing sequential data that considers the context for each timestamp.☆657Jan 22, 2022Updated 4 years ago
- Accelerate Transformers pipelines using ONNX Runtime.☆10Jun 5, 2020Updated 5 years ago
- Extracts monetary items and amounts from court judgments.☆11Apr 8, 2016Updated 9 years ago
- An Attention Layer in Keras☆43Apr 23, 2019Updated 6 years ago
- Calculate similarity with embedding☆11Jan 22, 2022Updated 4 years ago
- RAdam implemented in Keras & TensorFlow☆324Jan 22, 2022Updated 4 years ago
- Neural Deconvolutions in Tensorflow☆12May 18, 2020Updated 5 years ago
- Implementation of Self-Governing Neural Networks for speech act classification☆12Nov 5, 2025Updated 4 months ago
- Implementation of Rectified Adam in Keras☆70Aug 24, 2019Updated 6 years ago
- A Keras TensorFlow 2.0 implementation of BERT, ALBERT and adapter-BERT.☆808Jan 13, 2023Updated 3 years ago
- Layer-wise Adaptive Moments optimizer for Batch training☆15Apr 3, 2019Updated 6 years ago
- Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includ…☆2,388Sep 3, 2024Updated last year
- Keras implementation of BERT with pre-trained weights☆815Jul 26, 2019Updated 6 years ago
- Using Spatial Transformer Layer with keras (theano backend).☆12Jun 7, 2016Updated 9 years ago
- AdaBound optimizer in Keras☆56Jul 11, 2020Updated 5 years ago
- Pre-trained Chinese XLNet model: Pre-Trained Chinese XLNet_Large☆229Sep 13, 2019Updated 6 years ago
- ☆3,687Sep 21, 2022Updated 3 years ago
- Sampling Matters in Deep Embedding Learning (ICCV'17)☆16Oct 16, 2018Updated 7 years ago
- Converts Baidu's ERNIE PaddlePaddle model to a TensorFlow model☆179Oct 12, 2019Updated 6 years ago
- Octave convolution☆34Jan 22, 2022Updated 4 years ago
- Position embedding layers in Keras☆58Jan 22, 2022Updated 4 years ago
- Graph convolutional layers☆62Jan 22, 2022Updated 4 years ago