bojone / keras_lazyoptimizer
Keras implementation of Lazy optimizer
☆21Updated 5 years ago
Alternatives and similar repositories for keras_lazyoptimizer
Users who are interested in keras_lazyoptimizer are comparing it to the libraries listed below
- RAdam optimizer for keras☆71Updated 5 years ago
- High-performance small model evaluation: Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task☆60Updated 5 years ago
- machine reading comprehension with deep learning☆20Updated 7 years ago
- ☆18Updated 6 years ago
- bert-of-theseus via bert4keras☆31Updated 5 years ago
- Adversarial Training for NLP in Keras☆46Updated 5 years ago
- Convert PyTorch BERT weights to TF☆22Updated 5 years ago
- Dilation Gate CNN For Machine Reading Comprehension☆17Updated 2 years ago
- AI Challenger 2018 reading comprehension track code sharing☆21Updated 6 years ago
- ai challenge 2018's final code.☆16Updated 6 years ago
- Shuffling files of several hundred GB in Python☆33Updated 4 years ago
- Playing Chinese chess with GPT, implemented in bert4keras☆46Updated 4 years ago
- Source code for "Training Generative Adversarial Networks Via Turing Test".☆13Updated 5 years ago
- Introduction to high-quality chit-chat data☆30Updated 6 years ago
- wrapping a keras optimizer to implement gradient accumulation☆119Updated 5 years ago
- New Kaggle competition (baseline): BERT-based fine-tuning approach + tensor2tensor-based Transformer Encoder approach☆61Updated 6 years ago
- Some methods for unsupervised text generation☆49Updated 4 years ago
- saving memory by recomputing for keras☆37Updated 5 years ago
- 2019 达观杯 (Daguan Cup) entity recognition☆19Updated 6 years ago
- ESIM model with language model☆27Updated 6 years ago
- Implemented transformer NN block for Machine translation, text classification, Natural language inference as well as Machine reading compr…☆11Updated last year
- Transformer-XL with checkpoint loader☆68Updated 3 years ago
- An Open-source Neural Hierarchical Multi-label Text Classification Toolkit☆78Updated 6 years ago
- Adaptive embedding and softmax☆17Updated 3 years ago
- BERT Extension in TensorFlow☆30Updated 6 years ago
- ☆61Updated 5 years ago
- XLNet: Generalized Autoregressive Pretraining for Language Understanding, Chinese translation of the paper☆49Updated 5 years ago
- some strategies for exposure bias in seq2seq☆18Updated 5 years ago
- a beautiful method for cluster or community detection☆51Updated 5 years ago
- Chinese pre-trained ELECTRA model: a Chinese model pretrained with adversarial learning☆141Updated 5 years ago