bojone / adafactor
adafactor optimizer for keras
☆20Updated 3 years ago
Alternatives and similar repositories for adafactor
Users who are interested in adafactor are comparing it to the libraries listed below.
- Adversarial Training for NLP in Keras☆46Updated 5 years ago
- saving memory by recomputing for keras☆37Updated 5 years ago
- 2019 collegiate competition: text click prediction☆42Updated 5 years ago
- bert-of-theseus via bert4keras☆31Updated 5 years ago
- 2019 China Collegiate Computing Contest, Big Data Challenge: first-place solution☆42Updated 4 years ago
- High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task☆59Updated 5 years ago
- tensorflow version of bert-of-theseus☆62Updated 4 years ago
- RAdam optimizer for keras☆71Updated 5 years ago
- This is our solution for KDD Cup 2020. We implemented a very neat and simple neural ranking model based on siamese BERT which ranked firs…☆71Updated 5 years ago
- This is our solution for WSDM - DiggSci 2020. We implemented a simple yet robust search pipeline which ranked 2nd in the validation set a…☆63Updated 5 years ago
- Keras implementation of Lazy optimizer☆21Updated 5 years ago
- ESIM model with language model☆27Updated 6 years ago
- Chinese translation of the paper "XLNet: Generalized Autoregressive Pretraining for Language Understanding"☆49Updated 5 years ago
- Convert PyTorch BERT weights to TF☆21Updated 5 years ago
- A pytorch implementation of Attention is all you need☆92Updated 6 years ago
- Chinese pre-trained ELECTRA model: pretraining a Chinese model based on adversarial learning☆141Updated 5 years ago
- 2019 Daguan Cup entity recognition☆19Updated 5 years ago
- Capsule-based model for opinion-type reading comprehension☆89Updated 6 years ago
- Solving primary school math word problems with bert4keras☆77Updated 4 years ago
- 24*2 pre-trained small BERT models, a handy tool for NLP practitioners training models☆50Updated 5 years ago
- Reproduction of the ACL 2020 FastBERT paper; paper address: //arxiv.org/pdf/2004.02178.pdf☆193Updated 3 years ago
- Playing Chinese chess with GPT, implemented in bert4keras☆44Updated 4 years ago
- Big Data Application Classification and Labeling Challenge (NLP), runner-up 🥈☆20Updated 2 years ago
- Natural Language Processing☆34Updated 4 years ago
- Using Keras + TensorFlow to implement the Transformer model from the paper "Attention Is All You Need". Using keras+tensorflow to implement the paper "Attention Is All …☆34Updated 6 years ago
- ☆59Updated 5 years ago
- ☆22Updated 7 years ago
- wrapping a keras optimizer to implement gradient accumulation☆119Updated 4 years ago
- Keras sparse implementation of margin-softmax☆100Updated 7 years ago
- Chinese generative pre-trained model☆98Updated 4 years ago