bojone / adafactor
adafactor optimizer for keras
☆20Updated 4 years ago
Alternatives and similar repositories for adafactor
Users that are interested in adafactor are comparing it to the libraries listed below
- Adversarial Training for NLP in Keras☆46Updated 5 years ago
- saving memory by recomputing for keras☆37Updated 5 years ago
- bert-of-theseus via bert4keras☆31Updated 5 years ago
- Introduction to the RankNet algorithm☆42Updated 6 years ago
- First-place solution for the 2019 China Collegiate Computing Contest, Big Data Challenge☆42Updated 4 years ago
- 2019 Collegiate Contest: text click prediction☆43Updated 5 years ago
- Chinese translation of the paper "XLNet: Generalized Autoregressive Pretraining for Language Understanding"☆49Updated 5 years ago
- tensorflow version of bert-of-theseus☆63Updated 4 years ago
- High-performance small-model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task☆60Updated 5 years ago
- ESIM model with language model☆27Updated 6 years ago
- Reproduction of the ACL 2020 FastBERT paper; paper URL: //arxiv.org/pdf/2004.02178.pdf☆194Updated 3 years ago
- This is our solution for WSDM - DiggSci 2020. We implemented a simple yet robust search pipeline which ranked 2nd in the validation set a…☆63Updated 5 years ago
- Chinese pre-trained ELECTRA model: pretrain Chinese model based on adversarial learning☆141Updated 5 years ago
- This is our solution for KDD Cup 2020. We implemented a very neat and simple neural ranking model based on siamese BERT which ranked firs…☆71Updated 5 years ago
- 24*2 pre-trained small BERT models, a handy tool for NLP practitioners training models☆51Updated 5 years ago
- Keras implementation of Lazy optimizer☆21Updated 5 years ago
- ☆61Updated 5 years ago
- RAdam optimizer for keras☆71Updated 5 years ago
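- CCF BDCI 2019: relevance-scoring model between "technical requirement" and "technical achievement" projects; top-1 solution on the semifinal B leaderboard☆77Updated 2 years ago
- Tianchi AI Innovation Competition 3: write-up shared by Zhou Xingxing of team ch12hu☆27Updated 4 years ago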
- wrapping a keras optimizer to implement gradient accumulation☆119Updated 5 years ago
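- This project is mainly a reproduction of the DPCNN (Deep Pyramid Convolutional Neural Networks for Text Categorization) text classification paper, plus a modification and reproduction based on the Inception model from the Zhihu Kanshan Cup,…☆143Updated 6 years ago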
- A pytorch implementation of Attention is all you need☆92Updated 6 years ago
- Capsule-based model for opinion-type reading comprehension☆89Updated 6 years ago
- Dilation Gate CNN For Machine Reading Comprehension☆17Updated 2 years ago
- Currently covers only the reading comprehension track☆14Updated 4 years ago
- ☆44Updated 6 years ago
- Using Keras + TensorFlow to implement the Transformer model from the paper "Attention Is All You Need". …☆34Updated 6 years ago
- Worth-reading papers and related resources on attention mechanism, Transformer and pretrained language model (PLM) such as BERT. …☆130Updated 4 years ago
- Knowledge Distillation from BERT☆53Updated 6 years ago