ankiteciitkgp / bertTokenizer
A Java implementation of the BERT tokenizer.
☆24 Updated 2 years ago
Related projects
Alternatives and complementary repositories for bertTokenizer
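- This is a Java version of the Chinese tokenization described in BERT. ☆57 Updated 2 years ago
- A torch-based training and prediction framework for Baidu's UIE extraction model ☆11 Updated this week
- Code implementation of Baichuan's Dynamic NTK-ALiBi: inference on longer texts without fine-tuning ☆46 Updated last year
- A sentence vector generation tool based on pre-training ☆132 Updated last year
- ☆92 Updated 6 months ago
- A tool for training sentence embeddings, supporting CoSENT and SimCSE ☆17 Updated this week
- CLUEWSC2020: the Chinese version of the WSC Winograd Schema Challenge, a Chinese coreference resolution task ☆67 Updated 4 years ago
- LoRA fine-tuning of BLOOMZ, with reference to BELLE ☆25 Updated last year
- Large-scale Chinese corpus ☆38 Updated 5 years ago
- High-performance text tokenizer library ☆27 Updated 9 months ago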
- Load a TensorFlow pb file using Bert/TextCNNs, an ensemble model, using Java. ☆10 Updated 3 years ago
- Offline reading comprehension app: QA for mobile, Android & iPhone ☆60 Updated 2 years ago
- Evaluate the fluency of natural language ☆113 Updated 3 years ago
- Introductory article on dialogue rewriting ☆95 Updated last year
- Build BERT as a Keras layer using TF2.0. ☆18 Updated last year
- Chinese MobileBERT model ☆81 Updated 2 years ago
- Chinese UniLM pre-trained model ☆82 Updated 3 years ago
- The retrieval model component from team “网数ICT小分队”, joint runner-up (third place) at JDDC 2019 ☆42 Updated last year
- The Corpus & Code for EMNLP 2022 paper "FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction" | FCGEC Chinese grammatical error correction corpus and STG model ☆107 Updated 3 months ago
- Unofficial implementation of "Correcting Chinese Spelling Errors with Phonetic Pre-training" ☆38 Updated 2 years ago
- SIGHAN Chinese error correction datasets and their converted formats ☆63 Updated 4 years ago
- Introduction to CPM ☆17 Updated 3 years ago
- Pre-trained ERNIE models can be loaded with Keras for feature extraction and prediction. ☆26 Updated 2 years ago
- Some demos of knowledge distillation in NLP ☆19 Updated 3 years ago
- DistilBERT for Chinese: a distilled BERT model pre-trained on massive Chinese data ☆90 Updated 4 years ago
- Chinese instruction dataset for fine-tuning LLMs ☆27 Updated last year
- “悟道” (WuDao) data ☆39 Updated 3 years ago
- Make LLMs easier to use ☆58 Updated last year
- Chinese NLP datasets ☆151 Updated 5 years ago
- FinCUGE Instruction dataset ☆10 Updated last year