gaussic / Chinese-Lyric-Corpus
A Chinese lyric corpus which contains nearly 50,000 lyrics from 500 artists
☆37 Updated 7 years ago
Alternatives and similar repositories for Chinese-Lyric-Corpus
Users who are interested in Chinese-Lyric-Corpus are comparing it to the repositories listed below
- Code for ACL 2020 paper "Rigid Formats Controlled Text Generation": https://www.aclweb.org/anthology/2020.acl-main.68/ ☆236 Updated 4 years ago
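- python | Efficient use of the statistical language model kenlm: new word discovery, word segmentation, intelligent error correction, and more ☆164 Updated 5 years ago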
- ☆125 Updated 4 years ago
- kenlm language model, with a Python REST service ☆29 Updated 6 years ago
- ☆101 Updated 4 years ago
- Chinese GPT2: pre-training and fine-tuning framework for text generation ☆187 Updated 4 years ago
- Use bert to predict punctuation on IWSLT2012 and The People's Daily 2014 ☆66 Updated 5 years ago
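- Chatbot for the music domain using methods including es and a music kb. A question-answering attempt based on a knowledge base of 140,000 songs; features include lyric chaining, finding a song from known lyrics, and question answering over the song-artist-lyric triangular relationship. ☆273 Updated 6 years ago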
- Library of Chinese homophonic words and characters (Chinese Homophones) ☆108 Updated 5 years ago
- ☆173 Updated 2 years ago
- Chinese Transformer Generative Pre-Training Model ☆59 Updated 5 years ago
- lasertagger-chinese: a Chinese learning example for lasertagger, with example data, comments, and shell run scripts ☆75 Updated 2 years ago
- ☆218 Updated 2 years ago
- Chinese sentence segmentation and punctuation restoration, implemented with PyTorch 1.0. ☆58 Updated 6 years ago
- ☆36 Updated 6 years ago
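- Performance evaluation of major Chinese word segmentation tools ☆158 Updated 6 years ago
- DistilBERT for Chinese: a distilled BERT model pre-trained on large-scale Chinese data ☆92 Updated 5 years ago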
- BERT-CCPoem is a BERT-based pre-trained model particularly for Chinese classical poetry ☆155 Updated 3 years ago
- SpellGCN ☆252 Updated 4 years ago
- The code for our ACL2022 findings paper: CRACSpell: A Contextual Typo Robust Approach with Copy Mechanism to Improve Chinese Spelling Cor… ☆75 Updated 3 years ago
- Convert pinyin to Chinese characters (汉字) using deep networks ☆22 Updated 4 years ago
- ☆34 Updated 3 years ago
- Chinese generative pre-trained model ☆98 Updated 4 years ago
- Pytorch model for https://github.com/imcaspar/gpt2-ml ☆79 Updated 3 years ago
- Poetry-related datasets developed by THUAIPoet (Jiuge) group. ☆229 Updated 5 years ago
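- Chinese Couplets Dataset without vulgar words (a couplet dataset free of sensitive content). ☆75 Updated 5 years ago
- Chinese character feature extraction tool: extracts pronunciation (initial, final, tone), glyph (components, radicals), four-corner code, and other features from characters, which can also be fed to a model as tensors ☆137 Updated 5 years ago
- MLM-based BERT pre-trained model for pinyin-to-Chinese-character conversion with error correction (pinyin correcter), implemented in PyTorch ☆45 Updated 4 years ago
- Lyric generation based on an LSTM language model and a seq2seq sequence model, including data crawling, data processing, model training, and lyric generation. ☆70 Updated 5 years ago
- Modify Chinese text, based on a modified LaserTagger model. I name it "文本手术刀" (Text Scalpel). Currently, this project implements a text paraphrasing task, used for data augmentation of NLP corpora. ☆214 Updated 2 years ago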