Preprocessed Chinese corpus
☆113Dec 18, 2018Updated 7 years ago
Alternatives and similar repositories for Chinese_from_dongxiexidian
Users that are interested in Chinese_from_dongxiexidian are comparing it to the libraries listed below
- CopyNet (Copy Mechanism in Seq2Seq) implementation with TensorFlow 2☆10Nov 21, 2022Updated 3 years ago
- A collection of Chinese NLP corpora, including basic Chinese syntactic wordsets, semantic wordsets, historical corpora, and evaluation corpora. Chinese natural…☆449Dec 16, 2018Updated 7 years ago
- API_Translationg: a collection of APIs from major translation websites☆12Oct 20, 2018Updated 7 years ago
- Person-name recognition using Bi-LSTM and CRF; the dataset is the annotated People's Daily corpus from January 1998, split train:validation:test = 3:1:1☆22Jul 25, 2018Updated 7 years ago
- datasets for NLP research☆24Nov 6, 2021Updated 4 years ago
- IdealWordCloudKit, a toolkit for image-shaped word clouds built from plain text, local files, or web articles; aimed at local files, online web pages, programs…☆41Jan 26, 2019Updated 7 years ago
- Implementations of natural language processing experiments, such as text classification, named entity recognition, POS tagging, segmentation, key …☆54Nov 22, 2018Updated 7 years ago
- aliceCN☆14Jan 30, 2013Updated 13 years ago
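- Text error-correction model with pinyin and glyph features☆11Jan 1, 2023Updated 3 years ago
- Chinese lexicons/dictionaries, usable for NLP projects, word segmentation, and similar scenarios☆62Jun 15, 2022Updated 3 years ago
- This is a corpus of Chinese abbreviations, including negative full forms.☆199Jul 17, 2021Updated 4 years ago
- Text deduplication☆78May 23, 2024Updated last year
- A manually curated set of traditional Chinese medicine terminology entries☆13Mar 16, 2018Updated 8 years ago
- Company-name corpus. Organization-name corpus. Company short names, abbreviations, brand terms, and enterprise names. Usable for Chinese word segmentation and organization-name entity recognition.☆1,291Mar 27, 2024Updated last year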
- hexo blog test☆14May 18, 2018Updated 7 years ago
- Plugin for Godot Engine to import GIF as AnimatedTexture☆16Dec 19, 2021Updated 4 years ago
- A lightweight CAPTCHA library☆10Feb 23, 2026Updated 3 weeks ago
- Large-scale Chinese corpus☆44Nov 5, 2019Updated 6 years ago
- ☆10Jan 28, 2021Updated 5 years ago
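- Load CDial-GPT with bert4keras☆38Nov 20, 2020Updated 5 years ago
- Chinese Classic Poem Mining Project: a text-mining project for classical Chinese poetry, with corpus building by web crawler and content analysis by NLP methods☆120Oct 7, 2018Updated 7 years ago
- Information gathering on Chinese-language dark web communities☆13Jul 23, 2020Updated 5 years ago
- GPS tool: geohash, latitude/longitude, and geofence processing; wrappers for some commonly used utility service interfaces☆13Aug 2, 2018Updated 7 years ago
- Code for the 23rd-place entry (the 水滴 "Water Drop" team) in the Byte Cup 2018 International Machine Learning Contest☆47Feb 22, 2019Updated 7 years ago
- This project automatically generates a continuation of text supplied by the user. It is the author's undergraduate thesis project, based mainly on GPT-2 Chinese. The author's work consisted mainly of several training runs on a new corpus, which produced a passable model. The project is essentially complete and will not be updated further.☆12Jun 9, 2020Updated 5 years ago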
- Storing Facebook friends data into Neo4j with Python and visualizing them using a D3.js graph☆24Mar 2, 2014Updated 12 years ago
- Public Chinese chat corpus☆4,174Apr 23, 2024Updated last year
- Software for unsupervised word segmentation and language model learning using lattices☆45Aug 17, 2016Updated 9 years ago
- Some useful Chinese corpus datasets (small Chinese corpus data)☆546Mar 29, 2020Updated 5 years ago
- Training word vectors with word2vec and fastText☆11Jan 10, 2019Updated 7 years ago
- 2020-natural-language-processing-project☆10Dec 18, 2020Updated 5 years ago
- A demo of a new approach to automatic text summarization using topic models and bipartite graphs.☆15Apr 23, 2013Updated 12 years ago
- ☆10Apr 17, 2019Updated 6 years ago
- Generates a character-relationship file from a text and a dictionary of character names; a network graph can then be produced with Gephi☆14Aug 25, 2019Updated 6 years ago
- Own implementation of sentence similarity computation based on Cilin (同义词词林), HowNet, SimHash, character/word vector, and vector space (VSM) models.☆365Dec 15, 2018Updated 7 years ago
- Topic Model based on Pretrained Sentence Embeddings (with BERT)☆13Feb 8, 2023Updated 3 years ago
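- High-performance recognition tool based on an open-source word recognition project (usable for sensitive-word detection, keyword detection, etc.)☆18Jun 17, 2022Updated 3 years ago
- Several implementations of sensitive-word filtering, plus a sensitive-word list of about 10,000 words☆2,112Aug 20, 2021Updated 4 years ago
- Answer-question matching using LSTM and LSTM/CNN☆16Apr 21, 2018Updated 7 years ago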