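A Chinese character feature extraction tool that extracts phonetic features (initial, final, tone), glyph features (components, radicals), four-corner codes, and other features from characters; the features can also be fed into a model as tensors
☆138 · May 25, 2020 · Updated 5 years ago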
Alternatives and similar repositories for char_featurizer
Users who are interested in char_featurizer are comparing it to the libraries listed below.
- A Chinese character feature extractor (featurizer) that extracts pronunciation and glyph features of Chinese characters for use as deep learning features ☆298 · Dec 29, 2025 · Updated 3 months ago
- An open-access corpus of conversational bilingual speech in Cantonese and English ☆40 · Apr 28, 2022 · Updated 3 years ago
- A Chinese character decomposition library that breaks characters down into radicals and components, for use as glyph features in machine learning ☆414 · Dec 29, 2025 · Updated 3 months ago
- Chinese character decomposition dictionary ☆811 · Jan 8, 2023 · Updated 3 years ago
- Compares the sound and shape of 6,700 commonly used Chinese characters and outputs lists of similar-sounding and similar-looking characters. # Similar characters ☆482 · Mar 28, 2024 · Updated 2 years ago
- CCKS 2020: entity linking task for Chinese short text ☆43 · Mar 27, 2021 · Updated 5 years ago
- Technical Innovation Award solution for the CCKS 2019 Chinese short-text entity linking competition ☆410 · Mar 24, 2023 · Updated 3 years ago
- AbstractKnowledgeGraph, a systematic knowledge graph that concentrates on abstract things, including abstract entities and actions. Abstract knowledge graph, current scale… ☆248 · Aug 6, 2019 · Updated 6 years ago
- A Chinese string similarity method based on "sound-shape codes" ☆227 · Jul 24, 2020 · Updated 5 years ago
- This is a corpus of Chinese abbreviations, including negative full forms. ☆199 · Jul 17, 2021 · Updated 4 years ago
- Chinese character stroke library ☆87 · Jan 8, 2021 · Updated 5 years ago
- Code reproductions of NLP papers, mainly from top conferences such as ACL, AAAI, and EMNLP. ☆88 · Aug 13, 2022 · Updated 3 years ago
- Modify Chinese text, based on the LaserTagger model. Text paraphrasing: Chinese text data augmentation based on LaserTagger. ☆322 · Jan 3, 2024 · Updated 2 years ago
- Chinese embedding collection for natural language processing, including token (character), POS tag, pinyin, dependency, and word embeddings; five types of vectors in total. ☆454 · Dec 15, 2018 · Updated 7 years ago
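- Studies the structure of all Chinese characters to provide a complete solution to character-structure problems in NLP. ☆19 · Apr 7, 2024 · Updated last year
- Nested named entity recognition (Nested NER) ☆20 · Nov 14, 2021 · Updated 4 years ago
- Self-implemented query spelling correction based on pinyin similarity and edit distance. ☆84 · May 20, 2022 · Updated 3 years ago
- 2020 Alibaba Cloud Tianchi Big Data Competition: Traditional Chinese Medicine Named Entity Recognition Challenge ☆27 · Nov 7, 2020 · Updated 5 years ago
- datagrand 2019 information extraction competition, rank 9 ☆130 · Dec 29, 2019 · Updated 6 years ago
- CCKS2019: person relation extraction ☆74 · Jun 2, 2019 · Updated 6 years ago
- An annotated Chinese dataset for RE (Relation Extraction) task. ☆14 · Oct 18, 2018 · Updated 7 years ago
- A chatbot for the finance and legal domains (with some chit-chat capability). Its main modules include information extraction, NLU, NLG, and a knowledge graph, with the front-end display integrated via Django; RESTful interfaces for NLP and KG have already been wrapped. ☆1,293 · Jun 13, 2021 · Updated 4 years ago
- Dataset and Baseline for SMP-MCC2020 ☆23 · Jul 6, 2023 · Updated 2 years ago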
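- CLUEWSC2020: Chinese version of the WSC Winograd Schema Challenge, a Chinese coreference resolution task ☆79 · May 24, 2020 · Updated 5 years ago
- An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices. ☆10 · Dec 3, 2024 · Updated last year
- Hello world demonstration for Weblate ☆14 · Jan 20, 2026 · Updated 2 months ago
- Retrieval-based task-oriented multi-turn dialogue ☆78 · Oct 11, 2020 · Updated 5 years ago
- Chinese text summarization / keyword extraction ☆436 · Dec 28, 2020 · Updated 5 years ago
- ☆101 · Oct 10, 2020 · Updated 5 years ago
- BDCI 2018 automotive industry user opinion topic and sentiment recognition: first-prize solution in the finals ☆431 · Dec 7, 2018 · Updated 7 years ago
- NLP NER datasets video/music/book bio ☆90 · Jan 3, 2021 · Updated 5 years ago
- Training Chinese wiki word, character, and pinyin vectors with Python 3 ☆11 · Jan 2, 2022 · Updated 4 years ago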
- a bert for retrieval and generation ☆859 · Feb 26, 2021 · Updated 5 years ago
- ☆21 · Apr 23, 2019 · Updated 6 years ago
- An off-the-shelf tool for Chinese keyphrase extraction: a fast tool for extracting key phrases from Chinese text, using only 35 MB of memory. www.jionlp.com ☆555 · Nov 21, 2023 · Updated 2 years ago
- CLUENER2020: Chinese fine-grained named entity recognition ☆1,523 · Nov 21, 2022 · Updated 3 years ago
- Useful collection of webrat Textmate snippets meant for use with the RSpec Story and/or Cucumber bundles ☆79 · Aug 7, 2009 · Updated 16 years ago
- Joint learning of machine retrieval and reading; rank 6 solution for the LES Cup: 2nd National "Military Intelligence Machine Reading" Challenge ☆128 · Oct 20, 2020 · Updated 5 years ago
- Consider is a parser for the ThinkGear protocol used by NeuroSky devices (MindSet, BrainBand and others). ☆16 · Apr 3, 2012 · Updated 13 years ago