charlesXu86/char_featurizer

charlesXu86 / char_featurizer

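A feature extraction tool for Chinese characters. It extracts phonetic features (initial, final, tone), glyph features (components, radicals), four-corner codes, and other features from a character, and the features can also be fed to a model as tensors.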

☆138

Alternatives and similar repositories for char_featurizer

Users who are interested in char_featurizer are comparing it to the libraries listed below.

howl-anderson / hanzi_char_featurizer
A Chinese character feature extractor (featurizer) that extracts character features (pronunciation features, glyph features) for use as deep learning features.
☆301 | Dec 29, 2025 | Updated 7 months ago
khiajohnson / SpiCE-Corpus
An open-access corpus of conversational bilingual speech in Cantonese and English
☆40 | Apr 28, 2022 | Updated 4 years ago
howl-anderson / hanzi_chaizi
A Chinese character decomposition library that breaks characters down into radicals and components, for use as glyph features in machine learning.
☆423 | Dec 29, 2025 | Updated 7 months ago
kfcd / chaizi
Chinese character decomposition dictionary
☆817 | Jan 8, 2023 | Updated 3 years ago
contr4l / SimilarCharacter
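Compares 6,700 commonly used Chinese characters by pronunciation and shape, and outputs lists of similar-sounding and similar-looking characters. # Similar characters
☆482 | Mar 28, 2024 | Updated 2 years ago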
RegiusQuant / CCKS2020-Entity-Linking
CCKS 2020: entity linking task for Chinese short text
☆43 | Mar 27, 2021 | Updated 5 years ago
AlexYangLi / ccks2019_el
Technical Innovation Award solution for the CCKS 2019 Chinese short-text entity linking competition
☆411 | Mar 24, 2023 | Updated 3 years ago
qingyujean / ssc
A method for computing Chinese string similarity based on a "sound-shape code" (音形码)
☆225 | Jul 24, 2020 | Updated 6 years ago
liuhuanyong / AbstractKnowledgeGraph
AbstractKnowledgeGraph, a systematic knowledge graph that concentrates on abstract things, including abstract entities and actions. An abstract knowledge graph; current scale…
☆248 | Aug 6, 2019 | Updated 6 years ago
zhangyics / Chinese-abbreviation-dataset
This is a corpus of Chinese abbreviations, including negative full forms.
☆198 | Jul 17, 2021 | Updated 5 years ago
tongchangD / text_data_enhancement_with_LaserTagger
Text paraphrasing: Chinese text data augmentation based on the LaserTagger model.
☆320 | Jan 3, 2024 | Updated 2 years ago
charlesXu86 / PAPER-In-CODE
Code reproductions of NLP papers, mainly from top conferences such as ACL, AAAI, and EMNLP.
☆89 | Aug 13, 2022 | Updated 3 years ago
liuhuanyong / ChineseEmbedding
A collection of Chinese NLP embeddings covering five types: character (token), pinyin, word, part-of-speech tag, and dependency-relation embeddings.
☆455 | Dec 15, 2018 | Updated 7 years ago
WTree / chineseStroke
Chinese character stroke library
☆86 | Jan 8, 2021 | Updated 5 years ago
open-chinese / chinese-word-structure
Studies the structure of all Chinese characters to provide a complete solution to character-structure problems in NLP.
☆19 | Apr 7, 2024 | Updated 2 years ago
liuhuanyong / QueryCorrection
Self-implemented query spelling correction based on pinyin similarity and edit distance.
☆83 | May 20, 2022 | Updated 4 years ago
menghuanlater / Tianchi2020ChineseMedicineNER
2020 Alibaba Cloud Tianchi Big Data Competition: Traditional Chinese Medicine Named Entity Recognition Challenge
☆27 | Nov 7, 2020 | Updated 5 years ago
xuyouqian / Bert-Ner-Demo
Nested named entity recognition (Nested NER)
☆19 | Nov 14, 2021 | Updated 4 years ago
lonePatient / daguan_2019_rank9
datagrand 2019 information extraction competition rank9
☆130 | Dec 29, 2019 | Updated 6 years ago
shiningliang / CCKS2019-IPRE
CCKS2019: person relation extraction
☆74 | Jun 2, 2019 | Updated 7 years ago
celtics7 / BaiduBaike
An annotated Chinese dataset for RE (Relation Extraction) task.
☆15 | Oct 18, 2018 | Updated 7 years ago
charlesXu86 / Chatbot_CN
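A chatbot for the finance and legal domains (with some chit-chat capability). Its main modules include information extraction, NLU, NLG, and a knowledge graph. The front end is integrated using Django, and RESTful interfaces for the NLP and KG components have already been wrapped.
☆1,291 | Jun 13, 2021 | Updated 5 years ago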
iseesaw / SMP-MCC2020
Dataset and Baseline for SMP-MCC2020
☆23 | Jul 6, 2023 | Updated 3 years ago
CLUEbenchmark / CLUEWSC2020
CLUEWSC2020: the Chinese version of the Winograd Schema Challenge (WSC), a Chinese coreference resolution task
☆80 | May 24, 2020 | Updated 6 years ago
charlesXu86 / Chatbot_Retrieval
Retrieval-based, task-oriented multi-turn dialogue
☆78 | Oct 11, 2020 | Updated 5 years ago
baijiangliang / year2018
Annual report for programmers.
☆21 | Jan 3, 2019 | Updated 7 years ago
deadshot465 / novelcrafter-mcp
An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices.
☆11 | Dec 3, 2024 | Updated last year
ArtistScript / FastTextRank
Chinese text summarization and keyword extraction
☆436 | Dec 28, 2020 | Updated 5 years ago
bojone / nezha_gpt_dialog
☆101 | Oct 10, 2020 | Updated 5 years ago
LG-1 / video_music_book_datasets
NLP NER datasets video/music/book bio
☆90 | Jan 3, 2021 | Updated 5 years ago
yilifzf / BDCI_Car_2018
First-prize solution in the finals of BDCI 2018: topic and sentiment recognition of user opinions in the automotive industry
☆430 | Dec 7, 2018 | Updated 7 years ago
taishan1994 / python3_wiki_word2vec
Training word, character, and pinyin embeddings on Chinese Wikipedia with Python 3
☆11 | Jan 2, 2022 | Updated 4 years ago
hmnth1 / table_ocr
☆13 | Oct 1, 2020 | Updated 5 years ago
ZhuiyiTechnology / simbert
a bert for retrieval and generation
☆860 | Feb 26, 2021 | Updated 5 years ago
jonashao / written_judgement
Extracts monetary items and amounts from written court judgments.
☆11 | Apr 8, 2016 | Updated 10 years ago
WeblateOrg / hello
Hello world demonstration for Weblate
☆15 | Jan 20, 2026 | Updated 6 months ago
vale-cli / SubVale
A Sublime Text 3 client for Vale Server.
☆13 | Dec 7, 2020 | Updated 5 years ago
caishiqing / joint-mrc
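Joint learning of machine retrieval and reading; rank-6 solution for the Laisi Cup (莱斯杯), the 2nd National "Military Intelligent Machine Reading" Challenge
☆128 | Oct 20, 2020 | Updated 5 years ago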
dongrixinyu / chinese_keyphrase_extractor
An off-the-shelf tool for fast Chinese keyphrase extraction that uses only 35 MB of memory. www.jionlp.com
☆554 | Nov 21, 2023 | Updated 2 years ago