IBM / MAX-Chinese-Phonetic-Similarity-Estimator
Estimate the phonetic distance between Chinese words and get similar sounding candidate words.
☆35Updated last year
Related projects
Alternatives and complementary repositories for MAX-Chinese-Phonetic-Similarity-Estimator
- Chinese sentence segmentation and punctuation restoration, implemented in PyTorch 1.0.☆55Updated 5 years ago
- Use BERT to predict punctuation on IWSLT2012 and The People's Daily 2014☆65Updated 4 years ago
- An open-access corpus of conversational bilingual speech in Cantonese and English☆40Updated 2 years ago
- A Bert-CNN-LSTM model for punctuation restoration☆55Updated last year
- Chinese word segmentation (tokenizer) software benchmark☆23Updated 6 years ago
- Pinyin-to-Hanzi conversion: convert pinyin to Chinese characters using deep networks☆22Updated 4 years ago
- g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese☆238Updated 5 years ago
- A PyTorch implementation of a punctuation prediction system using (B)LSTM, which automatically adds suitable punctuation into text withou…☆61Updated 4 years ago
- Chinese text normalization.☆48Updated 3 years ago
- Python | Efficient use of the statistical language model kenlm: new word discovery, word segmentation, intelligent error correction, and more☆162Updated 5 years ago
- A Python module that converts written Chinese strings to their spoken form.☆118Updated 5 years ago
- soft_mask_bert model for Chinese Spelling Correction in Keras☆21Updated 4 years ago
- People's Daily annotated Chinese corpus, January to April 1998☆29Updated 6 years ago
- Chinese word correction: given misspelled input words, returns candidate corrections☆9Updated 9 years ago
- An MLM-based pretrained BERT model for pinyin-to-Chinese-character conversion with error correction (pinyin corrector), implemented in PyTorch☆44Updated 3 years ago
- Mirror of SRILM☆53Updated 4 years ago
- 10th-place solution on the Test B leaderboard, BLEU 32.1☆63Updated 4 years ago
- seq2seq PinYin to Chinese translator☆10Updated 6 years ago
- Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition☆18Updated 2 years ago
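- A Chinese character feature extraction tool that extracts pronunciation (initial, final, tone), glyph (components, radical), four-corner code, and other features, which can also be fed to a model as tensors☆128Updated 4 years ago
- A PyTorch-based LSTM Punctuation Restoration Implementation/A Simple Tutorial for Learning PyTorch and NLP☆24Updated 3 years ago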
- repo for Tibetan corpora☆21Updated last year
- ODSQA: Open-Domain Spoken Question Answering Dataset☆58Updated 2 years ago
- DistilBERT for Chinese: a distilled BERT model pretrained on large-scale Chinese data☆90Updated 4 years ago