nlp-waseda / Kanbun-LM
Code for paper "Kanbun-LM: Reading and Translating Classical Chinese in Japanese Method by Language Models"
☆16Updated last year
Related projects
Alternatives and complementary repositories for Kanbun-LM
- Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa models for Japanese and other languages☆48Updated last month
- Unidic packaged for installation via pip.☆79Updated last year
- ☆28Updated last week
- A powerful text cleaner for Japanese web texts☆12Updated 10 months ago
- A small version of UniDic for easy pip installs.☆39Updated 4 years ago
- Tokenizer POS-tagger Lemmatizer and Dependency-parser for modern and contemporary Japanese with BERT models☆17Updated 4 months ago
- Yet another Python binding for Juman++/KNP/KWJA☆31Updated last month
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese☆62Updated 3 weeks ago
- A library for semantic similarity search☆23Updated 2 months ago
- Kyoto University Text Corpus☆59Updated last year
- The Business Scene Dialogue corpus☆68Updated 3 years ago
- Hanja Understanding Evaluation Dataset☆13Updated 2 years ago
- Trials of pre-trained BERT models for the medical domain in Japanese.☆12Updated 4 years ago
- Swallow Project: evaluation scripts for large language models☆10Updated 4 months ago
- Japanese grammatical error correction tool☆27Updated 2 years ago
- Classical Chinese to Modern Japanese Translator☆25Updated 10 months ago
- Scripts for creating a Japanese-English parallel corpus and training NMT models☆15Updated 3 years ago
- COMET-ATOMIC ja☆28Updated 8 months ago
- ☆46Updated last year
- Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment☆36Updated last year
- Utility scripts for preprocessing Wikipedia texts for NLP☆76Updated 7 months ago
- DIRECT: Direct and Indirect REsponses in Conversational Text Corpus☆16Updated 3 years ago
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese☆13Updated last year
- ☆21Updated last week
- ☆25Updated 5 months ago
- Codes to pre-train Japanese T5 models☆40Updated 3 years ago
- Japanese Massive Multitask Language Understanding Benchmark☆25Updated 8 months ago
- Tokenizer POS-tagger Lemmatizer and Dependency-parser for modern and contemporary Japanese☆34Updated this week
- Japanese data from the Google UDT 2.0.☆36Updated last week
- You can create datasets from Wikia/Wikipedia that can be used for entity recognition and Entity Linking. Dumps for ja-wiki and VTuber-wik…☆17Updated 3 years ago