daac-tools / python-vaporetto
🛥 Vaporetto is a fast and lightweight tokenizer based on pointwise prediction. This is a Python wrapper for Vaporetto.
☆21 Updated 2 months ago
Related projects
Alternatives and complementary repositories for python-vaporetto
- Implementations of data augmentation for Japanese NLP. ☆64 Updated last year
- Japanese synonym library ☆52 Updated 2 years ago
- Funer is a rule-based named entity recognition tool. ☆22 Updated 2 years ago
- Repository for JSICK ☆44 Updated last year
- Japanese tokenizer for Transformers ☆78 Updated 11 months ago
- ☆50 Updated last year
- Code for COLING 2020 Paper ☆13 Updated 2 weeks ago
- Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020) ☆76 Updated last year
- ☆24 Updated 2 weeks ago
- Evaluation dataset for the honorific (keigo) conversion task ☆20 Updated 2 years ago
- ☆36 Updated 3 years ago
- Sentence Embeddings with BERT & XLNet ☆32 Updated last year
- Training and evaluation scripts for JGLUE, a Japanese language understanding benchmark ☆17 Updated 2 weeks ago
- Japanese-BPEEncoder ☆39 Updated 3 years ago
- Finding all pairs of similar documents time- and memory-efficiently ☆58 Updated 2 years ago
- ☆13 Updated 2 years ago
- ☆16 Updated 3 years ago
- ☆34 Updated 5 years ago
- ☆28 Updated 2 years ago
- Japanese named entity recognition dataset built from Wikipedia ☆132 Updated last year
- Japanese data from the Google UDT 2.0. ☆28 Updated last year
- Japanese entity resolution (name matching) dataset created from Wikipedia ☆34 Updated 4 years ago
- ☆18 Updated 6 months ago
- Japanese T5 model ☆113 Updated 2 months ago
- Japanese BERT Pretrained Model ☆22 Updated 3 years ago
- Exploring Japanese SimCSE ☆62 Updated last year
- ☆30 Updated 6 years ago
- [2024 edition] Text classification with BERT ☆24 Updated 4 months ago
- Flexible evaluation tool for language models ☆36 Updated this week
- MeCab + NEologd + Docker + Python3 ☆35 Updated 2 years ago