daac-tools / vibrato
🎤 vibrato: Viterbi-based accelerated tokenizer
☆350 · Updated last month
Alternatives and similar repositories for vibrato:
Users who are interested in vibrato are comparing it to the libraries listed below.
- 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer ☆233 · Updated this week
- Sudachi in Rust 🦀 and new generation of SudachiPy ☆335 · Updated last month
- A lexicon for Sudachi ☆241 · Updated 3 weeks ago
- A multilingual morphological analysis library. ☆433 · Updated last week
- Japanese Morphological Analyzer written in Rust ☆96 · Updated last month
- Sentence boundary disambiguation tool for Japanese texts ☆189 · Updated 10 months ago
- List of companies using Rust in Japan ☆308 · Updated 2 weeks ago
- A language extension for writing novels in Visual Studio Code. ☆405 · Updated 2 weeks ago
- A comparison tool of Japanese tokenizers ☆120 · Updated 8 months ago
- English-Japanese Dictionary data (Public Domain) EJDict-hand ☆204 · Updated last year
- Japanese text normalizer for mecab-neologd ☆278 · Updated 2 weeks ago
- A tool for dividing the Japanese full name into a family name and a given name. ☆245 · Updated 4 months ago
- Repository of Japanese translations of Rust documentation ☆300 · Updated last month
- [2023 edition] Text classification with BERT ☆231 · Updated 8 months ago
- General-purpose Switch transformer based Japanese language model ☆117 · Updated last year
- Japanese word embedding with Sudachi and NWJC 🌿 ☆158 · Updated 11 months ago
- Repository for the NDLOCR application (including source code) ☆368 · Updated last week
- Viterbi-based accelerated tokenizer (Python wrapper) ☆41 · Updated 5 months ago
- 💻 notebookjp - Recommended low-budget laptops for game development and programming 📊 ☆298 · Updated 2 months ago
- MP4 library ☆126 · Updated 2 months ago
- An integrated Japanese analyzer based on foundation models ☆131 · Updated 4 months ago
- Yet another Japanese IME for IBus/Linux ☆217 · Updated last year
- Safe Rust bindings for mecab, a part-of-speech and morphological analyzer library ☆61 · Updated last year
- Neologism dictionary based on the language resources on the Web for mecab-unidic ☆84 · Updated 4 years ago
- Tsurugi - next generation RDB for the new era ☆373 · Updated this week
- Repository for issues and wiki pages about aozorahack in general ☆162 · Updated 9 years ago
- A rule-based analyzer that extracts and normalizes temporal expressions written in natural language ☆138 · Updated last year
- Japanese text8 corpus for word embedding. ☆110 · Updated 7 years ago
- A dialogue corpus built by crawling Open2ch (おーぷん2ちゃんねる) ☆95 · Updated 3 years ago
- Juman++ (a Morphological Analyzer Toolkit) ☆384 · Updated last year