Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
☆199Mar 26, 2024Updated 2 years ago
Alternatives and similar repositories for bunkai
Users that are interested in bunkai are comparing it to the libraries listed below.
- Japanese synonym library☆55Feb 7, 2022Updated 4 years ago
- 📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information☆132Mar 15, 2023Updated 3 years ago
- A tool for visualizing the internal structures of morphological analyzer Sudachi☆18Jun 9, 2022Updated 3 years ago
- 🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.☆261May 19, 2026Updated last week
- Japanese entity resolution (name aggregation) dataset created from Wikipedia☆35Mar 10, 2020Updated 6 years ago
- A tool for comparing tokenizers☆123Nov 9, 2025Updated 6 months ago
- Japanese sentence segmentation library for Python☆74Apr 3, 2023Updated 3 years ago
- Kyoto University Web Document Leads Corpus☆84Dec 18, 2023Updated 2 years ago
- Japanese tokenizer for Transformers☆79Dec 15, 2023Updated 2 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP☆78Apr 9, 2024Updated 2 years ago
- Funer is a rule-based named entity recognition tool.☆22Apr 21, 2022Updated 4 years ago
- Code for COLING 2020 Paper☆13Feb 3, 2026Updated 3 months ago
- This repository has implementations of data augmentation for NLP for Japanese.☆64Feb 16, 2023Updated 3 years ago
- A Japanese NLP Library using spaCy as framework based on Universal Dependencies☆851Mar 30, 2024Updated 2 years ago
- JaQuAD: Japanese Question Answering Dataset for Machine Reading Comprehension (2022, Skelter Labs)☆108Mar 2, 2022Updated 4 years ago
- JGLUE: Japanese General Language Understanding Evaluation☆342Mar 31, 2025Updated last year
- Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)☆77Jun 23, 2023Updated 2 years ago
- ☆99Jul 23, 2023Updated 2 years ago
- 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer☆288Feb 7, 2026Updated 3 months ago
- The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.☆126Apr 10, 2026Updated last month
- PyTorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer.☆89Nov 3, 2023Updated 2 years ago
- A Japanese morphological analysis engine written in Python and Cython 🚧☆13Dec 4, 2019Updated 6 years ago
- Japanese Word Similarity Dataset☆103Dec 7, 2021Updated 4 years ago
- ☆30Apr 10, 2025Updated last year
- Japanese CLIP model☆13Sep 15, 2025Updated 8 months ago
- https://www.nlp.ecei.tohoku.ac.jp/projects/aio/☆16Aug 4, 2022Updated 3 years ago
- Annotated Fuman Kaitori Center Corpus☆18Dec 18, 2023Updated 2 years ago
- A lexicon for Sudachi☆296Apr 30, 2026Updated 3 weeks ago
- Code for evaluating Japanese pretrained models provided by NTT Ltd.☆246Jun 21, 2023Updated 2 years ago
- Yet another sentence-level tokenizer for Japanese text