ku-nlp / text-cleaning
A powerful text cleaner for Japanese web texts
☆12Jan 20, 2024Updated 2 years ago
Alternatives and similar repositories for text-cleaning
Users who are interested in text-cleaning are comparing it to the libraries listed below.
- A centrality-based keyphrase extraction tool for Chinese☆11Sep 2, 2022Updated 3 years ago
- NAACL'2021: Non-Parametric Few-Shot Learning for Word Sense Disambiguation☆10Jul 1, 2021Updated 4 years ago
- Yet another Python binding for Juman++/KNP/KWJA☆37Feb 2, 2026Updated last week
- 🧨 Japanese Sentence Breaker 🧨☆14Jun 6, 2021Updated 4 years ago
- ☆33Jul 31, 2024Updated last year
- Use custom tokenizers in spacy-transformers☆16Aug 9, 2022Updated 3 years ago
- A Japanese dependency parser based on BERT☆23Oct 26, 2022Updated 3 years ago
- Annotated Fuman Kaitori Center Corpus☆18Dec 18, 2023Updated 2 years ago
- Kyoto University Web Document Leads Corpus☆83Dec 18, 2023Updated 2 years ago
- Rakuten MA (Python version)☆23May 22, 2017Updated 8 years ago
- Yet another sentence-level tokenizer for Japanese text☆24Nov 27, 2025Updated 2 months ago
- ☆29Apr 10, 2025Updated 10 months ago
- COMET-ATOMIC ja☆31Mar 8, 2024Updated last year
- Bluetooth plugin for Flutter☆10Dec 19, 2022Updated 3 years ago
- Japanese Massive Multitask Language Understanding Benchmark☆38Oct 7, 2025Updated 4 months ago
- Chat with your data while uploading a pdf file and using a local LLM.☆11Mar 19, 2024Updated last year
- The framework for creating a new platform (like game engine).☆10Jan 11, 2026Updated last month
- ISDB-S3 fork☆10Dec 13, 2024Updated last year
- A collection of github workflow patterns☆10Feb 1, 2024Updated 2 years ago
- Automatically completes every step required for Windows 10 kitting (device setup) using PowerShell.☆12Jun 10, 2025Updated 8 months ago
- OPI5 open micro desk design.☆13Mar 6, 2023Updated 2 years ago
- Tokenizer, POS-tagger, lemmatizer, and dependency parser for modern and contemporary Japanese☆38Dec 29, 2025Updated last month
- The speech synthesis engine of VOICEVOX, a free, medium-quality text-to-speech software☆10Jan 30, 2023Updated 3 years ago
- A collection of build scripts for personal use☆10Aug 27, 2025Updated 5 months ago
- ☆11Oct 31, 2021Updated 4 years ago
- A TV viewing and recording environment with Mirakurun + EDCB + KonomiTV, built with Docker☆15Jan 18, 2026Updated 3 weeks ago
- TOTP (Time-based One-Time Password) authentication for Django REST Framework.☆13Feb 5, 2026Updated last week
- A library for evaluation of Grammatical Error Correction (GEC). Accepted to ACL'25 Demo: "gec-metrics: A Unified Library for Grammatical …☆14Jan 25, 2026Updated 2 weeks ago
- ☆10Aug 27, 2025Updated 5 months ago
- ATSC 3.0 to MPEG-2 TS Converter☆21Sep 11, 2025Updated 5 months ago
- ☆10Jun 24, 2022Updated 3 years ago
- MPEG-2 TS packet check☆12Jun 3, 2024Updated last year
- Support page for "Computational Modeling of Behavioral Data" (行動データの計算論モデリング).☆11Mar 1, 2021Updated 4 years ago
- A Japanese-language version of the alpaca dataset☆86Jun 3, 2023Updated 2 years ago
- Code for evaluating Japanese pretrained models provided by NTT Ltd.☆245Jun 21, 2023Updated 2 years ago
- my plugins, macros, ... for Glyphs.App☆10Nov 4, 2025Updated 3 months ago
- Voice synthesis library for Text-to-Speech applications (currently an HTS Engine rewrite in Rust)☆13Feb 8, 2026Updated last week
- ☆10Dec 30, 2025Updated last month
- This tool automatically takes screenshots of your Kindle screen and saves them as files.☆15Feb 23, 2025Updated 11 months ago