WorksApplications/uzushio

☆24

Alternatives and similar repositories for uzushio

Users who are interested in uzushio also compare it to the repositories listed below.

nobu-g / cohesion-analysis
Code for COLING 2020 Paper
☆13 · Feb 3, 2026 · Updated 5 months ago
WorksApplications / SudachiTra
Japanese tokenizer for Transformers
☆80 · Dec 15, 2023 · Updated 2 years ago
megagonlabs / asdc
Accommodation Search Dialog Corpus (宿泊施設探索対話コーパス)
☆25 · Jan 19, 2024 · Updated 2 years ago
llm-jp / llm-jp-sft
☆62 · Jun 13, 2024 · Updated 2 years ago
singletongue / wikipedia-utils
Utility scripts for preprocessing Wikipedia texts for NLP
☆78 · Apr 9, 2024 · Updated 2 years ago
HojiChar / HojiChar
The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.
☆128 · Updated this week
osekilab / JCoLA
☆19 · Apr 21, 2026 · Updated 3 months ago
neuml / magnitude
Magnitude fork that only supports Word2Vec, GloVe and fastText embeddings
☆13 · Aug 11, 2020 · Updated 5 years ago
megagonlabs / jrte-corpus
Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)
☆77 · Jun 23, 2023 · Updated 3 years ago
colorfulscoop / sbert-ja
Code to train Sentence BERT Japanese model for Hugging Face Model Hub
☆11 · Aug 8, 2021 · Updated 4 years ago
lighttransport / japanese-llama-experiment
Japanese LLaMa experiment
☆54 · Dec 27, 2025 · Updated 6 months ago
Taishi-N324 / Drop-Upcycling
[ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
☆24 · Oct 5, 2025 · Updated 9 months ago
DaisukeBekki / JSeM
Japanese semantic test suite (FraCaS counterpart and extensions)
☆13 · Apr 21, 2026 · Updated 3 months ago
inspection-ai / japanese-toxic-dataset
☆22 · Jan 11, 2023 · Updated 3 years ago
WorksApplications / ViSudachi
A tool for visualizing the internal structures of morphological analyzer Sudachi
☆18 · Jun 9, 2022 · Updated 4 years ago
t-sagara / Japanese-Address-testdata
Test dataset of Japanese addresses that are difficult to parse
☆14 · Sep 25, 2023 · Updated 2 years ago
kenta1984 / wrd
☆23 · Sep 18, 2020 · Updated 5 years ago
yahoojapan / VFD-Dataset
☆11 · Nov 10, 2020 · Updated 5 years ago
kzinmr / transformers_ner_ja
Japanese NER with Transformers + PyTorch-Lightning + MLflow Tracking
☆15 · Nov 20, 2022 · Updated 3 years ago
stockmarkteam / ner-wikipedia-dataset
Japanese named entity recognition dataset built from Wikipedia
☆143 · Sep 2, 2023 · Updated 2 years ago
msnoigrs / gosudachi
Go porting of Sudachi
☆33 · Feb 13, 2022 · Updated 4 years ago
nu-dialogue / jmultiwoz
JMultiWOZ: A Large-Scale Japanese Multi-Domain Task-Oriented Dialogue Dataset, LREC-COLING 2024
☆25 · Mar 27, 2024 · Updated 2 years ago
ku-nlp / AnnotatedFKCCorpus
Annotated Fuman Kaitori Center Corpus
☆18 · Dec 18, 2023 · Updated 2 years ago
alinear-corp / albert-japanese
BERT with SentencePiece for Japanese text.
☆33 · Oct 28, 2021 · Updated 4 years ago
megagonlabs / bunkai
Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
☆200 · Mar 26, 2024 · Updated 2 years ago
himkt / awesome-bert-japanese
📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information
☆132 · Mar 15, 2023 · Updated 3 years ago
ueda-keisuke / CC-CEDICT-MeCab
CC-CEDICT-MeCab is a MeCab dictionary for Chinese (Mandarin) text segmentation
☆13 · Apr 9, 2020 · Updated 6 years ago
megagonlabs / ebe-dataset
Evidence-based Explanation Dataset (AACL-IJCNLP 2020)
☆18 · Dec 17, 2020 · Updated 5 years ago
aiishii / JEMHopQA
☆30 · Apr 10, 2025 · Updated last year
yahoojapan / JGLUE
JGLUE: Japanese General Language Understanding Evaluation
☆346 · Mar 31, 2025 · Updated last year
Yuki-Tanaka-33937424 / kaggle-Shopee-Price-Match-Guarantee
Repository for the Kaggle Shopee competition
☆11 · Jun 7, 2021 · Updated 5 years ago
daac-tools / vaporetto
🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
☆295 · Updated this week
yuzu-ai / japanese-llm-ranking
☆50 · Apr 10, 2024 · Updated 2 years ago
llm-jp / llm-jp-eval
☆164 · Updated this week
megagonlabs / instruction_ja
Japanese instruction data (日本語指示データ)
☆24 · Jul 13, 2023 · Updated 3 years ago
hmiyado / four-keys
A CLI tool to measure four keys metrics and analyze development performance
☆19 · Jul 1, 2026 · Updated 2 weeks ago
conditional / jawikify
Software for wikification of Japanese text
☆17 · Mar 14, 2017 · Updated 9 years ago
r9y9 / open_jtalk
A fork of open_jtalk
☆71 · Mar 31, 2025 · Updated last year
utanaka2000 / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆25 · Mar 16, 2021 · Updated 5 years ago