UniversalDependencies/UD_Japanese-GSD

Japanese data from the Google UDT 2.0.

☆40

Alternatives and similar repositories for UD_Japanese-GSD

Users who are interested in UD_Japanese-GSD are comparing it to the libraries listed below.

masayu-a / NAIST-JENE
☆10 · Aug 13, 2012 · Updated 13 years ago
ryanmcd / uni-dep-tb
A set of treebanks for multiple languages annotated in basic Stanford-style dependencies.
☆68 · Aug 29, 2019 · Updated 6 years ago
KoichiYasuoka / UniDic2UD
Tokenizer POS-tagger Lemmatizer and Dependency-parser for modern and contemporary Japanese
☆38 · Dec 29, 2025 · Updated 6 months ago
megagonlabs / UD_Japanese-GSD
Japanese data from the Google UDT 2.0.
☆28 · Mar 24, 2023 · Updated 3 years ago
megagonlabs / jrte-corpus
Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)
☆77 · Jun 23, 2023 · Updated 3 years ago
himkt / awesome-bert-japanese
📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information
☆132 · Mar 15, 2023 · Updated 3 years ago
ku-nlp / KWDLC
Kyoto University Web Document Leads Corpus
☆84 · Dec 18, 2023 · Updated 2 years ago
akirakubo / bert-japanese-aozora
Japanese BERT trained on Aozora Bunko and Wikipedia, pre-tokenized by MeCab with UniDic & SudachiPy
☆40 · Aug 8, 2020 · Updated 5 years ago
musyoku / hpylm
C++ implementation of HPYLM
☆11 · May 2, 2017 · Updated 9 years ago
megagonlabs / ginza-transformers
Use custom tokenizers in spacy-transformers
☆16 · Aug 9, 2022 · Updated 3 years ago
cl-tohoku / bert-japanese
BERT models for Japanese text.
☆550 · Mar 23, 2024 · Updated 2 years ago
anqafalak / japkatsuyou
A library that helps conjugate Japanese verbs. This repo contains two Qt projects: libjpconj which is the library, and jpconj implements …
☆12 · Aug 29, 2017 · Updated 8 years ago
ku-nlp / AnnotatedFKCCorpus
Annotated Fuman Kaitori Center Corpus
☆18 · Dec 18, 2023 · Updated 2 years ago
jqk09a / japanese-daily-dialogue
☆59 · Mar 17, 2023 · Updated 3 years ago
mattiasarro / confr
Configuration system geared towards Python ML projects
☆11 · Apr 30, 2023 · Updated 3 years ago
octanove / shiba
PyTorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer.
☆89 · Nov 3, 2023 · Updated 2 years ago
ikegami-yukino / zunda-python
Zunda: Japanese Enhanced Modality Analyzer client for Python.
☆10 · Nov 30, 2019 · Updated 6 years ago
leia-llm / leia
LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
☆23 · Apr 24, 2024 · Updated 2 years ago
WorksApplications / SudachiTra
Japanese tokenizer for Transformers
☆81 · Dec 15, 2023 · Updated 2 years ago
taishi-i / nagisa
A Japanese tokenizer based on recurrent neural networks
☆418 · Jul 6, 2026 · Updated 2 weeks ago
yagays / pytorch_bert_japanese
☆35 · Aug 20, 2020 · Updated 5 years ago
mokejp / holidays_jp
A Python library for calculating Japanese public holidays
☆15 · Jul 25, 2022 · Updated 4 years ago
kajyuuen / daaja
This repository has implementations of data augmentation for NLP for Japanese.
☆64 · Feb 16, 2023 · Updated 3 years ago
WorksApplications / chiVe
Japanese word embedding with Sudachi and NWJC 🌿
☆177 · Mar 1, 2024 · Updated 2 years ago
ku-nlp / kwja
An integrated Japanese analyzer based on foundation models
☆145 · Jul 18, 2026 · Updated last week
taishi-i / toiro
A tool for comparing tokenizers
☆122 · Nov 9, 2025 · Updated 8 months ago
yoheikikuta / bert-japanese
BERT with SentencePiece for Japanese text.
☆498 · Feb 15, 2021 · Updated 5 years ago
hottolink / hottoSNS-bert
hottoSNS-BERT: a distributed sentence representation model trained on a large-scale SNS corpus
☆62 · Jan 22, 2026 · Updated 6 months ago
yagays / nayose-wikipedia-ja
A Japanese name-matching (record linkage) dataset built from Wikipedia
☆35 · Mar 10, 2020 · Updated 6 years ago
megagonlabs / bunkai
Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
☆200 · Mar 26, 2024 · Updated 2 years ago
shyyhs / CourseraParallelCorpusMining
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15 · Aug 27, 2024 · Updated last year
tsuruoka-lab / BSD
The Business Scene Dialogue corpus
☆75 · Nov 10, 2021 · Updated 4 years ago
ou-medinfo / medbertjp
Trials of pre-trained BERT models for the medical domain in Japanese.
☆13 · Nov 21, 2020 · Updated 5 years ago
Katsumata420 / wikihow_japanese
☆35 · Dec 17, 2020 · Updated 5 years ago
soskek / chainer-openai-transformer-lm
A Chainer implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI
☆28 · Jun 20, 2018 · Updated 8 years ago
viswavi / languageid
Identifying the language of input text using character-level n-grams, with support for 45 languages
☆11 · Dec 26, 2022 · Updated 3 years ago
t-sagara / Japanese-Address-testdata
A test dataset of Japanese addresses that are difficult to parse
☆14 · Sep 25, 2023 · Updated 2 years ago
WorksApplications / SudachiPy
Python version of Sudachi, a Japanese tokenizer.
☆442 · Oct 7, 2022 · Updated 3 years ago
aiishii / JEMHopQA
☆30 · Apr 10, 2025 · Updated last year