ku-nlp/KWDLC

ku-nlp / KWDLC

Kyoto University Web Document Leads Corpus

☆84

Alternatives and similar repositories for KWDLC

Users that are interested in KWDLC are comparing it to the libraries listed below.

ku-nlp / AnnotatedFKCCorpus
Annotated Fuman Kaitori Center Corpus
☆18 · Dec 18, 2023 · Updated 2 years ago
ku-nlp / KyotoCorpus
Kyoto University Text Corpus
☆71 · Jul 14, 2023 · Updated 3 years ago
ku-nlp / kwja
An integrated Japanese analyzer based on foundation models
☆145 · Updated this week
jojonki / Taiyaki
A Japanese morphological analysis engine written in Python and Cython 🚧
☆13 · Dec 4, 2019 · Updated 6 years ago
ku-nlp / kyoto-reader
A processor for KyotoCorpus, KWDLC, and AnnotatedFKCCorpus
☆10 · Jun 26, 2024 · Updated 2 years ago
ku-nlp / bertknp
A Japanese dependency parser based on BERT
☆23 · Oct 26, 2022 · Updated 3 years ago
megagonlabs / bunkai
Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
☆200 · Mar 26, 2024 · Updated 2 years ago
ku-nlp / knp
A Japanese Parser
☆34 · Nov 1, 2023 · Updated 2 years ago
skozawa / Comainu
COrpus based Morphological Analyzer with INtegrated User dictionary
☆21 · Mar 30, 2025 · Updated last year
megagonlabs / ginza-transformers
Use custom tokenizers in spacy-transformers
☆16 · Aug 9, 2022 · Updated 3 years ago
megagonlabs / jrte-corpus
Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)
☆77 · Jun 23, 2023 · Updated 3 years ago
ku-nlp / text-cleaning
A powerful text cleaner for Japanese web texts
☆12 · Jan 20, 2024 · Updated 2 years ago
ku-nlp / pyknp
A Python Module for JUMAN++/KNP
☆93 · Jan 8, 2026 · Updated 6 months ago
osekilab / JCoLA
☆19 · Apr 21, 2026 · Updated 3 months ago
1never / open2ch-dialogue-corpus
A dialogue corpus created by crawling Open2ch (おーぷん2ちゃんねる)
☆101 · Jun 6, 2021 · Updated 5 years ago
nandenjin / itfdic
A localized word dictionary asset for University of Tsukuba
☆12 · Sep 19, 2025 · Updated 10 months ago
tatHi / optok
☆10 · Aug 26, 2021 · Updated 4 years ago
nobu-g / cohesion-analysis
Code for COLING 2020 Paper
☆13 · Feb 3, 2026 · Updated 5 months ago
yahoojapan / JGLUE
JGLUE: Japanese General Language Understanding Evaluation
☆346 · Mar 31, 2025 · Updated last year
himkt / pyner
🌈 Implementation of Neural Network based Named Entity Recognizer (Lample+, 2016) using Chainer.
☆45 · Dec 8, 2022 · Updated 3 years ago
yagays / nayose-wikipedia-ja
A Japanese name-matching (nayose) dataset built from Wikipedia
☆35 · Mar 10, 2020 · Updated 6 years ago
mjstrobl / WEXEA
Wikipedia EXhaustive Entity Annotator (LREC 2020)
☆16 · Apr 22, 2024 · Updated 2 years ago
hppRC / japanese-sentence-breaker
🧨 Japanese Sentence Breaker 🧨
☆14 · Jun 6, 2021 · Updated 5 years ago
chakki-works / chariot
Deliver the ready-to-train data to your NLP model.
☆123 · Jul 15, 2022 · Updated 4 years ago
tmu-nlp / JapaneseWordSimilarityDataset
Japanese Word Similarity Dataset
☆103 · Dec 7, 2021 · Updated 4 years ago
megagonlabs / ebe-dataset
Evidence-based Explanation Dataset (AACL-IJCNLP 2020)
☆18 · Dec 17, 2020 · Updated 5 years ago
UniversalDependencies / UD_Japanese-GSD
Japanese data from the Google UDT 2.0.
☆40 · May 6, 2026 · Updated 2 months ago
masayu-a / NAIST-JENE
☆10 · Aug 13, 2012 · Updated 13 years ago
ku-nlp / jumanpp
Juman++ (a Morphological Analyzer Toolkit)
☆414 · Apr 17, 2026 · Updated 3 months ago
shyyhs / CourseraParallelCorpusMining
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
☆15 · Aug 27, 2024 · Updated last year
musyoku / unsupervised-pos-tagging
Unsupervised part-of-speech tag estimation
☆16 · Mar 22, 2018 · Updated 8 years ago
teaspn / teaspn-server
A sample implementation of the TEASPN server
☆18 · Oct 31, 2019 · Updated 6 years ago
taishi-i / nagisa
A Japanese tokenizer based on recurrent neural networks
☆418 · Jul 6, 2026 · Updated 2 weeks ago
kuribayashi4 / span_based_argumentation_parser
☆11 · Feb 2, 2023 · Updated 3 years ago
WorksApplications / SudachiTra
Japanese tokenizer for Transformers
☆80 · Dec 15, 2023 · Updated 2 years ago
kenkov / cabocha
CaboCha wrapper for Python3
☆46 · Jul 5, 2018 · Updated 8 years ago
hppRC / defsent
DefSent: Sentence Embeddings using Definition Sentences
☆23 · Aug 5, 2021 · Updated 4 years ago
himkt / awesome-bert-japanese
📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information
☆132 · Mar 15, 2023 · Updated 3 years ago
verypluming / JaNLI
☆17 · May 31, 2023 · Updated 3 years ago