yagays/nayose-wikipedia-ja

A Japanese entity name normalization (nayose) dataset built from Wikipedia

☆35

Alternatives and similar repositories for nayose-wikipedia-ja

Users who are interested in nayose-wikipedia-ja are comparing it to the repositories listed below.

nandenjin / itfdic
A localized word dictionary asset for University of Tsukuba
☆12 · Sep 19, 2025 · Updated 10 months ago
ujiuji1259 / shinra-attribute-extraction
☆11 · Sep 7, 2021 · Updated 4 years ago
ikegami-yukino / zunda-python
Zunda: Japanese Enhanced Modality Analyzer client for Python.
☆10 · Nov 30, 2019 · Updated 6 years ago
masayu-a / NAIST-JENE
☆10 · Aug 13, 2012 · Updated 13 years ago
ikegami-yukino / sengiri
Yet another sentence-level tokenizer for Japanese text
☆24 · Nov 27, 2025 · Updated 8 months ago
chakki-works / Japanese-Company-Lexicon
☆99 · Jul 23, 2023 · Updated 3 years ago
SkelterLabsInc / JaQuAD
JaQuAD: Japanese Question Answering Dataset for Machine Reading Comprehension (2022, Skelter Labs)
☆111 · Mar 2, 2022 · Updated 4 years ago
sonoisa / clip-japanese
Japanese CLIP model
☆13 · Sep 15, 2025 · Updated 10 months ago
yagays / embedrank
Python Implementation of EmbedRank
☆48 · Mar 19, 2019 · Updated 7 years ago
conditional / jawikify
Software for wikification of Japanese text
☆17 · Mar 14, 2017 · Updated 9 years ago
ikegami-yukino / asa-python
Japanese Argument Structure Analyzer (ASA) client for Python
☆11 · Feb 16, 2019 · Updated 7 years ago
hiroshi-manabe / CRFSegmenter
A multi-language segmenter using high-order CRF.
☆17 · Feb 27, 2020 · Updated 6 years ago
megagonlabs / ginza-transformers
Use custom tokenizers in spacy-transformers
☆16 · Aug 9, 2022 · Updated 3 years ago
megagonlabs / ebe-dataset
Evidence-based Explanation Dataset (AACL-IJCNLP 2020)
☆18 · Dec 17, 2020 · Updated 5 years ago
chemicaltree / tetra
☆10 · Sep 14, 2022 · Updated 3 years ago
megagonlabs / bunkai
Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)
☆200 · Mar 26, 2024 · Updated 2 years ago
yagays / alacarte_embedding
Python implementation of A La Carte Embedding
☆10 · Dec 7, 2018 · Updated 7 years ago
wwwcojp / ja_sentence_segmenter
Japanese sentence segmentation library for Python
☆76 · Updated this week
BandaiNamcoResearchInc / DistilBERT-base-jp
☆161 · Oct 19, 2020 · Updated 5 years ago
nobu-g / cohesion-analysis
Code for COLING 2020 Paper
☆13 · Feb 3, 2026 · Updated 5 months ago
tatHi / optok
☆10 · Aug 26, 2021 · Updated 4 years ago
hiroki13 / neural-pasa-system
☆13 · Apr 23, 2017 · Updated 9 years ago
megagonlabs / jrte-corpus
Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)
☆77 · Jun 23, 2023 · Updated 3 years ago
osuossu8 / CommonLitReadabilityPrize
☆14 · Aug 3, 2021 · Updated 4 years ago
PKSHATechnology-Research / camphr
Camphr - NLP library for creating pipeline components
☆336 · Dec 9, 2022 · Updated 3 years ago
taishi-i / toiro
A tool for comparing tokenizers
☆122 · Nov 9, 2025 · Updated 8 months ago
megagonlabs / UD_Japanese-GSD
Japanese data from the Google UDT 2.0.
☆28 · Mar 24, 2023 · Updated 3 years ago
akirakubo / bert-japanese-aozora
Japanese BERT trained on Aozora Bunko and Wikipedia, pre-tokenized by MeCab with UniDic & SudachiPy
☆40 · Aug 8, 2020 · Updated 5 years ago
himkt / awesome-bert-japanese
📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information
☆132 · Mar 15, 2023 · Updated 3 years ago
katryo / wordnet_python
A wrapper for working with the Japanese WordNet in Python
☆26 · Jan 20, 2014 · Updated 12 years ago
jojonki / Taiyaki
A Japanese morphological analysis engine written in Python and Cython 🚧
☆13 · Dec 4, 2019 · Updated 6 years ago
kajyuuen / daaja
This repository has implementations of data augmentation for NLP for Japanese.
☆64 · Feb 16, 2023 · Updated 3 years ago
kivantium / rnn-twitter
☆12 · Jan 21, 2017 · Updated 9 years ago
nocotan / chainer-examples
Implementation examples using Chainer.
☆11 · Apr 7, 2018 · Updated 8 years ago
arosh / BM25Transformer
(Python) transform a document-term matrix to an Okapi/BM25 representation
☆54 · Apr 17, 2018 · Updated 8 years ago
tanreinama / RoBERTa-japanese
Japanese BERT Pretrained Model
☆23 · Nov 13, 2021 · Updated 4 years ago
yagays / swem
Python implementation of SWEM (Simple Word-Embedding-based Methods)
☆30 · Jun 21, 2022 · Updated 4 years ago
ku-nlp / AnnotatedFKCCorpus
Annotated Fuman Kaitori Center Corpus
☆18 · Dec 18, 2023 · Updated 2 years ago
akirakubo / mecab-mozcdic
☆10 · Jan 12, 2018 · Updated 8 years ago