joisino/wordtour


Code for "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" (NAACL 2022)

☆113

Alternatives and similar repositories for wordtour

Users that are interested in wordtour are comparing it to the libraries listed below.

ujiuji1259 / shinra-attribute-extraction
☆11 · Sep 7, 2021 · Updated 4 years ago
sonoisa / clip-japanese
Japanese CLIP model
☆13 · Sep 15, 2025 · Updated 10 months ago
kampersanda / sif-embedding
Rust implementation of SIF and uSIF: Simple and fast sentence embedding
☆19 · Jan 22, 2025 · Updated last year
bonprosoft / pysen-ls
A language server implementation for pysen
☆10 · Nov 14, 2021 · Updated 4 years ago
lighttransport / jagger-python
Python binding for Jagger (C++ implementation of a pattern-based Japanese morphological analyzer)
☆13 · Dec 16, 2025 · Updated 7 months ago
himkt / awesome-bert-japanese
📝 A list of pre-trained BERT models for Japanese with word/subword tokenization + vocabulary construction algorithm information
☆132 · Mar 15, 2023 · Updated 3 years ago
tarotez / pyml
Machine learning course using Python
☆13 · Apr 26, 2022 · Updated 4 years ago
sbintuitions / flexeval
Flexible evaluation tool for language models
☆61 · Updated this week
WorksApplications / chikkarpy
Japanese synonym library
☆55 · Feb 7, 2022 · Updated 4 years ago
shimo-lab / Universal-Geometry-with-ICA
Discovering Universal Geometry in Embeddings with ICA (Published in EMNLP 2023)
☆22 · Jun 17, 2025 · Updated last year
pfnet-research / jfbench
☆15 · Mar 12, 2026 · Updated 4 months ago
megagonlabs / UD_Japanese-GSD
Japanese data from the Google UDT 2.0.
☆28 · Mar 24, 2023 · Updated 3 years ago
ku-nlp / kwja
An integrated Japanese analyzer based on foundation models
☆145 · Jul 18, 2026 · Updated last week
sbintuitions / JMTEB
The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark)
☆93 · Updated this week
izuna385 / Wikia-and-Wikipedia-EL-Dataset-Creator
You can create datasets from Wikia/Wikipedia that can be used for entity recognition and Entity Linking. Dumps for ja-wiki and VTuber-wik…
☆18 · May 2, 2021 · Updated 5 years ago
hotchpotch / yast
YAST - Yet Another SPLADE or Sparse Trainer
☆21 · Jun 16, 2025 · Updated last year
malteos / scincl
Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)
☆79 · Dec 29, 2025 · Updated 7 months ago
tatHi / optok
☆10 · Aug 26, 2021 · Updated 4 years ago
yahoojapan / JGLUE
JGLUE: Japanese General Language Understanding Evaluation
☆346 · Mar 31, 2025 · Updated last year
stockmarkteam / ner-wikipedia-dataset
Japanese named entity recognition dataset built from Wikipedia
☆143 · Sep 2, 2023 · Updated 2 years ago
yukiar / OTAlign
Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment
☆38 · Sep 13, 2023 · Updated 2 years ago
kensho-technologies / pathpiece
PathPiece tokenizer
☆14 · Nov 10, 2024 · Updated last year
WorksApplications / SudachiTra
Japanese tokenizer for Transformers
☆81 · Dec 15, 2023 · Updated 2 years ago
octanove / shiba
PyTorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer.
☆89 · Nov 3, 2023 · Updated 2 years ago
asakura-data-science / finance
☆21 · Feb 28, 2022 · Updated 4 years ago
sonoisa / t5-japanese
Japanese T5 model
☆118 · Sep 15, 2025 · Updated 10 months ago
nobu-g / cohesion-analysis
Code for COLING 2020 Paper
☆13 · Feb 3, 2026 · Updated 5 months ago
tetsuwaka / CausalExtraction
☆20 · Jul 26, 2025 · Updated last year
joisino / reeval-wmd
Code for "Re-evaluating Word Mover’s Distance" (ICML 2022)
☆40 · Jun 15, 2022 · Updated 4 years ago
HojiChar / HojiChar
The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.
☆128 · Jul 17, 2026 · Updated last week
kajyuuen / funer
Funer is a rule-based Named Entity Recognition tool.
☆22 · Apr 21, 2022 · Updated 4 years ago
SakanaAI / TAID
Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"
☆123 · Oct 6, 2025 · Updated 9 months ago
osekilab / JCoLA
☆19 · Apr 21, 2026 · Updated 3 months ago
sarulab-speech / ml-audiocaps
Multi-lingual AudioCaps
☆14 · Nov 20, 2023 · Updated 2 years ago
tanreinama / RoBERTa-japanese
Japanese BERT Pretrained Model
☆23 · Nov 13, 2021 · Updated 4 years ago
cl-tohoku / keigo_transfer_task
Evaluation dataset for the keigo (honorific speech) conversion task
☆21 · Nov 24, 2022 · Updated 3 years ago
retarfi / language-pretraining
Pre-training Language Models for Japanese
☆50 · Jul 2, 2023 · Updated 3 years ago
kajyuuen / daaja
This repository has implementations of data augmentation for NLP for Japanese.
☆64 · Feb 16, 2023 · Updated 3 years ago
daac-tools / vaporetto
🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
☆297 · Jul 20, 2026 · Updated last week