☆11Aug 26, 2021Updated 4 years ago
Alternatives and similar repositories for optok
Users interested in optok are comparing it to the libraries listed below
- 🦀 A Rust implementation of a RoBERTa classification model for the SNLI dataset☆13Sep 13, 2021Updated 4 years ago
- DefSent: Sentence Embeddings using Definition Sentences☆22Aug 5, 2021Updated 4 years ago
- YAST - Yet Another SPLADE or Sparse Trainer☆21Jun 16, 2025Updated 8 months ago
- TIFMO: Textual Inference Forward-chaining MOdule☆12Apr 25, 2014Updated 11 years ago
- A C++ implementation of HPYLM☆11May 2, 2017Updated 8 years ago
- A collection of various NLP datasets, mainly Indonesia-related languages.☆15Apr 23, 2022Updated 3 years ago
- Code for PyCon JP 2019 talk "Japanese Natural Language Processing with Python: Real-World Text Analysis via Sequence Labeling"☆48Nov 7, 2019Updated 6 years ago
- Rocker is a minimal docker implementation for educational purposes.☆19Apr 18, 2021Updated 4 years ago
- A tool for visualizing the internal structures of morphological analyzer Sudachi☆18Jun 9, 2022Updated 3 years ago
- A Japanese entity resolution (name matching) dataset built from Wikipedia☆35Mar 10, 2020Updated 5 years ago
- Implementation of "Neural Word Embedding as Implicit Matrix Factorization"☆14Mar 17, 2022Updated 3 years ago
- Now it is exported as an official example☆13Jan 24, 2018Updated 8 years ago
- ☯️ AllenNLP training configurations for promising models on Named Entity Recognition. (BiLSTM-CRF, BiLSTM-CNN-CRF, BERT, BERT-CRF)☆15Nov 26, 2020Updated 5 years ago
- Unsupervised part-of-speech tag estimation☆16Mar 22, 2018Updated 7 years ago
- ☆17May 31, 2023Updated 2 years ago
- Word Rotator's Distance☆19Sep 5, 2021Updated 4 years ago
- A Japanese dependency parser based on BERT☆23Oct 26, 2022Updated 3 years ago
- ☆19May 23, 2024Updated last year
- A Streamlit component based on React Image Crop.☆21Jun 1, 2022Updated 3 years ago
- Rust implementation of SIF and uSIF: Simple and fast sentence embedding☆19Jan 22, 2025Updated last year
- PyTorch implementation of NAACL 2021 paper "Multi-view Subword Regularization"☆26Jun 2, 2021Updated 4 years ago
- Kyoto University Web Document Leads Corpus☆83Dec 18, 2023Updated 2 years ago
- Discovering Universal Geometry in Embeddings with ICA (Published in EMNLP 2023)☆20Jun 17, 2025Updated 8 months ago
- An annotation tool for grounding of formulae☆24May 28, 2024Updated last year
- Japanese synonym library☆55Feb 7, 2022Updated 4 years ago
- DSPy-powered email optimization for startup founders: drop in your 3 best emails, get optimized outreach for new leads☆39Sep 14, 2025Updated 5 months ago
- A Japanese Morphological Analyzer written in pure Rust☆26Oct 25, 2019Updated 6 years ago
- Finding all pairs of similar documents time- and memory-efficiently☆62Mar 13, 2025Updated 11 months ago
- Japanese Movie Recommendation Dialogue dataset☆29Jul 19, 2022Updated 3 years ago
- Testing tool to verify the search qualities of the Elasticsearch indices☆29Jan 8, 2023Updated 3 years ago
- The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.☆125Nov 13, 2025Updated 3 months ago
- Kyoto University Text Corpus☆69Jul 14, 2023Updated 2 years ago
- ☆43Sep 1, 2021Updated 4 years ago
- ☆34Jan 6, 2018Updated 8 years ago
- minimal seq2seq of keras☆25Jun 17, 2017Updated 8 years ago
- Code for "Re-evaluating Word Mover’s Distance" (ICML 2022)☆40Jun 15, 2022Updated 3 years ago
- Learned string similarity for entity names using optimal transport.☆35Nov 17, 2020Updated 5 years ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆35Aug 9, 2023Updated 2 years ago
- The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark)☆84Jan 6, 2026Updated last month