Discovering Universal Geometry in Embeddings with ICA (Published in EMNLP 2023)
☆20Jun 17, 2025Updated 9 months ago
Alternatives and similar repositories for Universal-Geometry-with-ICA
Users that are interested in Universal-Geometry-with-ICA are comparing it to the libraries listed below.
- DIRECT: Direct and Indirect REsponses in Conversational Text Corpus☆17Jul 1, 2021Updated 4 years ago
- Script to evaluate a pre-trained Japanese word2vec model on a Japanese similarity dataset☆12Nov 4, 2024Updated last year
- Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment☆38Sep 13, 2023Updated 2 years ago
- DefSent: Sentence Embeddings using Definition Sentences☆23Aug 5, 2021Updated 4 years ago
- LaTeX document class for the proceedings of ANLP☆21Oct 28, 2025Updated 4 months ago
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)☆10May 31, 2024Updated last year
- Embedding language models in probability space via log-likelihood vectors☆16Oct 25, 2025Updated 4 months ago
- The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.☆124Nov 13, 2025Updated 4 months ago
- ☆19Dec 6, 2024Updated last year
- ☆17May 31, 2023Updated 2 years ago
- Flexible evaluation tool for language models☆58Updated this week
- Tokyo Metropolitan University Paraphrase Corpus (TMUP)☆11Jun 12, 2017Updated 8 years ago
- The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark)☆87Mar 16, 2026Updated last week
- Utility scripts for preprocessing Wikipedia texts for NLP☆78Apr 9, 2024Updated last year
- Exploring Japanese SimCSE☆69Oct 31, 2023Updated 2 years ago
- Essay submitted for the NAIST entrance examination☆33Jan 27, 2023Updated 3 years ago
- 🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse)☆20Mar 15, 2025Updated last year
- Source code for "SimCKP: Simple Contrastive Learning of Keyphrase Representations", Findings of EMNLP 2023☆12Jun 20, 2025Updated 9 months ago
- AJIMEE-Bench (Advanced Japanese IME Evaluation Benchmark)☆18Jan 13, 2025Updated last year
- SQL linter tool for BigQuery GoogleSQL (formerly known as StandardSQL).☆17Mar 15, 2026Updated last week
- To be readable without enhancing English power.☆10Jul 22, 2020Updated 5 years ago
- Code for "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" (NAACL 2022)☆111May 14, 2025Updated 10 months ago
- PyTorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer.☆89Nov 3, 2023Updated 2 years ago
- [CIKM 2022] Towards Automated Over-Sampling for Imbalanced Classification☆10Mar 20, 2023Updated 3 years ago
- Rotated Word Vector Representations and their Interpretability (EMNLP 2017)☆18Jul 13, 2019Updated 6 years ago
- ☆19Mar 12, 2026Updated last week
- Arguments parser with class for Python, inspired by StructOpt☆62Sep 17, 2023Updated 2 years ago
- ☆10Apr 26, 2023Updated 2 years ago
- ☆23Sep 18, 2020Updated 5 years ago
- Iterative heuristics for endogenous spatial regimes delineation (IJGIS 2023)☆11Dec 27, 2024Updated last year
- JQaRA: Japanese Question Answering with Retrieval Augmentation - a Japanese Q&A dataset for evaluating retrieval augmentation (RAG)☆43Sep 9, 2025Updated 6 months ago
- ☆15Mar 15, 2022Updated 4 years ago
- 🦀 A Rust implementation of a RoBERTa classification model for the SNLI dataset☆13Sep 13, 2021Updated 4 years ago
- Kyoto University Web Document Leads Corpus☆83Dec 18, 2023Updated 2 years ago
- ☆11Feb 8, 2023Updated 3 years ago
- Experimental implementations of several (over/under)-sampling techniques not yet available in the imbalanced-learn library.☆12May 8, 2023Updated 2 years ago
- Japanese word embedding with Sudachi and NWJC 🌿☆171Mar 1, 2024Updated 2 years ago
- C++ implementation of HPYLM☆11May 2, 2017Updated 8 years ago
- A curated list of papers on click-through-rate (CTR) prediction.☆17Mar 3, 2024Updated 2 years ago