Discovering Universal Geometry in Embeddings with ICA (Published in EMNLP 2023)
☆20Jun 17, 2025Updated 8 months ago
Alternatives and similar repositories for Universal-Geometry-with-ICA
Users that are interested in Universal-Geometry-with-ICA are comparing it to the libraries listed below
- script to evaluate pre-trained Japanese word2vec model on Japanese similarity dataset☆12Nov 4, 2024Updated last year
- DIRECT: Direct and Indirect REsponses in Conversational Text Corpus☆17Jul 1, 2021Updated 4 years ago
- DefSent: Sentence Embeddings using Definition Sentences☆22Aug 5, 2021Updated 4 years ago
- ☆17May 31, 2023Updated 2 years ago
- Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment☆38Sep 13, 2023Updated 2 years ago
- LaTeX document class for the proceedings of ANLP☆21Oct 28, 2025Updated 4 months ago
- Tokyo Metropolitan University Paraphrase Corpus (TMUP)☆11Jun 12, 2017Updated 8 years ago
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)☆10May 31, 2024Updated last year
- AJIMEE-Bench (Advanced Japanese IME Evaluation Benchmark)☆18Jan 13, 2025Updated last year
- The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark)☆84Jan 6, 2026Updated last month
- To be readable without enhancing english power.☆10Jul 22, 2020Updated 5 years ago
- Source code for "SimCKP: Simple Contrastive Learning of Keyphrase Representations", Findings of EMNLP 2023☆12Jun 20, 2025Updated 8 months ago
- ☆11Aug 26, 2021Updated 4 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP☆78Apr 9, 2024Updated last year
- The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.☆125Nov 13, 2025Updated 3 months ago
- Flexible evaluation tool for language models☆58Updated this week
- Short essay submitted for the NAIST entrance examination☆33Jan 27, 2023Updated 3 years ago
- ☆19Dec 6, 2024Updated last year
- A soft and fast pattern matcher for billion-scale corpora.☆75Feb 26, 2025Updated last year
- ☆15Mar 15, 2022Updated 3 years ago
- 🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse)☆20Mar 15, 2025Updated 11 months ago
- JQaRA: Japanese Question Answering with Retrieval Augmentation - a Japanese Q&A dataset for evaluating retrieval-augmented generation (RAG)☆42Sep 9, 2025Updated 5 months ago
- Arguments parser with class for Python, inspired by StructOpt☆62Sep 17, 2023Updated 2 years ago
- ☆16Jan 3, 2025Updated last year
- Exploring Japanese SimCSE☆69Oct 31, 2023Updated 2 years ago
- SQL linter tool for BigQuery GoogleSQL (formerly known as StandardSQL).☆17Updated this week
- Scripts for creating a Japanese-English parallel corpus and training NMT models☆18Nov 9, 2021Updated 4 years ago
- A simple implementation of SimCSE☆78Oct 31, 2022Updated 3 years ago
- A library for semantic similarity search☆26Jan 31, 2025Updated last year
- [ACL2023] WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings☆18Sep 12, 2023Updated 2 years ago
- Easy-to-use scripts to fine-tune GPT-2-JA with your own texts, to generate sentences, and to tweet them automatically.☆19Aug 26, 2025Updated 6 months ago
- Repository for JSICK☆45May 31, 2023Updated 2 years ago
- ☆19May 23, 2024Updated last year
- Evaluation dataset for the honorific (keigo) conversion task☆21Nov 24, 2022Updated 3 years ago
- Rotated Word Vector Representations and their Interpretability (EMNLP 2017)☆18Jul 13, 2019Updated 6 years ago
- Kyoto University Web Document Leads Corpus☆83Dec 18, 2023Updated 2 years ago
- Pytorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer.☆89Nov 3, 2023Updated 2 years ago
- ☆22Jan 6, 2023Updated 3 years ago
- Python version of the Japanese semantic role labeling system (ASA)☆22Nov 11, 2022Updated 3 years ago