SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL)
☆16Jul 27, 2024Updated last year
Alternatives and similar repositories for SCT
Users that are interested in SCT are comparing it to the libraries listed below
- Implementation of ConGen: Unsupervised Control and Generalization Distillation For Sentence Representation (Findings of EMNLP 2022).☆22Sep 13, 2023Updated 2 years ago
- A comprehensive evaluation framework for the SEA region☆19Feb 16, 2026Updated 2 weeks ago
- 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment☆11Apr 6, 2025Updated 10 months ago
- Benchmark for Thai sentence representation☆132May 27, 2025Updated 9 months ago
- triple-encoders is a library for contextualizing distributed Sentence Transformers representations.☆15Sep 3, 2024Updated last year
- The official repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGI…☆16May 4, 2022Updated 3 years ago
- ☆14Dec 13, 2023Updated 2 years ago
- The implementation of CL-ReLKT (NAACL-2022)☆14Aug 31, 2022Updated 3 years ago
- Code base for the EMNLP 2021 Findings paper: Cartography Active Learning☆14Jun 3, 2025Updated 9 months ago
- ☆21Dec 30, 2022Updated 3 years ago
- data collator for UL2 and U-PaLM☆29Aug 20, 2023Updated 2 years ago
- Code for paper Document-Level Paraphrase Generation with Sentence Rewriting and Reordering by Zhe Lin, Yitao Cai and Xiaojun Wan. This pa…☆26Nov 10, 2021Updated 4 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning☆30Jan 25, 2023Updated 3 years ago
- Few-shot Learning with Auxiliary Data☆31Dec 8, 2023Updated 2 years ago
- LTG-Bert☆34Jan 8, 2024Updated 2 years ago
- Language Models as Hierarchy Encoders☆39Jan 6, 2026Updated last month
- State-of-the-art paired encoder and decoder models (17M-1B params)☆59Aug 6, 2025Updated 6 months ago
- Thank you BART! Rewarding Pre-Trained Models Improves Formality Style Transfer (ACL 2021)☆30Oct 25, 2022Updated 3 years ago
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- ☆40Feb 1, 2023Updated 3 years ago
- ☆10May 1, 2025Updated 10 months ago
- ☆10Oct 2, 2024Updated last year
- Dataset Catalogue Homepage for Indonesian Languages☆10Feb 19, 2024Updated 2 years ago
- Token-free Language Modeling with ByGPT5 & Friends!☆12Jul 18, 2025Updated 7 months ago
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder☆10Mar 16, 2023Updated 2 years ago
- Linear Attention for Efficient Bidirectional Sequence Modeling☆15May 13, 2025Updated 9 months ago
- The official implementation of the paper "Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset" (ICASSP 2…☆12Feb 19, 2023Updated 3 years ago
- A set of extern classes for Haxe that wrap the Titanium API (http://developer.appcelerator.com/).☆16Jun 7, 2011Updated 14 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆98Apr 26, 2023Updated 2 years ago
- An extension of the Transformers library to include a T5ForSequenceClassification class.☆40Apr 17, 2023Updated 2 years ago
- Khmer, Lao, Myanmar, and Thai word segmentation/breaking library and command line☆42Oct 22, 2023Updated 2 years ago
- ☆12Dec 7, 2022Updated 3 years ago
- Python library for visualizing string edit distance☆10Oct 15, 2021Updated 4 years ago
- A tool to make spelling Thai more convenient☆11Mar 30, 2024Updated last year
- ☆10Jan 4, 2017Updated 9 years ago
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models☆11Jan 19, 2024Updated 2 years ago
- A context-aware embedding similarity score☆11Aug 23, 2023Updated 2 years ago
- ☆13Nov 28, 2025Updated 3 months ago
- These are tools I cheated with the help of ChatGPT to help me with Penetration Testing and Red Teaming☆15Feb 24, 2024Updated 2 years ago