The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!
☆378Apr 21, 2023Updated 3 years ago
Alternatives and similar repositories for DeCLUTR
Users that are interested in DeCLUTR are comparing it to the libraries listed below.
- [EMNLP 2021] Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning☆17Jun 28, 2025Updated 10 months ago
- ☆79Jul 11, 2022Updated 3 years ago
- A python tool for evaluating the quality of sentence embeddings.☆2,108Mar 19, 2024Updated 2 years ago
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821☆3,651Oct 16, 2024Updated last year
- [ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o…☆606Jun 15, 2022Updated 3 years ago
- State of the art Semantic Sentence Embeddings☆100May 22, 2022Updated 3 years ago
- [EMNLP 2021] Improving and Simplifying Pattern Exploiting Training☆152Jun 10, 2022Updated 3 years ago
- Code for EMNLP 2021 paper: Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting☆17Nov 30, 2021Updated 4 years ago
- This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"☆1,626Jun 12, 2023Updated 2 years ago
- Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer☆541Dec 10, 2021Updated 4 years ago
- EMNLP 2021 - Pre-training architectures for dense retrieval☆256Mar 18, 2022Updated 4 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering☆174Jun 6, 2021Updated 4 years ago
- [ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723☆728Aug 29, 2022Updated 3 years ago
- Code associated with the Don't Stop Pretraining ACL 2020 paper☆542Nov 15, 2021Updated 4 years ago
- Code for our ACL '20 paper "Representation Engineering with Natural Language Explanations"☆29Jun 15, 2020Updated 5 years ago
- SPECTER: Document-level Representation Learning using Citation-informed Transformers☆579Jun 12, 2023Updated 2 years ago
- Adversarial Natural Language Inference Benchmark☆399May 12, 2022Updated 4 years ago
- Library for Knowledge Intensive Language Tasks☆973Mar 31, 2022Updated 4 years ago
- ☆12Feb 14, 2023Updated 3 years ago
- [NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240☆168Oct 7, 2022Updated 3 years ago
- NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations☆788May 19, 2024Updated 2 years ago
- Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!)☆331Jan 10, 2024Updated 2 years ago
- Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"☆298Oct 27, 2022Updated 3 years ago
- State-of-the-Art Embeddings, Retrieval, and Reranking☆18,669May 12, 2026Updated last week
- EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering☆68Nov 26, 2021Updated 4 years ago
- docTTTTTquery document expansion model☆375Mar 25, 2023Updated 3 years ago
- Efficient few-shot learning with Sentence Transformers☆2,735Apr 17, 2026Updated last month
- XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty…☆652Jan 4, 2023Updated 3 years ago
- TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs…☆3,417Apr 17, 2026Updated last month
- A novel embedding training algorithm leveraging ANN search and achieved SOTA retrieval on Trec DL 2019 and OpenQA benchmarks☆385Jan 6, 2026Updated 4 months ago
- [NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining☆118Jul 25, 2023Updated 2 years ago
- A simple semantic search engine for scientific papers.☆28Sep 14, 2023Updated 2 years ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c…☆359Feb 22, 2022Updated 4 years ago
- Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)☆464Nov 5, 2022Updated 3 years ago
- Lexically Error Correction BERT.☆49Jun 20, 2021Updated 4 years ago
- Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations☆133Apr 8, 2026Updated last month
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher…☆1,268Jul 24, 2025Updated 9 months ago
- ☆344Aug 3, 2021Updated 4 years ago
- Longformer: The Long-Document Transformer☆2,195Feb 8, 2023Updated 3 years ago