[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
☆118 Jul 25, 2023 Updated 2 years ago
Alternatives and similar repositories for COCO-LM
Users that are interested in COCO-LM are comparing it to the libraries listed below.
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models' ☆17 Mar 14, 2022 Updated 4 years ago
- Bootstrapped Unsupervised Sentence Representation Learning (ACL 2021) ☆30 Apr 27, 2022 Updated 3 years ago
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch ☆46 Mar 3, 2021 Updated 5 years ago
- Train large COMET (T5-3B/GPT2-XL) with small memory (on 11GB memory GPUs like 1080/2080) using DeepSpeed. ☆14 Jan 23, 2022 Updated 4 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆26 Jul 26, 2023 Updated 2 years ago
- ALIGNIE: Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation ☆20 Dec 12, 2022 Updated 3 years ago
- Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!) ☆331 Jan 10, 2024 Updated 2 years ago
- [ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723 ☆729 Aug 29, 2022 Updated 3 years ago
- Implementation of Mixout with PyTorch ☆75 Dec 21, 2022 Updated 3 years ago
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling ☆34 Nov 21, 2021 Updated 4 years ago
- The implementation of DeBERTa ☆2,205 Sep 29, 2023 Updated 2 years ago
- [EMNLP 2022] Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning ☆136 Nov 17, 2023 Updated 2 years ago
- The source code used for paper "Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts", in WSDM 2023. ☆14 May 27, 2023 Updated 2 years ago
- A curated list of resources dedicated to NLP (paper, blogs, note and etc) ☆13 Nov 30, 2019 Updated 6 years ago
- Consistent dialogue generation ☆16 Oct 26, 2022 Updated 3 years ago
- [EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training ☆65 Nov 12, 2021 Updated 4 years ago
- [ICLR 2022] Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners ☆130 Dec 7, 2022 Updated 3 years ago
- Source code for paper: Knowledge Inheritance for Pre-trained Language Models ☆38 Apr 24, 2022 Updated 3 years ago
- generative-camouflaged-spam-detector ☆11 Aug 20, 2020 Updated 5 years ago
- Official code repository for the main conference paper in EMNLP 2022: SubeventWriter: Iterative Sub-event Sequence Generation with Cohere… ☆11 Oct 16, 2022 Updated 3 years ago
- [EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674 ☆195 Jun 14, 2023 Updated 2 years ago
- BANG is a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation. AR and NAR generat… ☆28 Feb 6, 2022 Updated 4 years ago
- [Findings of EMNLP22] From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models ☆19 Mar 16, 2023 Updated 3 years ago
- 3rd Place solution for Feedback Prize - Predicting Effective Arguments Kaggle competition ☆16 Sep 6, 2022 Updated 3 years ago
- ☆99 Jul 7, 2020 Updated 5 years ago
- PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an… ☆284 Oct 20, 2022 Updated 3 years ago
- ☆24 Oct 23, 2020 Updated 5 years ago
- 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment ☆11 Apr 6, 2025 Updated last year
- The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to o… ☆378 Apr 21, 2023 Updated 2 years ago
- This repository contains the code for applying One-Token Approximation to a pretrained language model using subword-level tokenization. ☆11 May 7, 2020 Updated 5 years ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆78 Dec 29, 2025 Updated 3 months ago
- ☆54 Jan 18, 2023 Updated 3 years ago
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821 ☆3,648 Oct 16, 2024 Updated last year
- Meta Representation Transformation for Low-resource Cross-lingual Learning ☆41 May 5, 2021 Updated 4 years ago
- ☆290 Dec 2, 2022 Updated 3 years ago
- Repository for the Public 4th / Private 5th place solution to atmaCup #11. ☆12 Aug 3, 2021 Updated 4 years ago
- ☆36 Aug 25, 2022 Updated 3 years ago
- Official code repository for Findings of EMNLP 2022 paper: PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Popula… ☆11 Oct 18, 2022 Updated 3 years ago
- ☆67 May 11, 2024 Updated last year