MilaNLProc / language-invariant-properties
☆22Updated 3 years ago
Alternatives and similar repositories for language-invariant-properties:
Users that are interested in language-invariant-properties are comparing it to the libraries listed below
- Source code and data for Like a Good Nearest Neighbor☆28Updated 3 months ago
- Ranking of fine-tuned HF models as base models.☆35Updated last year
- Semantically Structured Sentence Embeddings☆66Updated 6 months ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks☆12Updated 3 years ago
- ☆11Updated 4 months ago
- ☆19Updated 2 years ago
- RATransformers 🐭- Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware!☆41Updated 2 years ago
- Combining encoder-based language models☆11Updated 3 years ago
- Code for the paper: Saying No is An Art: Contextualized Fallback Responses for Unanswerable Dialogue Queries☆19Updated 3 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu).☆20Updated 2 years ago
- ☆76Updated 3 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆22Updated 2 years ago
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models'☆17Updated 3 years ago
- Statistics on multilingual datasets☆17Updated 2 years ago
- An implementation of GrASP (Shnarch et. al., 2017)☆21Updated 2 years ago
- Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OF…☆29Updated 3 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP.☆87Updated 3 weeks ago
- Embedding Recycling for Language models☆38Updated last year
- Using short models to classify long texts☆21Updated 2 years ago
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics".☆16Updated 3 years ago
- Generate BERT vocabularies and pretraining examples from Wikipedias☆18Updated 4 years ago
- ☆21Updated 3 years ago
- ☆13Updated 4 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings☆19Updated 3 months ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021)☆48Updated 3 years ago
- Code for equipping pretrained language models (BART, GPT-2, XLNet) with commonsense knowledge for generating implicit knowledge statement…☆16Updated 3 years ago
- This repository contains code and data for the EMNLP 2022 paper "CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about…☆10Updated 2 years ago
- A embed able annotation tool for end to end cross document co-reference☆42Updated 2 years ago
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.☆24Updated last year
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆26Updated 3 years ago