BatsResearch / LexC-Gen
Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.
☆14 Updated 3 months ago
Alternatives and similar repositories for LexC-Gen:
Users who are interested in LexC-Gen are comparing it to the repositories listed below.
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆78 Updated 9 months ago
- Statistics on multilingual datasets ☆17 Updated 2 years ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆47 Updated 2 years ago
- CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Switching ☆18 Updated 3 years ago
- A curated list of research papers and resources on Cultural LLM. ☆32 Updated 3 months ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆96 Updated last month
- LTG-Bert ☆29 Updated last year
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 Updated 2 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆99 Updated 8 months ago
- Code for the paper "Measuring Bias in Contextualized Word Representations" ☆35 Updated 5 years ago
- Semantically Structured Sentence Embeddings ☆66 Updated 3 months ago
- Multilingual Open Text ☆25 Updated 2 months ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆30 Updated 2 years ago
- A Python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆25 Updated 4 months ago
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆44 Updated 9 months ago
- Resources for cultural NLP research ☆77 Updated last month
- NTREX -- News Test References for MT Evaluation ☆80 Updated 7 months ago
- A Python Commonsense Knowledge Inference Toolkit ☆63 Updated last year
- A list of ethics-related resources for researchers and practitioners of Natural Language Processing and Computational Linguistics ☆31 Updated last year
- Rationales for Sequential Predictions ☆40 Updated 2 years ago
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning ☆101 Updated 3 years ago
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings. ☆71 Updated 5 months ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆70 Updated 10 months ago
- This repository contains code and data for the EMNLP 2022 paper "CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about… ☆10 Updated 2 years ago