BatsResearch / LexC-Gen
Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.
☆18 · Updated last year
Alternatives and similar repositories for LexC-Gen
Users who are interested in LexC-Gen are comparing it to the repositories listed below.
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆70 · Updated last year
- Utility for behavioral and representational analyses of Language Models ☆167 · Updated last month
- A reading list of up-to-date papers on NLP for Social Good. ☆304 · Updated 2 years ago
- ☆220 · Updated 3 months ago
- Find informative examples to efficiently (human)-evaluate NLG models. ☆16 · Updated 3 weeks ago
- The Benchmark of Linguistic Minimal Pairs ☆156 · Updated 2 years ago
- A simple library for querying the URIEL typological database. ☆92 · Updated last year
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆119 · Updated 3 weeks ago
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆93 · Updated 9 months ago
- The central repo for Creole based NLU and NLG work ☆18 · Updated 6 months ago
- Crosslingual Reasoning through Test-Time Scaling ☆19 · Updated 5 months ago
- ☆17 · Updated 2 years ago
- Data for evaluating gender bias in coreference resolution systems. ☆80 · Updated 6 years ago
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆187 · Updated 3 years ago
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆93 · Updated last year
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ☆228 · Updated 3 months ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models". ☆189 · Updated 4 years ago
- A Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation, Levy et al., Findings of EMNLP 2021 ☆14 · Updated 3 years ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆97 · Updated 2 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆106 · Updated last year
- German Alpaca Dataset (Cleaned + Translated) ☆26 · Updated 2 years ago
- Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper ☆403 · Updated last year
- The geometry of multilingual language model representations (EMNLP 2022). ☆22 · Updated 3 years ago
- ☆55 · Updated 3 years ago
- A collection of text simplification datasets and other resources ☆50 · Updated last year
- Interpretability for sequence generation models 🐛 🔍 ☆444 · Updated last week
- A Multilingual Replicable Instruction-Following Model ☆95 · Updated 2 years ago
- The FLORES+ Machine Translation Benchmark ☆108 · Updated 11 months ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆20 · Updated 7 months ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way ☆14 · Updated this week