BatsResearch / LexC-Gen
Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.
☆18 Updated last year
Alternatives and similar repositories for LexC-Gen
Users who are interested in LexC-Gen are comparing it to the repositories listed below.
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆94 Updated 10 months ago
- Find informative examples to efficiently (human)-evaluate NLG models. ☆17 Updated 2 weeks ago
- Resources for cultural NLP research ☆112 Updated 2 months ago
- The FLORES+ Machine Translation Benchmark ☆109 Updated last year
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆121 Updated 2 months ago
- ☆224 Updated 4 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆106 Updated last year
- Utility for behavioral and representational analyses of Language Models ☆173 Updated last week
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆71 Updated last year
- A Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation, Levy et al., Findings of EMNLP 2021 ☆14 Updated 3 years ago
- A repository with several curated datasets of counter-narratives to fight online hate speech. ☆93 Updated 4 months ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆34 Updated 9 months ago
- Data for evaluating gender bias in coreference resolution systems. ☆81 Updated 6 years ago
- A reading list of up-to-date papers on NLP for Social Good. ☆304 Updated 2 years ago
- ☆17 Updated 2 years ago
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆97 Updated last year
- ☆23 Updated 4 years ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆22 Updated 3 years ago
- ☆176 Updated last year
- ☆115 Updated 2 months ago
- SemEval2024-task8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection ☆78 Updated last year
- ☆102 Updated last year
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆188 Updated 3 years ago
- NTREX -- News Test References for MT Evaluation ☆86 Updated last year
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way ☆16 Updated last month
- German Alpaca Dataset (Cleaned + Translated) ☆26 Updated 2 years ago
- ☆55 Updated 3 years ago
- Crosslingual Reasoning through Test-Time Scaling ☆19 Updated 7 months ago
- ☆118 Updated last year
- ☆16 Updated 3 years ago