BatsResearch / LexC-Gen
Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.
☆16Updated 10 months ago
Alternatives and similar repositories for LexC-Gen
Users who are interested in LexC-Gen are comparing it to the libraries listed below.
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.☆84Updated 6 months ago
- Utility for behavioral and representational analyses of Language Models☆155Updated last month
- ☆17Updated 2 years ago
- ☆167Updated last year
- ☆217Updated last week
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023☆70Updated last year
- Resources for cultural NLP research☆101Updated 3 months ago
- Find informative examples to efficiently (human)-evaluate NLG models.☆15Updated last month
- A reading list of up-to-date papers on NLP for Social Good.☆303Updated last year
- ☆54Updated 3 years ago
- ☆36Updated 10 months ago
- Interpretability for sequence generation models 🐛 🔍☆432Updated 3 months ago
- Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper☆400Updated last year
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.☆111Updated 4 months ago
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, …☆92Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆147Updated 2 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023☆103Updated last year
- Multilingual Large Language Models Evaluation Benchmark☆128Updated 11 months ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021.☆21Updated 4 months ago
- ☆99Updated last year
- MAFAND-MT☆57Updated last year
- A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.☆102Updated last year
- SemEval2024-task8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection☆76Updated last year
- The Benchmark of Linguistic Minimal Pairs☆151Updated 2 years ago
- ☆66Updated 2 years ago
- This repository contains the code for "Generating Datasets with Pretrained Language Models".☆188Updated 3 years ago
- This repository provides details and links to the ACL anthology corpus/collection including .bib, .pdf and grobid extractions of the pdfs☆182Updated last year
- The geometry of multilingual language model representations (EMNLP 2022).☆21Updated 2 years ago
- OpenNyAI is a mission aimed at developing open source software and datasets to catalyze the creation of AI-powered solutions to improve a…☆41Updated last year
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback☆97Updated last year