BatsResearch / LexC-Gen
Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.
☆16 Updated 9 months ago
Alternatives and similar repositories for LexC-Gen
Users interested in LexC-Gen are comparing it to the repositories listed below.
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆83 Updated 5 months ago
- Resources for cultural NLP research ☆98 Updated 2 months ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way ☆13 Updated last week
- ☆35 Updated 9 months ago
- A Multilingual Replicable Instruction-Following Model ☆94 Updated 2 years ago
- TimeLMs: Diachronic Language Models from Twitter ☆108 Updated last year
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆97 Updated last year
- ☆14 Updated last year
- Semantically Structured Sentence Embeddings ☆66 Updated 9 months ago
- ParaNames: A multilingual resource for parallel names ☆34 Updated last year
- ☆98 Updated last year
- A python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆28 Updated 10 months ago
- MAFAND-MT ☆57 Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆58 Updated last year
- ☆165 Updated last year
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆49 Updated 3 years ago
- A curated list of research papers and resources on Cultural LLM. ☆45 Updated 9 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆103 Updated last year
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆73 Updated last year
- ☆66 Updated last year
- Collection of NLP model explanations and accompanying analysis tools ☆144 Updated 2 years ago
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆184 Updated last week
- ☆51 Updated 2 years ago
- [EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models" ☆32 Updated last month
- COMET for African languages ☆10 Updated 5 months ago
- Detecting Bias and ensuring Fairness in AI solutions ☆98 Updated 2 years ago
- ☆40 Updated last year
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆70 Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆144 Updated last month
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 Updated last year