BatsResearch / LexC-Gen
Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.
☆15 Updated 8 months ago
Alternatives and similar repositories for LexC-Gen
Users who are interested in LexC-Gen are comparing it to the repositories listed below.
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆72 Updated last year
- A system for prompted weak supervision. Alfred is a powerful tool that leverages large language models to accelerate data annotation. ☆55 Updated 2 months ago
- ☆44 Updated 2 years ago
- ☆14 Updated last year
- ☆26 Updated last week
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆48 Updated 3 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆101 Updated last year
- Find informative examples to efficiently (human)-evaluate NLG models. ☆11 Updated last week
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆58 Updated last year
- A python package to run inference with HuggingFace language and vision-language checkpoints wrapping many convenient features. ☆27 Updated 8 months ago
- ☆35 Updated 11 months ago
- Crosslingual Reasoning through Test-Time Scaling ☆17 Updated 3 weeks ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆21 Updated last month
- Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in … ☆29 Updated 3 years ago
- A curated list of research papers and resources on Cultural LLM. ☆44 Updated 8 months ago
- negate_sentence(A Python module that doesn't negate sentences.) ☆31 Updated 7 months ago
- A repository with several curated datasets of counter-narratives to fight online hate speech. ☆89 Updated last year
- Framework for unified summarisation and evaluation of English documents using state-of-the-art models and measures. ☆32 Updated last year
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 Updated 2 years ago
- Crosslingual Question Answering for African Languages ☆30 Updated 8 months ago
- A lightweight Python library for constructing, processing, and visualizing constituent trees. ☆66 Updated 4 months ago
- Semantically Structured Sentence Embeddings ☆66 Updated 7 months ago
- LTG-Bert ☆33 Updated last year
- ☆27 Updated 3 months ago
- The evaluation pipeline for the 2024 BabyLM Challenge. ☆31 Updated 6 months ago
- Resources for cultural NLP research ☆95 Updated last month
- The geometry of multilingual language model representations (EMNLP 2022). ☆21 Updated 2 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆55 Updated 2 years ago
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆84 Updated last year
- A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures. ☆82 Updated 4 months ago