dadelani / sib-200
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
☆16 Updated 9 months ago
Related projects
Alternatives and complementary repositories for sib-200
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco… ☆32 Updated 9 months ago
- Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 🐑 🐑 ☆13 Updated 7 months ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆70 Updated 8 months ago
- Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric … ☆71 Updated last year
- ☆24 Updated 5 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆96 Updated 7 months ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆31 Updated 2 years ago
- NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings ☆53 Updated 5 months ago
- ☆23 Updated last year
- An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation ☆26 Updated 5 months ago
- A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings. ☆70 Updated 3 months ago
- A Multilingual Replicable Instruction-Following Model ☆94 Updated last year
- EMNLP2022 "Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment" ☆16 Updated last year
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective ☆30 Updated last year
- [TMLR'23] Contrastive Search Is What You Need For Neural Text Generation ☆118 Updated last year
- PyTorch reimplementation of REALM and ORQA ☆22 Updated 2 years ago
- ☆19 Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆56 Updated 5 months ago
- Code for "Improving Translation Faithfulness of Large Language Models via Augmenting Instructions" ☆12 Updated last year
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆40 Updated 11 months ago
- ☆10 Updated 2 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆29 Updated last year
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc… ☆28 Updated 5 months ago
- DEMix Layers for Modular Language Modeling ☆53 Updated 3 years ago
- The original Backpack Language Model implementation, a fork of FlashAttention ☆64 Updated last year
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24) ☆17 Updated last week
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 Updated last year
- ☆20 Updated 2 years ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆15 Updated 2 years ago
- ☆25 Updated 2 years ago